* kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
@ 2014-09-01 13:33 john terragon
  2014-09-01 16:02 ` Chris Mason
  0 siblings, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-01 13:33 UTC (permalink / raw)
  To: linux-btrfs

Hi.

I'm not sure whether this is related to the hung-task problem that has
been discussed on this ml for a while, but I've been seeing this
seemingly related problem with 3.15, 3.16 and now 3.17-rc3 (which, if
I'm not mistaken, should include a fix for the hung-task problem). So
here it is: I have a USB flash drive with btrfs (on top of dm-crypt),
usually mounted with these options:

rw,noatime,compress-force=zlib,ssd,space_cache,autodefrag

When I try to rsync the USB flash drive I get a truckload of "INFO:
task rsync:2524 blocked for more than 120 seconds" messages, as you can
see below. The rsync process slows to an almost complete stop and I
can't even kill it. I know the USB key is OK because I've tried the
same thing with ext4 and xfs and everything went fine.
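For reference, a sketch of the setup (device names and paths here are
placeholders, not my actual ones):

```shell
# dm-crypt layer first, then btrfs on the mapped device
cryptsetup luksOpen /dev/sdX1 usbcrypt
mount -o rw,noatime,compress-force=zlib,ssd,space_cache,autodefrag \
      /dev/mapper/usbcrypt /mnt/usb

# the copy that triggers the hung-task warnings
rsync -a --progress /path/to/source/ /mnt/usb/
```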

Any ideas?

Thanks
John


[ 2763.077502] INFO: task rsync:2524 blocked for more than 120 seconds.
[ 2763.077513]       Not tainted 3.17.0-rc3-cu3 #1
[ 2763.077516] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2763.077521] rsync           D ffff880347b63840     0  2524   2523 0x00000000
[ 2763.077531]  ffff880347b633f0 0000000000000082 0000000000013200 ffff880347b2bfd8
[ 2763.077540]  0000000000013200 ffff880347b633f0 ffff8803f73af660 ffff880347b2baa0
[ 2763.077546]  ffff8803f73af664 ffff880347b633f0 00000000ffffffff ffff8803f73af668
[ 2763.077554] Call Trace:
[ 2763.077573]  [<ffffffff8149d290>] ? schedule_preempt_disabled+0x20/0x60
[ 2763.077582]  [<ffffffff8149ed1b>] ? __mutex_lock_slowpath+0x14b/0x1d0
[ 2763.077593]  [<ffffffff810a7f6a>] ? del_timer_sync+0x4a/0x60
[ 2763.077601]  [<ffffffff8149edb6>] ? mutex_lock+0x16/0x25
[ 2763.077656]  [<ffffffffa03c789e>] ? btrfs_wait_ordered_roots+0x3e/0x1f0 [btrfs]
[ 2763.077682]  [<ffffffffa039819a>] ? flush_space+0x1ea/0x4b0 [btrfs]
[ 2763.077706]  [<ffffffffa0391a65>] ? get_alloc_profile+0x85/0x1c0 [btrfs]
[ 2763.077730]  [<ffffffffa0397f51>] ? can_overcommit+0x81/0xe0 [btrfs]
[ 2763.077755]  [<ffffffffa0398800>] ? reserve_metadata_bytes+0x1c0/0x3d0 [btrfs]
[ 2763.077780]  [<ffffffffa0398e58>] ? btrfs_block_rsv_add+0x28/0x50 [btrfs]
[ 2763.077811]  [<ffffffffa03afd72>] ? start_transaction+0x442/0x500 [btrfs]
[ 2763.077839]  [<ffffffffa03a2694>] ? btrfs_check_dir_item_collision+0x74/0x100 [btrfs]
[ 2763.077871]  [<ffffffffa03bbe1f>] ? btrfs_rename2+0x15f/0x6d0 [btrfs]
[ 2763.077880]  [<ffffffff8105fbeb>] ? capable_wrt_inode_uidgid+0x4b/0x60
[ 2763.077887]  [<ffffffff8105fb00>] ? cap_validate_magic+0x100/0x100
[ 2763.077897]  [<ffffffff811685a1>] ? vfs_rename+0x5a1/0x790
[ 2763.077905]  [<ffffffff81164700>] ? follow_managed+0x2a0/0x2b0
[ 2763.077913]  [<ffffffff81168c13>] ? SYSC_renameat2+0x483/0x530
[ 2763.077922]  [<ffffffff81175bad>] ? notify_change+0x2cd/0x380
[ 2763.077927]  [<ffffffff8115d918>] ? __sb_end_write+0x28/0x60
[ 2763.077937]  [<ffffffff8125a528>] ? lockref_put_or_lock+0x48/0x80
[ 2763.077943]  [<ffffffff811709ed>] ? dput+0xad/0x170
[ 2763.077951]  [<ffffffff81163ffd>] ? path_put+0xd/0x20
[ 2763.077958]  [<ffffffff8115a321>] ? SyS_chmod+0x41/0x90
[ 2763.077966]  [<ffffffff814a0a29>] ? system_call_fastpath+0x16/0x1b
[ 2883.203005] INFO: task kworker/u16:11:1617 blocked for more than 120 seconds.
[ 2883.203017]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.203020] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.203024] kworker/u16:11  D ffff8804185d1740     0  1617      2 0x00000000
[ 2883.203085] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 2883.203091]  ffff8804185d12f0 0000000000000046 0000000000013200 ffff880435f3bfd8
[ 2883.203099]  0000000000013200 ffff8804185d12f0 ffff8803b86bef00 ffff8803e9a761f0
[ 2883.203106]  ffff8803e9a761f0 0000000000000001 0000000000000000 ffff8803aece6520
[ 2883.203113] Call Trace:
[ 2883.203149]  [<ffffffffa03ae717>] ? wait_current_trans.isra.22+0x97/0xf0 [btrfs]
[ 2883.203161]  [<ffffffff8108c460>] ? prepare_to_wait_event+0xf0/0xf0
[ 2883.203190]  [<ffffffffa03afbd8>] ? start_transaction+0x2a8/0x500 [btrfs]
[ 2883.203221]  [<ffffffffa03b7970>] ? btrfs_finish_ordered_io+0x250/0x5c0 [btrfs]
[ 2883.203230]  [<ffffffff8100b579>] ? __switch_to+0x119/0x580
[ 2883.203261]  [<ffffffffa03dbb8f>] ? normal_work_helper+0xaf/0x190 [btrfs]
[ 2883.203272]  [<ffffffff8106cd47>] ? process_one_work+0x167/0x380
[ 2883.203280]  [<ffffffff8106d9b4>] ? worker_thread+0x114/0x480
[ 2883.203288]  [<ffffffff8106d8a0>] ? rescuer_thread+0x2b0/0x2b0
[ 2883.203294]  [<ffffffff81072248>] ? kthread+0xb8/0xd0
[ 2883.203301]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.203309]  [<ffffffff814a097c>] ? ret_from_fork+0x7c/0xb0
[ 2883.203315]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.203332] INFO: task btrfs-transacti:2126 blocked for more than 120 seconds.
[ 2883.203336]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.203338] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.203341] btrfs-transacti D ffff8803e98a2860     0  2126      2 0x00000000
[ 2883.203348]  ffff8803e98a2410 0000000000000046 0000000000013200 ffff8803e9af3fd8
[ 2883.203355]  0000000000013200 ffff8803e98a2410 ffff88045fa13af0 ffff8803e9af3b48
[ 2883.203361]  ffff88045fdb2928 0000000000000002 ffffffff8149d750 ffff8803e9af3c08
[ 2883.203368] Call Trace:
[ 2883.203378]  [<ffffffff8149d750>] ? bit_wait+0x40/0x40
[ 2883.203386]  [<ffffffff8149d144>] ? io_schedule+0x94/0x120
[ 2883.203394]  [<ffffffff8149d773>] ? bit_wait_io+0x23/0x40
[ 2883.203402]  [<ffffffff8149d495>] ? __wait_on_bit+0x55/0x80
[ 2883.203410]  [<ffffffff810f7c0e>] ? wait_on_page_bit+0x6e/0x80
[ 2883.203418]  [<ffffffff8108c490>] ? autoremove_wake_function+0x30/0x30
[ 2883.203425]  [<ffffffff810f7cf0>] ? filemap_fdatawait_range+0xd0/0x160
[ 2883.203459]  [<ffffffffa03c81d2>] ? btrfs_wait_ordered_range+0x62/0x120 [btrfs]
[ 2883.203490]  [<ffffffffa03ef4f7>] ? __btrfs_write_out_cache+0x327/0x3e0 [btrfs]
[ 2883.203520]  [<ffffffffa03ba36c>] ? btrfs_iget+0x20c/0x6b0 [btrfs]
[ 2883.203551]  [<ffffffffa03ef899>] ? btrfs_write_out_cache+0xa9/0xe0 [btrfs]
[ 2883.203578]  [<ffffffffa039f80a>] ? btrfs_write_dirty_block_groups+0x59a/0x680 [btrfs]
[ 2883.203606]  [<ffffffffa03ad495>] ? commit_cowonly_roots+0x185/0x240 [btrfs]
[ 2883.203634]  [<ffffffffa03af3a4>] ? btrfs_commit_transaction+0x424/0x9b0 [btrfs]
[ 2883.203660]  [<ffffffffa03af9bb>] ? start_transaction+0x8b/0x500 [btrfs]
[ 2883.203686]  [<ffffffffa03ab07d>] ? transaction_kthread+0x1ad/0x240 [btrfs]
[ 2883.203712]  [<ffffffffa03aaed0>] ? btrfs_cleanup_transaction+0x4d0/0x4d0 [btrfs]
[ 2883.203719]  [<ffffffff81072248>] ? kthread+0xb8/0xd0
[ 2883.203725]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.203732]  [<ffffffff814a097c>] ? ret_from_fork+0x7c/0xb0
[ 2883.203738]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.203745] INFO: task kworker/u16:0:2287 blocked for more than 120 seconds.
[ 2883.203748]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.203751] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.203754] kworker/u16:0   D ffff88044b336d50     0  2287      2 0x00000000
[ 2883.203784] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 2883.203788]  ffff88044b336900 0000000000000046 0000000000013200 ffff880362893fd8
[ 2883.203795]  0000000000013200 ffff88044b336900 ffff8803b86bef00 ffff8803e9a761f0
[ 2883.203802]  ffff8803e9a761f0 0000000000000001 0000000000000000 ffff8803b45650c0
[ 2883.203808] Call Trace:
[ 2883.203835]  [<ffffffffa03ae717>] ? wait_current_trans.isra.22+0x97/0xf0 [btrfs]
[ 2883.203843]  [<ffffffff8108c460>] ? prepare_to_wait_event+0xf0/0xf0
[ 2883.203869]  [<ffffffffa03afbd8>] ? start_transaction+0x2a8/0x500 [btrfs]
[ 2883.203898]  [<ffffffffa03b7970>] ? btrfs_finish_ordered_io+0x250/0x5c0 [btrfs]
[ 2883.203905]  [<ffffffff8100b579>] ? __switch_to+0x119/0x580
[ 2883.203934]  [<ffffffffa03dbb8f>] ? normal_work_helper+0xaf/0x190 [btrfs]
[ 2883.203943]  [<ffffffff8106cd47>] ? process_one_work+0x167/0x380
[ 2883.203951]  [<ffffffff8106d9b4>] ? worker_thread+0x114/0x480
[ 2883.203958]  [<ffffffff8106d8a0>] ? rescuer_thread+0x2b0/0x2b0
[ 2883.203964]  [<ffffffff81072248>] ? kthread+0xb8/0xd0
[ 2883.203970]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.203977]  [<ffffffff814a097c>] ? ret_from_fork+0x7c/0xb0
[ 2883.203983]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.203988] INFO: task kworker/u16:1:2395 blocked for more than 120 seconds.
[ 2883.203991]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.203994] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.203997] kworker/u16:1   D ffff880362469780     0  2395      2 0x00000000
[ 2883.204022] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs]
[ 2883.204025]  ffff880362469330 0000000000000046 0000000000013200 ffff88036328ffd8
[ 2883.204032]  0000000000013200 ffff880362469330 ffff8803bea0d9a0 ffff8803bea0da18
[ 2883.204038]  ffff8803e990d840 ffff88044dc33100 0000000000000000 0000000000000000
[ 2883.204045] Call Trace:
[ 2883.204074]  [<ffffffffa03c7a9d>] ? btrfs_start_ordered_extent+0x4d/0x90 [btrfs]
[ 2883.204081]  [<ffffffff8108c460>] ? prepare_to_wait_event+0xf0/0xf0
[ 2883.204108]  [<ffffffffa03c7afc>] ? btrfs_run_ordered_extent_work+0x1c/0x30 [btrfs]
[ 2883.204133]  [<ffffffffa03dbb8f>] ? normal_work_helper+0xaf/0x190 [btrfs]
[ 2883.204141]  [<ffffffff8106cd47>] ? process_one_work+0x167/0x380
[ 2883.204148]  [<ffffffff8106a993>] ? pwq_activate_delayed_work+0x23/0x30
[ 2883.204156]  [<ffffffff8106d9b4>] ? worker_thread+0x114/0x480
[ 2883.204163]  [<ffffffff8106d8a0>] ? rescuer_thread+0x2b0/0x2b0
[ 2883.204169]  [<ffffffff81072248>] ? kthread+0xb8/0xd0
[ 2883.204175]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204181]  [<ffffffff814a097c>] ? ret_from_fork+0x7c/0xb0
[ 2883.204187]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204192] INFO: task kworker/u16:3:2461 blocked for more than 120 seconds.
[ 2883.204195]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.204198] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.204200] kworker/u16:3   D ffff88042b771740     0  2461      2 0x00000000
[ 2883.204225] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs]
[ 2883.204228]  ffff88042b7712f0 0000000000000046 0000000000013200 ffff880349157fd8
[ 2883.204234]  0000000000013200 ffff88042b7712f0 ffff8803a5362808 ffff8803a5362880
[ 2883.204241]  ffff8803e990d840 ffff88044dc33100 0000000000000000 0000000000000000
[ 2883.204248] Call Trace:
[ 2883.204274]  [<ffffffffa03c7a9d>] ? btrfs_start_ordered_extent+0x4d/0x90 [btrfs]
[ 2883.204281]  [<ffffffff8108c460>] ? prepare_to_wait_event+0xf0/0xf0
[ 2883.204306]  [<ffffffffa03c7afc>] ? btrfs_run_ordered_extent_work+0x1c/0x30 [btrfs]
[ 2883.204330]  [<ffffffffa03dbb8f>] ? normal_work_helper+0xaf/0x190 [btrfs]
[ 2883.204337]  [<ffffffff8106cd47>] ? process_one_work+0x167/0x380
[ 2883.204344]  [<ffffffff8106a993>] ? pwq_activate_delayed_work+0x23/0x30
[ 2883.204352]  [<ffffffff8106d9b4>] ? worker_thread+0x114/0x480
[ 2883.204359]  [<ffffffff8106d8a0>] ? rescuer_thread+0x2b0/0x2b0
[ 2883.204365]  [<ffffffff81072248>] ? kthread+0xb8/0xd0
[ 2883.204371]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204377]  [<ffffffff814a097c>] ? ret_from_fork+0x7c/0xb0
[ 2883.204383]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204388] INFO: task kworker/u16:4:2462 blocked for more than 120 seconds.
[ 2883.204391]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.204394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.204396] kworker/u16:4   D ffff880348f257c0     0  2462      2 0x00000000
[ 2883.204421] Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
[ 2883.204424]  ffff880348f25370 0000000000000046 0000000000013200 ffff880348f2bfd8
[ 2883.204431]  0000000000013200 ffff880348f25370 ffff8803b86bef00 ffff8803e9a761f0
[ 2883.204437]  ffff8803e9a761f0 0000000000000001 0000000000000000 ffff8803b4565200
[ 2883.204444] Call Trace:
[ 2883.204468]  [<ffffffffa03ae717>] ? wait_current_trans.isra.22+0x97/0xf0 [btrfs]
[ 2883.204475]  [<ffffffff8108c460>] ? prepare_to_wait_event+0xf0/0xf0
[ 2883.204499]  [<ffffffffa03afbd8>] ? start_transaction+0x2a8/0x500 [btrfs]
[ 2883.204525]  [<ffffffffa03b7970>] ? btrfs_finish_ordered_io+0x250/0x5c0 [btrfs]
[ 2883.204531]  [<ffffffff8100b579>] ? __switch_to+0x119/0x580
[ 2883.204556]  [<ffffffffa03dbb8f>] ? normal_work_helper+0xaf/0x190 [btrfs]
[ 2883.204564]  [<ffffffff8106cd47>] ? process_one_work+0x167/0x380
[ 2883.204571]  [<ffffffff8106d9b4>] ? worker_thread+0x114/0x480
[ 2883.204579]  [<ffffffff8106d8a0>] ? rescuer_thread+0x2b0/0x2b0
[ 2883.204584]  [<ffffffff81072248>] ? kthread+0xb8/0xd0
[ 2883.204591]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204597]  [<ffffffff814a097c>] ? ret_from_fork+0x7c/0xb0
[ 2883.204603]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204607] INFO: task kworker/u16:5:2473 blocked for more than 120 seconds.
[ 2883.204610]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.204613] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.204616] kworker/u16:5   D ffff880426eb39c0     0  2473      2 0x00000000
[ 2883.204639] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs]
[ 2883.204642]  ffff880426eb3570 0000000000000046 0000000000013200 ffff8800681b7fd8
[ 2883.204648]  0000000000013200 ffff880426eb3570 ffff8803a8f4c808 ffff8803a8f4c880
[ 2883.204655]  ffff8803e990d840 ffff88044dc33100 0000000000000000 0000000000000000
[ 2883.204662] Call Trace:
[ 2883.204688]  [<ffffffffa03c7a9d>] ? btrfs_start_ordered_extent+0x4d/0x90 [btrfs]
[ 2883.204695]  [<ffffffff8108c460>] ? prepare_to_wait_event+0xf0/0xf0
[ 2883.204720]  [<ffffffffa03c7afc>] ? btrfs_run_ordered_extent_work+0x1c/0x30 [btrfs]
[ 2883.204743]  [<ffffffffa03dbb8f>] ? normal_work_helper+0xaf/0x190 [btrfs]
[ 2883.204750]  [<ffffffff8106cd47>] ? process_one_work+0x167/0x380
[ 2883.204758]  [<ffffffff8106a993>] ? pwq_activate_delayed_work+0x23/0x30
[ 2883.204765]  [<ffffffff8106d9b4>] ? worker_thread+0x114/0x480
[ 2883.204772]  [<ffffffff8106d8a0>] ? rescuer_thread+0x2b0/0x2b0
[ 2883.204778]  [<ffffffff81072248>] ? kthread+0xb8/0xd0
[ 2883.204784]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204790]  [<ffffffff814a097c>] ? ret_from_fork+0x7c/0xb0
[ 2883.204796]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204801] INFO: task kworker/u16:6:2480 blocked for more than 120 seconds.
[ 2883.204805]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.204807] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.204810] kworker/u16:6   D ffff88044d610e50     0  2480      2 0x00000000
[ 2883.204832] Workqueue: btrfs-flush_delalloc btrfs_flush_delalloc_helper [btrfs]
[ 2883.204835]  ffff88044d610a00 0000000000000046 0000000000013200 ffff880347a0ffd8
[ 2883.204842]  0000000000013200 ffff88044d610a00 ffff8803a80adcd0 ffff8803a80add48
[ 2883.204849]  ffff8803e990d840 ffff88044dc33100 0000000000000000 0000000000000000
[ 2883.204855] Call Trace:
[ 2883.204880]  [<ffffffffa03c7a9d>] ? btrfs_start_ordered_extent+0x4d/0x90 [btrfs]
[ 2883.204887]  [<ffffffff8108c460>] ? prepare_to_wait_event+0xf0/0xf0
[ 2883.204911]  [<ffffffffa03c7afc>] ? btrfs_run_ordered_extent_work+0x1c/0x30 [btrfs]
[ 2883.204933]  [<ffffffffa03dbb8f>] ? normal_work_helper+0xaf/0x190 [btrfs]
[ 2883.204941]  [<ffffffff8106cd47>] ? process_one_work+0x167/0x380
[ 2883.204948]  [<ffffffff8106a993>] ? pwq_activate_delayed_work+0x23/0x30
[ 2883.204955]  [<ffffffff8106d9b4>] ? worker_thread+0x114/0x480
[ 2883.204962]  [<ffffffff8106d8a0>] ? rescuer_thread+0x2b0/0x2b0
[ 2883.204968]  [<ffffffff81072248>] ? kthread+0xb8/0xd0
[ 2883.204974]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204980]  [<ffffffff814a097c>] ? ret_from_fork+0x7c/0xb0
[ 2883.204986]  [<ffffffff81072190>] ? kthread_create_on_node+0x170/0x170
[ 2883.204992] INFO: task rsync:2524 blocked for more than 120 seconds.
[ 2883.204995]       Not tainted 3.17.0-rc3-cu3 #1
[ 2883.204997] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 2883.205000] rsync           D ffff880347b63840     0  2524   2523 0x00000000
[ 2883.205006]  ffff880347b633f0 0000000000000082 0000000000013200 ffff880347b2bfd8
[ 2883.205013]  0000000000013200 ffff880347b633f0 ffff8803f73af660 ffff880347b2baa0
[ 2883.205019]  ffff8803f73af664 ffff880347b633f0 00000000ffffffff ffff8803f73af668
[ 2883.205026] Call Trace:
[ 2883.205035]  [<ffffffff8149d290>] ? schedule_preempt_disabled+0x20/0x60
[ 2883.205043]  [<ffffffff8149ed1b>] ? __mutex_lock_slowpath+0x14b/0x1d0
[ 2883.205053]  [<ffffffff810a7f6a>] ? del_timer_sync+0x4a/0x60
[ 2883.205061]  [<ffffffff8149edb6>] ? mutex_lock+0x16/0x25
[ 2883.205085]  [<ffffffffa03c789e>] ? btrfs_wait_ordered_roots+0x3e/0x1f0 [btrfs]
[ 2883.205109]  [<ffffffffa039819a>] ? flush_space+0x1ea/0x4b0 [btrfs]
[ 2883.205131]  [<ffffffffa0391a65>] ? get_alloc_profile+0x85/0x1c0 [btrfs]
[ 2883.205153]  [<ffffffffa0397f51>] ? can_overcommit+0x81/0xe0 [btrfs]
[ 2883.205176]  [<ffffffffa0398800>] ? reserve_metadata_bytes+0x1c0/0x3d0 [btrfs]
[ 2883.205200]  [<ffffffffa0398e58>] ? btrfs_block_rsv_add+0x28/0x50 [btrfs]
[ 2883.205224]  [<ffffffffa03afd72>] ? start_transaction+0x442/0x500 [btrfs]
[ 2883.205249]  [<ffffffffa03a2694>] ? btrfs_check_dir_item_collision+0x74/0x100 [btrfs]
[ 2883.205276]  [<ffffffffa03bbe1f>] ? btrfs_rename2+0x15f/0x6d0 [btrfs]
[ 2883.205283]  [<ffffffff8105fbeb>] ? capable_wrt_inode_uidgid+0x4b/0x60
[ 2883.205290]  [<ffffffff8105fb00>] ? cap_validate_magic+0x100/0x100
[ 2883.205300]  [<ffffffff811685a1>] ? vfs_rename+0x5a1/0x790
[ 2883.205307]  [<ffffffff81164700>] ? follow_managed+0x2a0/0x2b0
[ 2883.205315]  [<ffffffff81168c13>] ? SYSC_renameat2+0x483/0x530
[ 2883.205325]  [<ffffffff81175bad>] ? notify_change+0x2cd/0x380
[ 2883.205331]  [<ffffffff8115d918>] ? __sb_end_write+0x28/0x60
[ 2883.205341]  [<ffffffff8125a528>] ? lockref_put_or_lock+0x48/0x80
[ 2883.205348]  [<ffffffff811709ed>] ? dput+0xad/0x170
[ 2883.205355]  [<ffffffff81163ffd>] ? path_put+0xd/0x20
[ 2883.205363]  [<ffffffff8115a321>] ? SyS_chmod+0x41/0x90
[ 2883.205370]  [<ffffffff814a0a29>] ? system_call_fastpath+0x16/0x1b

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-01 13:33 kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds john terragon
@ 2014-09-01 16:02 ` Chris Mason
  2014-09-01 16:36   ` john terragon
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Mason @ 2014-09-01 16:02 UTC (permalink / raw)
  To: john terragon, linux-btrfs

On 09/01/2014 09:33 AM, john terragon wrote:
> Hi.
> 
> I'm not sure whether this is related to the hung-task problem that has
> been discussed on this ml for a while, but I've been seeing this
> seemingly related problem with 3.15, 3.16 and now 3.17-rc3 (which, if
> I'm not mistaken, should include a fix for the hung-task problem). So
> here it is: I have a USB flash drive with btrfs (on top of dm-crypt),
> usually mounted with these options:
> 
> rw,noatime,compress-force=zlib,ssd,space_cache,autodefrag
> 
> When I try to rsync the USB flash drive I get a truckload of "INFO:
> task rsync:2524 blocked for more than 120 seconds" messages, as you can
> see below. The rsync process slows to an almost complete stop and I
> can't even kill it. I know the USB key is OK because I've tried the
> same thing with ext4 and xfs and everything went fine.

This does have all of our fixes for hangs.  Does the rsync eventually
complete?  Or do we just sit there forever?

-chris


* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-01 16:02 ` Chris Mason
@ 2014-09-01 16:36   ` john terragon
  2014-09-02  5:20     ` Duncan
  0 siblings, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-01 16:36 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

I was trying it again and it seems to have completed, albeit very
slowly (even for a USB flash drive). Was the 3.14 series the last one
immune to this problem? Should I try the latest 3.14.x?

Thanks
John

On Mon, Sep 1, 2014 at 6:02 PM, Chris Mason <clm@fb.com> wrote:
> This does have all of our fixes for hangs.  Does the rsync eventually
> complete?  Or do we just sit there forever?
>
> -chris


* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-01 16:36   ` john terragon
@ 2014-09-02  5:20     ` Duncan
  2014-09-02  6:12       ` john terragon
  2014-09-03  3:32       ` Zygo Blaxell
  0 siblings, 2 replies; 19+ messages in thread
From: Duncan @ 2014-09-02  5:20 UTC (permalink / raw)
  To: linux-btrfs

john terragon posted on Mon, 01 Sep 2014 18:36:49 +0200 as excerpted:

> I was trying it again and it seems to have completed, albeit very slowly
> (even for a USB flash drive). Was the 3.14 series the last one immune to
> this problem? Should I try the latest 3.14.x?

The 3.14 series was before the switch to generic kworker threads, while 
btrfs still had its own custom work-queue threads.  There was known to be 
a very specific problem with the kworker threads, but in 3.17-rc3 that 
should be fixed.

So it may well be a problem with btrfs in general, at least as it exists 
today and historically, in which case 3.14.x won't help you much if at 
all.

But I'd definitely recommend trying it.  If 3.14 is significantly faster 
and it's repeatedly so, then there's obviously some other regression, 
either with kworker threads or with something else, since then.  If not, 
then at least we know for sure kworker threads aren't a factor, since 
3.14 was previous to them entering the picture.


The other possibility I'm aware of would be erase-block related.  I see 
you're using autodefrag, so it shouldn't be direct file fragmentation, 
but particularly if the filesystem has been used for some time, it might 
be the firmware trying to shuffle things around and having trouble: 
having already used up all the known-free erase blocks, it has to stop 
and free one by shifting things around every time it needs another one, 
and that's what's taking the time.

What does btrfs fi show say about free space?  (The interesting bit is 
the device line's size vs. used, one line per device for multi-device 
btrfs, not the top line.)  And what does btrfs fi df say for data and 
metadata (total vs. used)?

For btrfs fi df, ideally the spread between used and total for data and 
metadata shouldn't be too large (a few gig for data and a gig or so for 
metadata isn't too bad, assuming a large enough device, of course).  If 
it is, a balance may be in order, perhaps using the -dusage=20 and/or 
-musage=20 style options to keep it from rebalancing everything (read up 
on the wiki and choose your number: 5 might be good if there's plenty of 
room, you might need 50 or higher if you're close to full, and above 
about 80 you might as well just use -d or -m and forget the usage bit).
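For example, the checks and a filtered balance might look like this 
(device and mount point are placeholders for yours):

```shell
# Allocation vs. usage: per-device size/used from show,
# per-profile total vs. used from df
btrfs fi show /dev/mapper/usbcrypt
btrfs fi df /mnt/usb

# Rewrite only chunks that are <=20% used, rather than rebalancing everything
btrfs balance start -dusage=20 -musage=20 /mnt/usb
```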

Similarly, for btrfs fi show, you want as much space as possible left, 
several gigs at least if your device isn't too small for that to be 
practical.  Again, if btrfs fi df is out of balance it'll use more space 
in show as well, and a balance should retrieve some of it.

Once you have some space to work with, try running fstrim on the device 
(or before the balance, if you suspect your firmware is SERIOUSLY out of 
space and shuffling, as that'll slow the balance down too, and again 
after).  It may or may not work on that device, but if it does and the 
firmware /was/ out of space and having to shuffle hard, it could improve 
performance *DRAMATICALLY*.  The reason is that, on devices where it 
works, fstrim tells the firmware which blocks are free, allowing it more 
flexibility in erase-block shuffling.

If that makes a big difference, you can /try/ the discard mount option, 
tho doing the trim/discard as part of normal operations can slow them 
down some too.  The alternative would be simply to run fstrim 
periodically, perhaps every Nth rsync or some such.  Note that, as the 
fstrim manpage says, the output of fstrim run repeatedly will be the 
same, since it only knows which areas are candidates to trim, not which 
ones are already trimmed; but it shouldn't hurt the device to fstrim it 
repeatedly, and if you do it every N rsyncs, it should keep things from 
getting too bad again.
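E.g. (mount point is a placeholder):

```shell
# One-off trim; -v reports how many bytes were passed down as discards
fstrim -v /mnt/usb

# Or, instead of mounting with -o discard, trim periodically,
# e.g. a weekly root cron entry:
# 0 3 * * 0  /sbin/fstrim /mnt/usb
```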

The other thing to consider, if you haven't already, is the ssd_spread 
mount option.  The documentation suggests it can be helpful on lower-
quality SSDs and USB sticks, which fits your use-case, so I'd try it.  
Tho it probably won't work at its best unless you do a fresh mkfs (or a 
near-full balance with it enabled).  But it's something to at least 
consider and possibly try if you haven't; depending on the firmware and 
erase-block layout, it could help.
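Something like this, with the rest of the options as you use them now 
(paths are placeholders):

```shell
# ssd_spread implies ssd, so the plain ssd option is dropped here
mount -o rw,noatime,compress-force=zlib,ssd_spread,space_cache,autodefrag \
      /dev/mapper/usbcrypt /mnt/usb
```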

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02  5:20     ` Duncan
@ 2014-09-02  6:12       ` john terragon
  2014-09-02  6:40         ` Duncan
  2014-09-03  3:32       ` Zygo Blaxell
  1 sibling, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-02  6:12 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

I will definitely try the latest 3.14.x (I never had any problem of this
kind with it), and I'll look into the other possibilities you pointed
out. However, what I can tell you right now is this:

- the filesystem was "new". I've been bitten by this bug with 3.15 and
  3.16, and I kept trying to do the same thing (create the fs, rsync or
  cp the same stuff) to see if it got better.

- there does not seem to be a space problem, because the volume is
  about 14G and in the end about 8G is usually occupied (when the
  process terminates). I've always used compression one way or another,
  either forced or not, and either lzo or zlib. Maybe I should try
  without compression.

- it's not one specific USB flash drive. I've tried several and I
  always get the same behaviour.

- the process freezes for several minutes. It's completely frozen, no
  I/O. So even if the firmware of the USB key is shuffling things
  around and blocking everything, it shouldn't take all that time for
  such a small amount of data. Also, as I mentioned, I tried ext4 and
  xfs and the data seems to be written continuously, without any big
  stall (even though I realize that ext4 and xfs have very different
  write patterns from a CoW filesystem, so I can't be sure that's
  significant).

Thanks
John






On Tue, Sep 2, 2014 at 7:20 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> john terragon posted on Mon, 01 Sep 2014 18:36:49 +0200 as excerpted:
>
>> I was trying it again and it seems to have completed, albeit very slowly
>> (even for an usb flash drive). Was the 3.14 series the last immune one
>> from this problem? Should I try the latest 3.14.x?
>
> The 3.14 series was before the switch to generic kworker threads, while
> btrfs still had its own custom work-queue threads.  There was known to be
> a very specific problem with the kworker threads, but in 3.17-rc3 that
> should be fixed.
>
> So it may well be a problem with btrfs in general, at least as it exists
> today and historically, in which case 3.14.x won't help you much if at
> all.
>
> But I'd definitely recommend trying it.  If 3.14 is significantly faster
> and it's repeatedly so, then there's obviously some other regression,
> either with kworker threads or with something else, since then.  If not,
> then at least we know for sure kworker threads aren't a factor, since
> 3.14 was previous to them entering the picture.
>
>
> The other possibility I'm aware of would be erase-block related.  I see
> you're using autodefrag so it shouldn't be direct file fragmentation, but
> particularly if the filesystem has been used for some time, it might be
> the firmware trying to shuffle things around and having trouble due to
> having already used up all the known-free erase blocks so it's having to
> stop and free one by shifting things around every time it needs another
> one, and that's what's taking the time.
>
> What does btrfs fi show say about free space (the device line (lines, for
> multi-device btrfs) size vs. used, not the top line, is the interesting
> bit)?  What does btrfs fi df say for data and metadata (total vs. used)?
>
> For btrfs fi df ideally your data/metadata spread between used and total
> shouldn't be too large (a few gig for data and a gig or so for metadata
> isn't too bad, assuming a large enough device, of course).  If it is, a
> balance may be in order, perhaps using the -dusage=20 and/or -musage=20
> style options to keep it from rebalancing everything (read up on the wiki
> and choose your number, 5 might be good if there's plenty of room, you
> might need 50 or higher if you're close to full, more than about 80 and
> you might as well just use -d or -m and forget the usage bit).
>
> Similarly, for btrfs fi show, you want as much space as possible left,
> several gigs at least if your device isn't too small for that to be
> practical.  Again, if btrfs fi df is out of balance it'll use more space
> in show as well, and a balance should retrieve some of it.
>
> Once you have some space to work with (or before the balance if you
> suspect your firmware is SERIOUSLY out of space and shuffling, as that'll
> slow the balance down too, and again after), try running fstrim on the
> device.  It may or may not work on that device, but if it does and the
> firmware /was/ out of space and having to shuffle hard, it could improve
> performance *DRAMATICALLY*.  The reason being that on devices where it
> works, fstrim will tell the firmware what blocks are free, allowing it
> more flexibility in erase-block shuffling.
>
> If that makes a big difference, you can /try/ the discard mount option.
> Tho doing the trim/discard as part of normal operations can slow them
> down some too.  The alternative would be to simply run fstrim
> periodically, perhaps every Nth rsync or some such.  Note that as the
> fstrim manpage says, the output of fstrim run repeatedly will be the
> same, since it only knows what areas are candidates to trim, not which
> ones are already trimmed, but it shouldn't hurt the device any to
> repeatedly fstrim it, and if you do it every N rsyncs, it should keep
> things from getting too bad again.
>
> The other thing to consider if you haven't already is the ssd_spread
> mount option.  The documentation suggests it can be helpful on lower
> quality SSDs and USB sticks which fits your use-case, so I'd try it.  Tho
> it probably won't work at its ideal unless you do a fresh mkfs (or near
> full balance with it enabled).  But it's something to at least consider
> and possibly try if you haven't.  Depending on the firmware and erase-
> block layout, it could help.
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
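Duncan's advice above boils down to a short maintenance sequence. A hedged sketch (device and mount point names are illustrative; pick the usage= thresholds per the wiki advice he cites):

```shell
# Check how far data/metadata allocation has drifted from actual use
btrfs filesystem df /mnt/usb
btrfs filesystem show /dev/mapper/usbcrypt

# Reclaim mostly-empty chunks only, so the balance doesn't rewrite
# everything; raise the percentages (50 or higher) if close to full
btrfs balance start -dusage=20 -musage=20 /mnt/usb

# Tell the firmware which blocks are free (only helps if the device
# honors TRIM through the whole stack)
fstrim -v /mnt/usb
```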

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02  6:12       ` john terragon
@ 2014-09-02  6:40         ` Duncan
  2014-09-02 19:56           ` john terragon
  2014-09-03  2:44           ` Chris Murphy
  0 siblings, 2 replies; 19+ messages in thread
From: Duncan @ 2014-09-02  6:40 UTC (permalink / raw)
  To: linux-btrfs

john terragon posted on Tue, 02 Sep 2014 08:12:36 +0200 as excerpted:

> I will definitely try the latest 3.14.x (never had any problem of this
> kind with it). And I'll look into the other possibilities you pointed
> out. However what I can tell you right now is this:
> 
> -the filesystem was "new". I've been bitten by this bug with 3.15 and
> 3.16 and I kept
>  trying to do the same thing (create the fs, rsync or cp the same
> stuff) to see if it
>  got better.

OK.  I had read your post as implying that the filesystem had been around 
since before 3.14, in which case the firmware shuffling could well have 
been a factor.  If it was a brand new filesystem, then likely not, as 
mkfs.btrfs tries to do a trim of the whole filesystem range before it 
sets up.

But that does remind me of one other possibility I had thought to mention 
and then forgot... that's even more likely now that it's known to be a 
new filesystem...

I don't recall the btrfs-progs version, but somewhere along the line one 
other thing of potential interest changed:

Mkfs.btrfs used to default to 4 KiB node/leaf sizes; nowadays it defaults 
to 16 KiB, as that's far better for most usage.  I wonder if USB sticks 
are an exception...

Since this is set at mkfs time, trying a 3.14 series kernel with current 
mkfs.btrfs defaults shouldn't change things; if the 16 KiB nodesize is 
why it's slow, it should still be slow with the 3.14 series kernel.

Conversely, if this is the problem, specifically creating the filesystem 
with --nodesize 4k should fix it, and it should stay fixed regardless of 
what kernel you use with it, 3.14, 3.16, 3.17-rcX, shouldn't matter.
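
Testing that hypothesis is a one-liner at mkfs time. A sketch (device name illustrative; the mkfs.btrfs option is --nodesize, with -l/--leafsize setting the leaf size separately on older btrfs-progs):

```shell
# Recreate the filesystem with 4 KiB nodes instead of the 16 KiB default
mkfs.btrfs --nodesize 4096 /dev/mapper/usbcrypt
```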

And that'd be a very useful thing to put on the wiki as well, should it 
be found to be the case.  So please test and post if it helps (and feel 
free to put it on the wiki too if it works)!  =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02  6:40         ` Duncan
@ 2014-09-02 19:56           ` john terragon
  2014-09-02 20:10             ` Chris Mason
  2014-09-03  2:44           ` Chris Murphy
  1 sibling, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-02 19:56 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

Nice...now I get the hung task even with 3.14.17.... And I tried with
4K for node and leaf size...same result. And to top it all off, today
I've been bitten by the bug also on my main root fs (which is on two
fast ssd), although with 3.16.1.

Is it at least safe for the data? I mean, as long as the hung process
terminates and no other error shows up, can I at least be sure that
the data written is correct?

Thanks
John


On Tue, Sep 2, 2014 at 8:40 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> john terragon posted on Tue, 02 Sep 2014 08:12:36 +0200 as excerpted:
>
>> I will definitely try the latest 3.14.x (never had any problem of this
>> kind with it). And I'll look into the other possibilities you pointed
>> out. However what I can tell you right now is this:
>>
>> -the filesystem was "new". I've been bitten by this bug with 3.15 and
>> 3.16 and I kept
>>  trying to do the same thing (create the fs, rsync or cp the same
>> stuff) to see if it
>>  got better.
>
> OK.  I had read your post as implying that the filesystem had been around
> since before 3.14, in which case the firmware shuffling could well have
> been a factor.  If it was a brand new filesystem, then likely not, as
> mkfs.btrfs tries to do a trim of the whole filesystem range before it
> sets up.
>
> But that does remind me of one other possibility I had thought to mention
> and then forgot... that's even more likely now that it's known to be a
> new filesystem...
>
> I don't recall the btrfs-progs version, but somewhere along the line one
> other thing of potential interest changed:
>
> Mkfs.btrfs used to default to 4 KiB node/leaf sizes; nowadays it defaults
> to 16 KiB, as that's far better for most usage.  I wonder if USB sticks
> are an exception...
>
> Since this is set at mkfs time, trying a 3.14 series kernel with current
> mkfs.btrfs defaults shouldn't change things; if the 16 KiB nodesize is
> why it's slow, it should still be slow with the 3.14 series kernel.
>
> Conversely, if this is the problem, specifically creating the filesystem
> with --nodesize 4k should fix it, and it should stay fixed regardless of
> what kernel you use with it, 3.14, 3.16, 3.17-rcX, shouldn't matter.
>
> And that'd be a very useful thing to put on the wiki as well, should it
> be found to be the case.  So please test and post if it helps (and feel
> free to put it on the wiki too if it works)!  =:^)
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02 19:56           ` john terragon
@ 2014-09-02 20:10             ` Chris Mason
  2014-09-02 20:23               ` john terragon
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Mason @ 2014-09-02 20:10 UTC (permalink / raw)
  To: john terragon, Duncan; +Cc: linux-btrfs

> On 09/02/2014 03:56 PM, john terragon wrote:
> Nice...now I get the hung task even with 3.14.17.... And I tried with
> 4K for node and leaf size...same result. And to top it all off, today
> I've been bitten by the bug also on my main root fs (which is on two
> fast ssd), although with 3.16.1.
> 
> Is it at least safe for the data? I mean, as long as the hung process
> terminates and no other error shows up, can I at least be sure that
> the data written is correct?

Your traces are a little different.  The ENOSPC code is throttling
things to make sure you have enough room for the writes you're doing.
The code we have in 3.17-rc3 (or my for-linus branch) is the best
choice right now.  You can pull that down to 3.16 if you want all the
fixes on a more stable kernel.

Nailing down the ENOSPC code is going to be a little different.  I think
autodefrag probably isn't interacting well with being short on space and
encryption.  This is leading to much more IO than we'd normally do, and
dm-crypt makes it fairly intensive.

Can you try flipping off autodefrag?

-chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02 20:10             ` Chris Mason
@ 2014-09-02 20:23               ` john terragon
  2014-09-02 20:48                 ` john terragon
  0 siblings, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-02 20:23 UTC (permalink / raw)
  To: Chris Mason; +Cc: Duncan, linux-btrfs

I don't know what to tell you about the ENOSPC code being heavily
involved. At this point I'm using this simple test to see if things
improve:

-freshly created btrfs on dmcrypt,
-rsync some stuff (since the fs is empty I could just use cp but I
keep the test the same as it was when I had the problem for the first
time)
-note: the rsynced stuff is about the size of the volume but with
compression I always end up with 1/2 to 3/4 free space

I'm not sure how I even get close to involving the ENOSPC code, but
probably I'm not fully aware of the inner workings of btrfs.

> Can you try flipping off autodefrag?

As soon as the damn unkillable rsync decides to obey the kill -9...

Thanks

John

On Tue, Sep 2, 2014 at 10:10 PM, Chris Mason <clm@fb.com> wrote:
>> On 09/02/2014 03:56 PM, john terragon wrote:
>> Nice...now I get the hung task even with 3.14.17.... And I tried with
>> 4K for node and leaf size...same result. And to top it all off, today
>> I've been bitten by the bug also on my main root fs (which is on two
>> fast ssd), although with 3.16.1.
>>
>> Is it at least safe for the data? I mean, as long as the hung process
>> terminates and no other error shows up, can I at least be sure that
>> the data written is correct?
>
> Your traces are a little different.  The ENOSPC code is throttling
> things to make sure you have enough room for the writes you're doing.
> The code we have in 3.17-rc3 (or my for-linus branch) is the best
> choice right now.  You can pull that down to 3.16 if you want all the
> fixes on a more stable kernel.
>
> Nailing down the ENOSPC code is going to be a little different.  I think
> autodefrag probably isn't interacting well with being short on space and
> encryption.  This is leading to much more IO than we'd normally do, and
> dm-crypt makes it fairly intensive.
>
> Can you try flipping off autodefrag?
>
> -chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02 20:23               ` john terragon
@ 2014-09-02 20:48                 ` john terragon
  2014-09-03  1:31                   ` john terragon
  0 siblings, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-02 20:48 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

OK, so I'm using 3.17-rc3, same test on a flash usb drive, no
autodefrag. The situation is even stranger. The rsync is clearly
stuck, it's trying to write the same file for much more than 120 secs.
However dmesg is clean, no "INFO: task kworker/u16:11:1763 blocked for
more than 120 seconds" or anything.
df is responsive but shows no increase in used space.
Consider that with autodefrag this bug is completely "reliable", the
hung-task info starts to show up almost immediately.

Oh wait (I'm live...) now rsync is unstuck, files are being written
and df shows an increase in used space. BUT, still no hung-task
message in the kernel log, even though rsync was actually stuck for
several minutes.

So, to summarize, same conditions except no autodefrag. Result:
process stuck for way more than 120 secs but this time no complaints
in the kernel log.

Thanks
John



On Tue, Sep 2, 2014 at 10:23 PM, john terragon <jterragon@gmail.com> wrote:
> I don't know what to tell you about the ENOSPC code being heavily
> involved. At this point I'm using this simple test to see if things
> improve:
>
> -freshly created btrfs on dmcrypt,
> -rsync some stuff (since the fs is empty I could just use cp but I
> keep the test the same as it was when I had the problem for the first
> time)
> -note: the rsynced stuff is about the size of the volume but with
> compression I always end up with 1/2 to 3/4 free space
>
> I'm not sure how I even get close to involving the ENOSPC code, but
> probably I'm not fully aware of the inner workings of btrfs.
>
>> Can you try flipping off autodefrag?
>
> As soon as the damn unkillable rsync decides to obey the kill -9...
>
> Thanks
>
> John
>
> On Tue, Sep 2, 2014 at 10:10 PM, Chris Mason <clm@fb.com> wrote:
>>> On 09/02/2014 03:56 PM, john terragon wrote:
>>> Nice...now I get the hung task even with 3.14.17.... And I tried with
>>> 4K for node and leaf size...same result. And to top it all off, today
>>> I've been bitten by the bug also on my main root fs (which is on two
>>> fast ssd), although with 3.16.1.
>>>
>>> Is it at least safe for the data? I mean, as long as the hung process
>>> terminates and no other error shows up, can I at least be sure that
>>> the data written is correct?
>>
>> Your traces are a little different.  The ENOSPC code is throttling
>> things to make sure you have enough room for the writes you're doing.
>> The code we have in 3.17-rc3 (or my for-linus branch) is the best
>> choice right now.  You can pull that down to 3.16 if you want all the
>> fixes on a more stable kernel.
>>
>> Nailing down the ENOSPC code is going to be a little different.  I think
>> autodefrag probably isn't interacting well with being short on space and
>> encryption.  This is leading to much more IO than we'd normally do, and
>> dm-crypt makes it fairly intensive.
>>
>> Can you try flipping off autodefrag?
>>
>> -chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02 20:48                 ` john terragon
@ 2014-09-03  1:31                   ` john terragon
  2014-09-03 12:36                     ` Chris Mason
  0 siblings, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-03  1:31 UTC (permalink / raw)
  To: Chris Mason; +Cc: linux-btrfs

Rsync finished. FWIW in the end it reported an average speed of about
900K/sec. Without autodefrag there have been no messages about hung
kworkers even though rsync seemingly keeps getting hung for several
minutes throughout the whole execution.

Thanks
John



On Tue, Sep 2, 2014 at 10:48 PM, john terragon <jterragon@gmail.com> wrote:
> OK, so I'm using 3.17-rc3, same test on a flash usb drive, no
> autodefrag. The situation is even stranger. The rsync is clearly
> stuck, it's trying to write the same file for much more than 120 secs.
> However dmesg is clean, no "INFO: task kworker/u16:11:1763 blocked for
> more than 120 seconds" or anything.
> df is responsive but shows no increase in used space.
> Consider that with autodefrag this bug is completely "reliable", the
> hung-task info starts to show up almost immediately.
>
> Oh wait (I'm live...) now rsync is unstuck, files are being written
> and df shows an increase in used space. BUT, still no hung-task
> message in the kernel log, even though rsync was actually stuck for
> several minutes.
>
> So, to summarize, same conditions except no autodefrag. Result:
> process stuck for way more than 120 secs but this time no complaints
> in the kernel log.
>
> Thanks
> John
>
>
>
> On Tue, Sep 2, 2014 at 10:23 PM, john terragon <jterragon@gmail.com> wrote:
>> I don't know what to tell you about the ENOSPC code being heavily
>> involved. At this point I'm using this simple test to see if things
>> improve:
>>
>> -freshly created btrfs on dmcrypt,
>> -rsync some stuff (since the fs is empty I could just use cp but I
>> keep the test the same as it was when I had the problem for the first
>> time)
>> -note: the rsynced stuff is about the size of the volume but with
>> compression I always end up with 1/2 to 3/4 free space
>>
>> I'm not sure how I even get close to involving the ENOSPC code, but
>> probably I'm not fully aware of the inner workings of btrfs.
>>
>>> Can you try flipping off autodefrag?
>>
>> As soon as the damn unkillable rsync decides to obey the kill -9...
>>
>> Thanks
>>
>> John
>>
>> On Tue, Sep 2, 2014 at 10:10 PM, Chris Mason <clm@fb.com> wrote:
>>>> On 09/02/2014 03:56 PM, john terragon wrote:
>>>> Nice...now I get the hung task even with 3.14.17.... And I tried with
>>>> 4K for node and leaf size...same result. And to top it all off, today
>>>> I've been bitten by the bug also on my main root fs (which is on two
>>>> fast ssd), although with 3.16.1.
>>>>
>>>> Is it at least safe for the data? I mean, as long as the hung process
>>>> terminates and no other error shows up, can I at least be sure that
>>>> the data written is correct?
>>>
>>> Your traces are a little different.  The ENOSPC code is throttling
>>> things to make sure you have enough room for the writes you're doing.
>>> The code we have in 3.17-rc3 (or my for-linus branch) is the best
>>> choice right now.  You can pull that down to 3.16 if you want all the
>>> fixes on a more stable kernel.
>>>
>>> Nailing down the ENOSPC code is going to be a little different.  I think
>>> autodefrag probably isn't interacting well with being short on space and
>>> encryption.  This is leading to much more IO than we'd normally do, and
>>> dm-crypt makes it fairly intensive.
>>>
>>> Can you try flipping off autodefrag?
>>>
>>> -chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02  6:40         ` Duncan
  2014-09-02 19:56           ` john terragon
@ 2014-09-03  2:44           ` Chris Murphy
  2014-09-03  5:37             ` Duncan
  2014-09-03  6:03             ` john terragon
  1 sibling, 2 replies; 19+ messages in thread
From: Chris Murphy @ 2014-09-03  2:44 UTC (permalink / raw)
  To: Btrfs BTRFS


On Sep 2, 2014, at 12:40 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> 
> Mkfs.btrfs used to default to 4 KiB node/leaf sizes; nowadays it defaults 
> to 16 KiB, as that's far better for most usage.  I wonder if USB sticks 
> are an exception…

USB sticks > 1 GB get 16KB nodesize also. At <= 1 GB, mixed-bg is the default, as is 4KB nodesize. Probably because queue/rotational is 1 for USB sticks, they mount without ssd or ssd_spread, which may be unfortunate (I haven't benchmarked it, but I suspect ssd_spread would work well for USB sticks).

It was suggested a while ago that maybe mixed-bg should apply to larger volumes, maybe up to 8GB or 16GB?


Chris Murphy


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-02  5:20     ` Duncan
  2014-09-02  6:12       ` john terragon
@ 2014-09-03  3:32       ` Zygo Blaxell
  1 sibling, 0 replies; 19+ messages in thread
From: Zygo Blaxell @ 2014-09-03  3:32 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 2710 bytes --]

On Tue, Sep 02, 2014 at 05:20:29AM +0000, Duncan wrote:
> suspect your firmware is SERIOUSLY out of space and shuffling, as that'll 
> slow the balance down too, and again after), try running fstrim on the 
> device.  It may or may not work on that device, but if it does and the 
> firmware /was/ out of space and having to shuffle hard, it could improve 
> performance *DRAMATICALLY*.  The reason being that on devices where it 
> works, fstrim will tell the firmware what blocks are free, allowing it 
> more flexibility in erase-block shuffling.
> 
> If that makes a big difference, you can /try/ the discard mount option.  
> Tho doing the trim/discard as part of normal operations can slow them 
> down some too.  The alternative would be to simply run fstrim 
> periodically, perhaps every Nth rsync or some such.  Note that as the 
> fstrim manpage says, the output of fstrim run repeatedly will be the 
> same, since it only knows what areas are candidates to trim, not which 
> ones are already trimmed, but it shouldn't hurt the device any to 
> repeatedly fstrim it, and if you do it every N rsyncs, it should keep 
> things from getting too bad again.

Note that dm-crypt does not pass discards to the underlying block device
by default, for security reasons (john didn't mention the dm-crypt options
he was using).  cryptsetup has the --allow-discards option, and /etc/crypttab
has the discard option to enable this.
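
For reference, a sketch of the two places that option goes (device and mapping names are illustrative):

```shell
# One-off, when opening the mapping by hand:
cryptsetup --allow-discards luksOpen /dev/sdb2 usbcrypt

# Or persistently, via /etc/crypttab (fourth field is the option list):
# usbcrypt  /dev/sdb2  none  luks,discard
```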


I've seen hung task timeouts on several filesystems under 3.14.17 and
3.15.8-9 (mostly on spinning disks with dm-crypt and lvm2 underneath,
but sometimes without either).  I adjusted kernel.hung_task_timeout_secs
from 120 to 960 and started running balances regularly, which helps
mitigate this problem, but not eliminate it (ironically, when a balance
is resumed at boot, it's usually one of the hung tasks in the kernel log).

A fairly good way to see this is to run 'btrfs fi defrag' on large files,
'btrfs balance' with large extents on the filesystem, or write a big
file quickly (1GB+ in <30 sec).  If a filesystem is more than 90% full
and free space is heavily fragmented (especially by rolling snapshots),
allocating large contiguous areas seems to take a long time, and it
seems to block some or all other allocations at the same time (I haven't
rigorously identified these, but it seems to include everything that calls
fsync() or performs certain metadata operations).  The writes usually do
finish in a few minutes, but write latency (measured by timing a 'mkdir'
call at regular intervals) can spike as high as 9+ hours.  Most people
(and watchdog robots) are reaching for the RESET button in less than
five minutes.  :-/
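
The latency measurement mentioned above can be sketched as a small loop timing a mkdir at regular intervals; on a healthy filesystem each call returns in milliseconds, and during one of these stalls it doesn't. A throwaway temp dir is used here so the sketch runs anywhere; point it at the btrfs mount to see the spikes.

```shell
# Probe write latency by timing mkdir calls, as described above
d=$(mktemp -d)
i=0
while [ $i -lt 5 ]; do
    t0=$(date +%s%N)
    mkdir "$d/probe-$i"
    t1=$(date +%s%N)
    echo "mkdir latency: $(( (t1 - t0) / 1000000 )) ms"
    i=$((i + 1))
done
rm -r "$d"
```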


[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 198 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-03  2:44           ` Chris Murphy
@ 2014-09-03  5:37             ` Duncan
  2014-09-03  6:03             ` john terragon
  1 sibling, 0 replies; 19+ messages in thread
From: Duncan @ 2014-09-03  5:37 UTC (permalink / raw)
  To: linux-btrfs

Chris Murphy posted on Tue, 02 Sep 2014 20:44:06 -0600 as excerpted:

> On Sep 2, 2014, at 12:40 AM, Duncan <1i5t5.duncan@cox.net> wrote:
>> 
>> Mkfs.btrfs used to default to 4 KiB node/leaf sizes; nowadays it
>> defaults to 16 KiB, as that's far better for most usage.  I wonder if
>> USB sticks are an exception…
> 
> USB sticks > 1 GB get 16KB nodesize also.

Seems you read into that a meaning I didn't even consider when I wrote 
it. =:^/

I /meant/ that AFAIK mkfs.btrfs did the usual 16K thing, but that perhaps 
(some, depending on erase-block size) USB sticks are an exception to the 
16K-being-better-than-4K thing.

> At <= 1 GB, mixed-bg is
> default as is 4KB nodesize. Probably because queue/rotational is 1 for
> USB sticks, they mount without ssd or ssd_spread which may be
> unfortunate (I haven't benchmarked it but I suspect ssd_spread would
> work well for USB sticks).

I did mention ssd_spread somewhere in my replies, due to the same 
suspicion.  Good to see you have the same suspicion I do. =:^)

> It was suggested a while ago that maybe mixed-bg should apply to larger
> volumes, maybe up to 8GB or 16GB?

Indeed.  Considering the default data chunk size is 1 GiB, that as the 
cutoff for default mixed-bg mode seems kinda low.  If anything, I think 
16 GiB is still a low cutoff value, at least as long as the only way to 
reclaim out-of-balance data/metadata assigned chunks is via a manual 
balance.  I'd suggest a 32 GiB cutover by default.

Tho if btrfs gets a good auto-balance-trigger mechanism, say one that 
triggers when allocated (the fi show device figure) is within 10% of 
total filesystem space (rounded up to the nearest GiB, minimum 2 GiB) 
AND data or metadata has more than a 10% spread between used and 
allocated (fi df, again rounded up, 2 GiB minimum), then that can 
arguably come down a notch, to say 16 GiB.
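
That proposed trigger is simple enough to state as arithmetic. A hypothetical sketch of the two conditions (all figures in GiB are made-up example numbers; the thresholds are the 10%/2 GiB values from the paragraph above, with integer rounding instead of round-up):

```shell
# Example figures, as would come from btrfs fi show / btrfs fi df
total=32
allocated=30
d_alloc=24
d_used=20

# Condition 1: allocated within 10% of total (minimum 2 GiB headroom)
headroom=$(( total - allocated ))
limit=$(( total / 10 ))
if [ $limit -lt 2 ]; then limit=2; fi

# Condition 2: used/allocated spread above 10% (minimum 2 GiB)
spread=$(( d_alloc - d_used ))
s_limit=$(( d_alloc / 10 ))
if [ $s_limit -lt 2 ]; then s_limit=2; fi

if [ $headroom -le $limit ] && [ $spread -ge $s_limit ]; then
    msg="would trigger balance"
else
    msg="no balance needed"
fi
echo "$msg"
```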

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-03  2:44           ` Chris Murphy
  2014-09-03  5:37             ` Duncan
@ 2014-09-03  6:03             ` john terragon
  2014-09-03  9:14               ` Liu Bo
  1 sibling, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-03  6:03 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

I tried the same routine on 32GB usb sticks. Same exact problems. 32GB
seems a bit much for a --mixed btrfs.
I haven't tried ssd_spread; maybe it's beneficial. However, as I wrote
above, disabling autodefrag completely gets rid of the "INFO: hung
task" messages, but even though the kernel doesn't complain about
blocked kworkers, the rsync process still blocks for several minutes
throughout the whole copy.


On Wed, Sep 3, 2014 at 4:44 AM, Chris Murphy <lists@colorremedies.com> wrote:
>
> On Sep 2, 2014, at 12:40 AM, Duncan <1i5t5.duncan@cox.net> wrote:
>>
>> Mkfs.btrfs used to default to 4 KiB node/leaf sizes; nowadays it defaults
>> to 16 KiB, as that's far better for most usage.  I wonder if USB sticks
>> are an exception...
>
> USB sticks > 1 GB get 16KB nodesize also. At <= 1 GB, mixed-bg is default as is 4KB nodesize. Probably because queue/rotational is 1 for USB sticks, they mount without ssd or ssd_spread which may be unfortunate (I haven't benchmarked it but I suspect ssd_spread would work well for USB sticks).
>
> It was suggested a while ago that maybe mixed-bg should apply to larger volumes, maybe up to 8GB or 16GB?
>
>
> Chris Murphy
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-03  6:03             ` john terragon
@ 2014-09-03  9:14               ` Liu Bo
  0 siblings, 0 replies; 19+ messages in thread
From: Liu Bo @ 2014-09-03  9:14 UTC (permalink / raw)
  To: john terragon; +Cc: Chris Murphy, Btrfs BTRFS

On Wed, Sep 03, 2014 at 08:03:51AM +0200, john terragon wrote:
> I tried the same routine on 32GB usb sticks. Same exact problems. 32GB
> seems a bit much for a --mixed btrfs.
> I haven't tried ssd_spread; maybe it's beneficial. However, as I wrote
> above, disabling autodefrag completely gets rid of the "INFO: hung
> task" messages, but even though the kernel doesn't complain about
> blocked kworkers, the rsync process still blocks for several minutes
> throughout the whole copy.

It's very nice to know that you can reproduce it with autodefrag.

I made a few analysis on the provided blocked stacks, the key is what blocks 
writing free space cache's pages to finish, it sits on wait_on_page_bit() which
expects a WRITEBACK bit.

Could you please paste the output of sysrq-w and sysrq-t when you get that hang?

thanks,
-liubo

> 
> 
> On Wed, Sep 3, 2014 at 4:44 AM, Chris Murphy <lists@colorremedies.com> wrote:
> >
> > On Sep 2, 2014, at 12:40 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> >>
> >> Mkfs.btrfs used to default to 4 KiB node/leaf sizes; nowadays it defaults
> >> to 16 KiB, as that's far better for most usage.  I wonder if USB sticks
> >> are an exception...
> >
> > USB sticks > 1 GB get 16KB nodesize also. At <= 1 GB, mixed-bg is default as is 4KB nodesize. Probably because queue/rotational is 1 for USB sticks, they mount without ssd or ssd_spread which may be unfortunate (I haven't benchmarked it but I suspect ssd_spread would work well for USB sticks).
> >
> > It was suggested a while ago that maybe mixed-bg should apply to larger volumes, maybe up to 8GB or 16GB?
> >
> >
> > Chris Murphy
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-03  1:31                   ` john terragon
@ 2014-09-03 12:36                     ` Chris Mason
  2014-09-03 14:11                       ` john terragon
  0 siblings, 1 reply; 19+ messages in thread
From: Chris Mason @ 2014-09-03 12:36 UTC (permalink / raw)
  To: john terragon; +Cc: linux-btrfs

On 09/02/2014 09:31 PM, john terragon wrote:
> Rsync finished. FWIW in the end it reported an average speed of about
>  900K/sec. Without autodefrag there have been no messages about hung
> kworkers even though rsync seemingly keeps getting hung for several
> minutes throughout the whole execution.

So let's take a step back and figure out how fast the usb stick actually is.
This will erase your usb stick, but give us an idea of its performance:

dd if=/dev/zero of=/dev/<wherever you mkfs btrfs> bs=20M oflag=direct count=100

Note again, the above command will erase your usb stick ;)  Use whatever device name
you've been sending to mkfs.btrfs

The kernel will allow a pretty significant amount of ram to be dirtied before 
forcing writeback, which is why you're seeing rsync stall at seemingly strange 
intervals.  In the case of btrfs with compression, we add some worker threads between 
rsync and the device, and these may be turning the writeback into a somewhat 
more bursty operation.
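
To put a rough number on that: the dirty-page ceiling is set by vm.dirty_ratio, and a back-of-envelope sketch (the 20% ratio and 8 GiB RAM are assumed defaults, not john's reported values) shows why a slow stick makes the stalls run to minutes:

```shell
# With dirty_ratio=20 on 8 GiB of RAM, roughly 1.6 GiB of dirty pages
# can accumulate before writers are throttled; at a few MiB/s to the
# stick, flushing that backlog takes minutes and looks like a hang.
ram_mib=8192
dirty_ratio=20
dirty_mib=$(( ram_mib * dirty_ratio / 100 ))
flush_secs=$(( dirty_mib / 4 ))   # assuming ~4 MiB/s to the device
echo "${dirty_mib} MiB dirty ceiling, ~${flush_secs}s to flush"
```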

-chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-03 12:36                     ` Chris Mason
@ 2014-09-03 14:11                       ` john terragon
  2014-09-03 15:02                         ` Chris Murphy
  0 siblings, 1 reply; 19+ messages in thread
From: john terragon @ 2014-09-03 14:11 UTC (permalink / raw)
  To: Chris Mason; +Cc: Btrfs BTRFS

I wasn't sure what you meant by <wherever you mkfs btrfs>, so I dd'd
all three possible cases:

1) here's the dmcrypt device on which I mkfs.btrfs

   2097152000 bytes (2.1 GB) copied, 487.265 s, 4.3 MB/s

2) here's the partition of the usb stick (which has another partition
containing /boot) on top of which the dmcrypt device is created

  2097152000 bytes (2.1 GB) copied, 449.693 s, 4.7 MB/s

3) here's the whole usb stick device

  2097152000 bytes (2.1 GB) copied, 448.003 s, 4.7 MB/s

It's a usb2 device but doesn't it seem kind of slow?

Thanks
John
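
As a sanity check on the numbers above, the MB/s figure dd prints is just bytes over elapsed seconds; recomputing case (1):

```shell
# dd reported: 2097152000 bytes in 487.265 s => ~4.3 MB/s
bytes=2097152000
secs=487     # truncated to whole seconds for shell integer arithmetic
mb_per_s=$(( bytes / secs / 1000000 ))
echo "~${mb_per_s} MB/s"
```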


On Wed, Sep 3, 2014 at 2:36 PM, Chris Mason <clm@fb.com> wrote:
> On 09/02/2014 09:31 PM, john terragon wrote:
>> Rsync finished. FWIW in the end it reported an average speed of about
>>  900K/sec. Without autodefrag there have been no messages about hung
>> kworkers even though rsync seemingly keeps getting hung for several
>> minutes throughout the whole execution.
>
> So let's take a step back and figure out how fast the usb stick actually is.
> This will erase your usb stick, but give us an idea of its performance:
>
> dd if=/dev/zero of=/dev/<wherever you mkfs btrfs> bs=20M oflag=direct count=100
>
> Note again, the above command will erase your usb stick ;)  Use whatever device name
> you've been sending to mkfs.btrfs
>
> The kernel will allow a pretty significant amount of ram to be dirtied before
> forcing writeback, which is why you're seeing rsync stall at seemingly strange
> intervals.  In the case of btrfs with compression, we add some worker threads between
> rsync and the device, and these may be turning the writeback into a somewhat
> more bursty operation.
>
> -chris

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: kernel 3.17-rc3: task rsync:2524 blocked for more than 120 seconds
  2014-09-03 14:11                       ` john terragon
@ 2014-09-03 15:02                         ` Chris Murphy
  0 siblings, 0 replies; 19+ messages in thread
From: Chris Murphy @ 2014-09-03 15:02 UTC (permalink / raw)
  To: Btrfs BTRFS


On Sep 3, 2014, at 8:11 AM, john terragon <jterragon@gmail.com> wrote:

> It's a usb2 device but doesn't it seem kind of slow?

Not atypical. I have one that's the same and another that's ~21MB/s; both are USB 2.

[Certain older Apple Mac firmware boots faster with the slow stick than the fast one, and it turns out the block size matters. A block size of 512 bytes is insanely slow (as in 100KB/s) on the "fast" stick, whereas a block size of even 32k puts it at 20+MB/s. So I think the older firmware must initially be asking for 512-byte blocks; once the kernel takes over, the performance is very good.]
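[Editor's note: the block-size effect is easy to reproduce non-destructively by timing the same amount of data at several block sizes. A sketch, writing to a hypothetical file on the mounted stick (`/mnt/stick/ddtest` is an assumed path) instead of the raw device:

```shell
# Compare write throughput at several block sizes; 16 MiB per run.
# /mnt/stick/ddtest is a hypothetical path on the mounted stick.
total=$((16 * 1024 * 1024))
for bs in 512 4096 32768 1048576; do
    count=$((total / bs))
    echo "bs=$bs"
    # conv=fsync makes dd flush before reporting, so the timing is honest
    dd if=/dev/zero of=/mnt/stick/ddtest bs=$bs count=$count conv=fsync 2>&1 | tail -1
done
rm -f /mnt/stick/ddtest
```

If the 512-byte run is dramatically slower than the 32k run, the stick has the same pathology Chris Murphy describes.]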

Chris Murphy


