* Forever blocked in bit_wait with kernel 3.19
@ 2015-02-12 11:12 Steven Schlansker
       [not found] ` <84643B78-D39C-430C-BE12-BBC8DFD7B69F@lksmith.net>
  2015-02-18  6:26 ` Omar Sandoval
  0 siblings, 2 replies; 7+ messages in thread
From: Steven Schlansker @ 2015-02-12 11:12 UTC (permalink / raw)
  To: linux-btrfs

[ Please CC me on replies, I'm not on the list ]
[ This is a followup to http://www.spinics.net/lists/linux-btrfs/msg41496.html ]

Hello linux-btrfs,
I've been having trouble keeping my Apache Mesos / Docker slave nodes stable.  After some period of load, tasks begin to hang.  Once this happens, task after task ends up waiting at the same point, never to return.  The system quickly becomes unusable and must be terminated.

After the previous issues, I was encouraged to upgrade and retry.  I am now running:

Linux 3.19.0 #1 SMP Mon Feb 9 09:43:11 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
Btrfs v3.18.2 (and this version was also used to mkfs)

root@ip-10-30-38-86:~# btrfs fi show
Label: none  uuid: 0e8c3f1d-b07b-4643-9834-a41dafb80257
	Total devices 2 FS bytes used 3.92GiB
	devid    1 size 74.99GiB used 4.01GiB path /dev/xvdc
	devid    2 size 74.99GiB used 4.01GiB path /dev/xvdd

Btrfs v3.18.2

Data, RAID0: total=6.00GiB, used=3.69GiB
System, RAID0: total=16.00MiB, used=16.00KiB
Metadata, RAID0: total=2.00GiB, used=229.30MiB
GlobalReserve, single: total=80.00MiB, used=0.00B

This is the first hung task:

[146280.252086] INFO: task java:28252 blocked for more than 120 seconds.
[146280.252096]       Tainted: G            E  3.19.0 #1
[146280.252098] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[146280.252102] java            D ffff8805584df528     0 28252   1400 0x00000000
[146280.252106]  ffff8805584df528 ffff880756a24aa0 0000000000014100 ffff8805584dffd8
[146280.252108]  0000000000014100 ffff8807567c31c0 ffff880756a24aa0 ffff8805584df5d0
[146280.252109]  ffff88075a314a00 ffff8805584df5d0 ffff88077c3f8ce8 0000000000000002
[146280.252111] Call Trace:
[146280.252120]  [<ffffffff8194efa0>] ? bit_wait+0x50/0x50
[146280.252122]  [<ffffffff8194e770>] io_schedule+0xa0/0x130
[146280.252125]  [<ffffffff8194efcc>] bit_wait_io+0x2c/0x50
[146280.252127]  [<ffffffff8194ec05>] __wait_on_bit+0x65/0x90
[146280.252131]  [<ffffffff81169ad7>] wait_on_page_bit+0xc7/0xd0
[146280.252134]  [<ffffffff810b0840>] ? autoremove_wake_function+0x40/0x40
[146280.252137]  [<ffffffff8117d9ed>] shrink_page_list+0x2fd/0xa90
[146280.252139]  [<ffffffff8117e7ad>] shrink_inactive_list+0x1cd/0x590
[146280.252141]  [<ffffffff8117f5b5>] shrink_lruvec+0x5f5/0x810
[146280.252144]  [<ffffffff81086d01>] ? pwq_activate_delayed_work+0x31/0x90
[146280.252146]  [<ffffffff8117f867>] shrink_zone+0x97/0x240
[146280.252148]  [<ffffffff8117fd75>] do_try_to_free_pages+0x155/0x440
[146280.252150]  [<ffffffff81180257>] try_to_free_mem_cgroup_pages+0xa7/0x130
[146280.252154]  [<ffffffff811d2931>] try_charge+0x151/0x620
[146280.252158]  [<ffffffff81815a05>] ? tcp_schedule_loss_probe+0x145/0x1e0
[146280.252160]  [<ffffffff811d6f48>] mem_cgroup_try_charge+0x98/0x110
[146280.252164]  [<ffffffff8170957e>] ? __alloc_skb+0x7e/0x2b0
[146280.252166]  [<ffffffff8116accf>] __add_to_page_cache_locked+0x7f/0x290
[146280.252169]  [<ffffffff8116af28>] add_to_page_cache_lru+0x28/0x80
[146280.252171]  [<ffffffff8116b00f>] pagecache_get_page+0x8f/0x1c0
[146280.252173]  [<ffffffff81952570>] ? _raw_spin_unlock_bh+0x20/0x40
[146280.252189]  [<ffffffffa0045935>] prepare_pages.isra.19+0xc5/0x180 [btrfs]
[146280.252199]  [<ffffffffa00464ec>] __btrfs_buffered_write+0x1cc/0x590 [btrfs]
[146280.252208]  [<ffffffffa0049c07>] btrfs_file_write_iter+0x287/0x510 [btrfs]
[146280.252211]  [<ffffffff813f7076>] ? aa_path_perm+0xd6/0x170
[146280.252214]  [<ffffffff811dfd91>] new_sync_write+0x81/0xb0
[146280.252216]  [<ffffffff811e0537>] vfs_write+0xb7/0x1f0
[146280.252217]  [<ffffffff81950636>] ? mutex_lock+0x16/0x37
[146280.252219]  [<ffffffff811e1146>] SyS_write+0x46/0xb0
[146280.252221]  [<ffffffff819529ed>] system_call_fastpath+0x16/0x1b
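Traces like the one above are easier to compare across incidents once the timestamps and addresses are stripped away. As a rough sketch (not part of the original report), a small filter reduces each frame to its bare function name; the `printf` feeds one sample frame, and in practice you would pipe `dmesg` output through the same filter:

```shell
# Reduce hung-task trace lines to bare function names so traces from
# different incidents can be diffed; the printf supplies a sample frame.
printf '[146280.252122]  [<ffffffff8194e770>] io_schedule+0xa0/0x130\n' |
  grep -o '[A-Za-z_][A-Za-z0-9_.]*+0x[0-9a-f]*' | cut -d+ -f1
```

Run over the two traces in this report, the chains share `bit_wait`/`io_schedule`/`bit_wait_io` at the top and diverge at `__wait_on_bit` versus `__wait_on_bit_lock`.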

Here is a slightly different stacktrace:

[158880.240245] INFO: task kworker/u16:6:13974 blocked for more than 120 seconds.
[158880.240249]       Tainted: G            E  3.19.0 #1
[158880.240252] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[158880.240254] kworker/u16:6   D ffff88064e7b76c8     0 13974      2 0x00000000
[158880.240259] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
[158880.240260]  ffff88064e7b76c8 ffff88066f0c18e0 0000000000014100 ffff88064e7b7fd8
[158880.240262]  0000000000014100 ffffffff8201e4a0 ffff88066f0c18e0 ffff88077c3e06e8
[158880.240264]  ffff88075a214a00 ffff88077c3e06e8 ffff88064e7b7770 0000000000000002
[158880.240266] Call Trace:
[158880.240268]  [<ffffffff8194efa0>] ? bit_wait+0x50/0x50
[158880.240270]  [<ffffffff8194e770>] io_schedule+0xa0/0x130
[158880.240273]  [<ffffffff8194efcc>] bit_wait_io+0x2c/0x50
[158880.240275]  [<ffffffff8194ed9b>] __wait_on_bit_lock+0x4b/0xb0
[158880.240277]  [<ffffffff81169f2e>] __lock_page+0xae/0xb0
[158880.240279]  [<ffffffff810b0840>] ? autoremove_wake_function+0x40/0x40
[158880.240289]  [<ffffffffa00501bd>] lock_delalloc_pages+0x13d/0x1d0 [btrfs]
[158880.240299]  [<ffffffffa005fc8a>] ? btrfs_map_block+0x1a/0x20 [btrfs]
[158880.240308]  [<ffffffffa0050476>] ? find_delalloc_range.constprop.46+0xa6/0x160 [btrfs]
[158880.240318]  [<ffffffffa0052cb3>] find_lock_delalloc_range+0x143/0x1f0 [btrfs]
[158880.240326]  [<ffffffffa00534e0>] ? end_extent_writepage+0xa0/0xa0 [btrfs]
[158880.240335]  [<ffffffffa0052de1>] writepage_delalloc.isra.32+0x81/0x160 [btrfs]
[158880.240343]  [<ffffffffa0053fab>] __extent_writepage+0xbb/0x2a0 [btrfs]
[158880.240350]  [<ffffffffa00544ca>] extent_write_cache_pages.isra.29.constprop.49+0x33a/0x3f0 [btrfs]
[158880.240359]  [<ffffffffa0055f1d>] extent_writepages+0x4d/0x70 [btrfs]
[158880.240368]  [<ffffffffa0039090>] ? btrfs_submit_direct+0x7a0/0x7a0 [btrfs]
[158880.240371]  [<ffffffff8109c0a0>] ? default_wake_function+0x10/0x20
[158880.240378]  [<ffffffffa00360a8>] btrfs_writepages+0x28/0x30 [btrfs]
[158880.240380]  [<ffffffff81176d2e>] do_writepages+0x1e/0x40
[158880.240383]  [<ffffffff81209400>] __writeback_single_inode+0x40/0x220
[158880.240385]  [<ffffffff81209f13>] writeback_sb_inodes+0x263/0x430
[158880.240387]  [<ffffffff8120a17f>] __writeback_inodes_wb+0x9f/0xd0
[158880.240389]  [<ffffffff8120a3f3>] wb_writeback+0x243/0x2c0
[158880.240392]  [<ffffffff8120c9b3>] bdi_writeback_workfn+0x113/0x440
[158880.240394]  [<ffffffff810981bc>] ? finish_task_switch+0x6c/0x1a0
[158880.240397]  [<ffffffff81088f3f>] process_one_work+0x14f/0x3f0
[158880.240399]  [<ffffffff810896a1>] worker_thread+0x121/0x4e0
[158880.240402]  [<ffffffff81089580>] ? rescuer_thread+0x3a0/0x3a0
[158880.240404]  [<ffffffff8108ea72>] kthread+0xd2/0xf0
[158880.240406]  [<ffffffff8108e9a0>] ? kthread_create_on_node+0x180/0x180
[158880.240408]  [<ffffffff8195293c>] ret_from_fork+0x7c/0xb0
[158880.240411]  [<ffffffff8108e9a0>] ? kthread_create_on_node+0x180/0x180
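As an aside not in the original report: when tasks wedge like this, you need not wait for the 120-second hung-task watchdog; processes stuck in uninterruptible sleep can be listed directly, with `wchan` showing the kernel function each one is blocked in:

```shell
# List processes in uninterruptible sleep (state D); a wedged btrfs
# writer shows up here before the hung-task watchdog fires.
# Note: wchan may print '-' depending on kernel configuration.
ps -eo pid,stat,wchan:32,comm | awk 'NR == 1 || $2 ~ /^D/'
```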


Help!
Thanks,
Steven


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Forever blocked in bit_wait with kernel 3.19
       [not found] ` <84643B78-D39C-430C-BE12-BBC8DFD7B69F@lksmith.net>
@ 2015-02-12 18:50   ` Steven Schlansker
  2015-02-12 18:59     ` Lucas Smith
  0 siblings, 1 reply; 7+ messages in thread
From: Steven Schlansker @ 2015-02-12 18:50 UTC (permalink / raw)
  To: Lucas Smith; +Cc: linux-btrfs

Hi Lucas, we largely use Java to run our own programs.  None of them do anything special or interesting with the disk; it is simply where we deploy our .jar files and keep scratch space for e.g. logs.

The fragmentation idea is interesting, but it seems unlikely that the disk would be fatally fragmented at ~5% utilization.

We do not have any swap configured, whether on btrfs or otherwise.

Java version is 8u25.

Thanks,
Steven
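For reference, Lucas's +C (NOCOW) suggestion below translates to roughly the following sketch. Note the flag only takes effect for files created after it is set, so it should be applied to a fresh directory; the path is a placeholder:

```shell
# Sketch of the +C (NOCOW) suggestion: set the attribute on a directory
# before any data is written, then create files inside it afterwards.
nocow_dir() {
  mkdir -p "$1" && chattr +C "$1" && lsattr -d "$1"
}
# nocow_dir /mnt/btrfs/scratch    # hypothetical path
```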

> On Feb 12, 2015, at 4:33 PM, Lucas Smith <vedalken@lksmith.net> wrote:
> 
> Steven: as an avid Java user and now btrfs user, a few things come to mind, but they depend on what you're running inside the JVM.  What program are you using Java to run?  I also know that KVM disk images need the +C attribute set to prevent fragmentation of those images, so you could try that.  However, I have reason to believe your JVM is doing something odd with regard to the page file, aka swap file.  What version of Java are you running?
> end
> 
> On February 12, 2015 5:12:25 AM CST, Steven Schlansker <sschlansker@opentable.com> wrote:
> [...]



* Re: Forever blocked in bit_wait with kernel 3.19
  2015-02-12 18:50   ` Steven Schlansker
@ 2015-02-12 18:59     ` Lucas Smith
  2015-02-12 19:18       ` Lucas Smith
  2015-02-13 11:19       ` Steven Schlansker
  0 siblings, 2 replies; 7+ messages in thread
From: Lucas Smith @ 2015-02-12 18:59 UTC (permalink / raw)
  To: Steven Schlansker; +Cc: linux-btrfs

Steven, 

The only reason I brought up swap space is that the system may be trying to use it due to low physical memory.  How much RAM is in the machine running Docker?  The main thing that makes me suspect RAM is this frame:

[146280.252150]  [<ffffffff81180257>] try_to_free_mem_cgroup_pages+0xa7/0x130

Lucas
end
-- above line is for a mailing list.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
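The RAM question can be answered quickly from the affected box with a generic snapshot of /proc/meminfo (a sketch, not something posted in the thread); the cache/dirty/writeback fields are the interesting ones, since page cache pending writeback can make a box look "full" without the free-memory alerts firing:

```shell
# Snapshot total/free RAM plus how much is tied up in page cache
# and pending writeback
awk '/^(MemTotal|MemFree|Cached|Dirty|Writeback):/ {print $1, $2, $3}' /proc/meminfo
```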

On February 12, 2015 12:50:26 PM CST, Steven Schlansker <sschlansker@opentable.com> wrote:
>[...]



* Re: Forever blocked in bit_wait with kernel 3.19
  2015-02-12 18:59     ` Lucas Smith
@ 2015-02-12 19:18       ` Lucas Smith
  2015-02-13 11:19       ` Steven Schlansker
  1 sibling, 0 replies; 7+ messages in thread
From: Lucas Smith @ 2015-02-12 19:18 UTC (permalink / raw)
  To: Steven Schlansker; +Cc: linux-btrfs

Also,

Another thought came to me: it seems the system only has issues when a sync operation happens.  As to why, I don't know, but maybe someone else on the list can shed some light on this.
end
-- above line is for a mailing list.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
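The sync-correlation hypothesis could be tested with something like the following sketch (it assumes strace is installed, a real PID, and sufficient ptrace privileges):

```shell
# Attach to a suspect writer and log sync-family syscalls with
# timestamps, to see whether hangs line up with sync()/fsync() activity
trace_sync() {
  strace -f -tt -e trace=sync,fsync,fdatasync,syncfs -p "$1"
}
# trace_sync 28252    # e.g. the PID of the blocked java task in the report
```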

On February 12, 2015 12:59:36 PM CST, Lucas Smith <vedalken@lksmith.net> wrote:
>[...]



* Re: Forever blocked in bit_wait with kernel 3.19
  2015-02-12 18:59     ` Lucas Smith
  2015-02-12 19:18       ` Lucas Smith
@ 2015-02-13 11:19       ` Steven Schlansker
  1 sibling, 0 replies; 7+ messages in thread
From: Steven Schlansker @ 2015-02-13 11:19 UTC (permalink / raw)
  To: Lucas Smith; +Cc: linux-btrfs


> On Feb 12, 2015, at 6:59 PM, Lucas Smith <vedalken@lksmith.net> wrote:
> The only reason I brought up swap space is that it seems the system may be trying to utilize that due to low physical memory. How much RAM in the machine running Docker? The main thing that makes me want to believe it's RAM is this: [146280.252150]  [<ffffffff81180257>] try_to_free_mem_cgroup_pages+0xa7/0x130

We have Nagios monitoring of free system memory, and the alerts did not fire, so I do not think the system was critically low on memory.  However, it is possible that memory was full of cached data (write-back cache of disk?).  These machines have 32GB RAM.

We do use cgroups to limit the memory of our containers, and containers are killed due to OOM conditions fairly frequently (developers make mistakes!), and most of the stack traces do refer to cgroup memory accounting somewhere.  Is it possible this is a bad interaction between cgroup memory limits, the page cache, and btrfs?
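One way to probe this theory, sketched here under the assumption of a cgroup v1 memory-controller layout with Docker's usual `/sys/fs/cgroup/memory/docker/<id>` paths (the container id is a placeholder), is to compare a container's usage against its limit and see how much of the charge is cache, dirty, or writeback pages:

```shell
# Dump the memory-cgroup counters relevant to this theory: the hard
# limit, current usage, and the cache/dirty/writeback share of the charge
cgroup_mem() {
  local cg=/sys/fs/cgroup/memory/docker/$1
  cat "$cg/memory.limit_in_bytes" "$cg/memory.usage_in_bytes"
  grep -E '^total_(cache|dirty|writeback) ' "$cg/memory.stat"
}
# cgroup_mem CONTAINER_ID
```

A container sitting at its limit with a large `total_writeback` would fit the picture of reclaim blocking on page writeback in the traces above.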

> On February 12, 2015 12:50:26 PM CST, Steven Schlansker <sschlansker@opentable.com> wrote:
>> [...]
>>> [158880.240270]  [<ffffffff8194e770>] io_schedule+0xa0/0x130
>>> [158880.240273]  [<ffffffff8194efcc>] bit_wait_io+0x2c/0x50
>>> [158880.240275]  [<ffffffff8194ed9b>] __wait_on_bit_lock+0x4b/0xb0
>>> [158880.240277]  [<ffffffff81169f2e>] __lock_page+0xae/0xb0
>>> [158880.240279]  [<ffffffff810b0840>] ?
>> autoremove_wake_function+0x40/0x40
>>> [158880.240289]  [<ffffffffa00501bd>] lock_delalloc_pages+0x13d/0x1d0
>> [btrfs]
>>> [158880.240299]  [<ffffffffa005fc8a>] ? btrfs_map_block+0x1a/0x20
>> [btrfs]
>>> [158880.240308]  [<ffffffffa0050476>] ?
>> find_delalloc_range.constprop.46+0xa6/0x160 [btrfs]
>>> [158880.240318]  [<ffffffffa0052cb3>]
>> find_lock_delalloc_range+0x143/0x1f0 [btrfs]
>>> [158880.240326] 
>>> [<ffffffffa00534e0>] ? end_extent_writepage+0xa0/0xa0 [btrfs]
>>> 
>>> [158880.240335]  [<ffffffffa0052de1>]
>> writepage_delalloc.isra.32+0x81/0x160 [btrfs]
>>> [158880.240343]  [<ffffffffa0053fab>] __extent_writepage+0xbb/0x2a0
>> [btrfs]
>>> [158880.240350]  [<ffffffffa00544ca>]
>> extent_write_cache_pages.isra.29.constprop.49+0x33a/0x3f0 [btrfs]
>>> [158880.240359]  [<ffffffffa0055f1d>] extent_writepages+0x4d/0x70
>> [btrfs]
>>> [158880.240368]  [<ffffffffa0039090>] ?
>> btrfs_submit_direct+0x7a0/0x7a0 [btrfs]
>>> [158880.240371]  [<ffffffff8109c0a0>] ?
>> default_wake_function+0x10/0x20
>>> [158880.240378]  [<ffffffffa00360a8>] btrfs_writepages+0x28/0x30
>> [btrfs]
>>> [158880.240380]  [<ffffffff81176d2e>] do_writepages+0x1e/0x40
>>> [158880.240383]  [<ffffffff81209400>]
>> __writeback_single_inode+0x40/0x220
>>> [158880.240385]  [<ffffffff81209f13>] writeback_sb_inodes+0x263/0x430
>>> [158880.240387]  [<ffffffff8120a17f>]
>>> __writeback_inodes_wb+0x9f/0xd0
>>> 
>>> [158880.240389]  [<ffffffff8120a3f3>] wb_writeback+0x243/0x2c0
>>> [158880.240392]  [<ffffffff8120c9b3>]
>> bdi_writeback_workfn+0x113/0x440
>>> [158880.240394]  [<ffffffff810981bc>] ? finish_task_switch+0x6c/0x1a0
>>> [158880.240397]  [<ffffffff81088f3f>] process_one_work+0x14f/0x3f0
>>> [158880.240399]  [<ffffffff810896a1>] worker_thread+0x121/0x4e0
>>> [158880.240402]  [<ffffffff81089580>] ? rescuer_thread+0x3a0/0x3a0
>>> [158880.240404]  [<ffffffff8108ea72>] kthread+0xd2/0xf0
>>> [158880.240406]  [<ffffffff8108e9a0>] ?
>> kthread_create_on_node+0x180/0x180
>>> [158880.240408]  [<ffffffff8195293c>] ret_from_fork+0x7c/0xb0
>>> [158880.240411]  [<ffffffff8108e9a0>] ?
>> kthread_create_on_node+0x180/0x180
>>> 
>>> 
>>> Help!
>>> Thanks,
>>> Steven
>>> 
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe
>> linux-btrfs" in
>>> the body of a message to
>>> majordomo@vger.kernel.org
>>> 
>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Forever blocked in bit_wait with kernel 3.19
  2015-02-12 11:12 Forever blocked in bit_wait with kernel 3.19 Steven Schlansker
       [not found] ` <84643B78-D39C-430C-BE12-BBC8DFD7B69F@lksmith.net>
@ 2015-02-18  6:26 ` Omar Sandoval
  2015-02-18 17:49   ` Steven Schlansker
  1 sibling, 1 reply; 7+ messages in thread
From: Omar Sandoval @ 2015-02-18  6:26 UTC (permalink / raw)
  To: Steven Schlansker; +Cc: linux-btrfs

On Thu, Feb 12, 2015 at 11:12:25AM +0000, Steven Schlansker wrote:
> [ Please CC me on replies, I'm not on the list ]
> [ This is a followup to http://www.spinics.net/lists/linux-btrfs/msg41496.html ]
> 
> Hello linux-btrfs,
> I've been having troubles keeping my Apache Mesos / Docker slave nodes stable.  After some period of load, tasks begin to hang.  Once this happens task after task ends up waiting at the same point, never to return.  The system quickly becomes unusable and must be terminated.
> 
> After the previous issues, I was encouraged to upgrade and retry.  I am now running
> 
> Linux 3.19.0 #1 SMP Mon Feb 9 09:43:11 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> Btrfs v3.18.2 (and this version was also used to mkfs)
> 
> root@ip-10-30-38-86:~# btrfs fi show
> Label: none  uuid: 0e8c3f1d-b07b-4643-9834-a41dafb80257
> 	Total devices 2 FS bytes used 3.92GiB
> 	devid    1 size 74.99GiB used 4.01GiB path /dev/xvdc
> 	devid    2 size 74.99GiB used 4.01GiB path /dev/xvdd
> 
> Btrfs v3.18.2
> 
> Data, RAID0: total=6.00GiB, used=3.69GiB
> System, RAID0: total=16.00MiB, used=16.00KiB
> Metadata, RAID0: total=2.00GiB, used=229.30MiB
> GlobalReserve, single: total=80.00MiB, used=0.00B
> 
> This is the first hung task:
> 
> [146280.252086] INFO: task java:28252 blocked for more than 120 seconds.
> [146280.252096]       Tainted: G            E  3.19.0 #1
> [146280.252098] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [146280.252102] java            D ffff8805584df528     0 28252   1400 0x00000000
> [146280.252106]  ffff8805584df528 ffff880756a24aa0 0000000000014100 ffff8805584dffd8
> [146280.252108]  0000000000014100 ffff8807567c31c0 ffff880756a24aa0 ffff8805584df5d0
> [146280.252109]  ffff88075a314a00 ffff8805584df5d0 ffff88077c3f8ce8 0000000000000002
> [146280.252111] Call Trace:
> [146280.252120]  [<ffffffff8194efa0>] ? bit_wait+0x50/0x50
> [146280.252122]  [<ffffffff8194e770>] io_schedule+0xa0/0x130
> [146280.252125]  [<ffffffff8194efcc>] bit_wait_io+0x2c/0x50
> [146280.252127]  [<ffffffff8194ec05>] __wait_on_bit+0x65/0x90
> [146280.252131]  [<ffffffff81169ad7>] wait_on_page_bit+0xc7/0xd0
> [146280.252134]  [<ffffffff810b0840>] ? autoremove_wake_function+0x40/0x40
> [146280.252137]  [<ffffffff8117d9ed>] shrink_page_list+0x2fd/0xa90
> [146280.252139]  [<ffffffff8117e7ad>] shrink_inactive_list+0x1cd/0x590
> [146280.252141]  [<ffffffff8117f5b5>] shrink_lruvec+0x5f5/0x810
> [146280.252144]  [<ffffffff81086d01>] ? pwq_activate_delayed_work+0x31/0x90
> [146280.252146]  [<ffffffff8117f867>] shrink_zone+0x97/0x240
> [146280.252148]  [<ffffffff8117fd75>] do_try_to_free_pages+0x155/0x440
> [146280.252150]  [<ffffffff81180257>] try_to_free_mem_cgroup_pages+0xa7/0x130
> [146280.252154]  [<ffffffff811d2931>] try_charge+0x151/0x620
> [146280.252158]  [<ffffffff81815a05>] ? tcp_schedule_loss_probe+0x145/0x1e0
> [146280.252160]  [<ffffffff811d6f48>] mem_cgroup_try_charge+0x98/0x110
> [146280.252164]  [<ffffffff8170957e>] ? __alloc_skb+0x7e/0x2b0
> [146280.252166]  [<ffffffff8116accf>] __add_to_page_cache_locked+0x7f/0x290
> [146280.252169]  [<ffffffff8116af28>] add_to_page_cache_lru+0x28/0x80
> [146280.252171]  [<ffffffff8116b00f>] pagecache_get_page+0x8f/0x1c0
> [146280.252173]  [<ffffffff81952570>] ? _raw_spin_unlock_bh+0x20/0x40
> [146280.252189]  [<ffffffffa0045935>] prepare_pages.isra.19+0xc5/0x180 [btrfs]
> [146280.252199]  [<ffffffffa00464ec>] __btrfs_buffered_write+0x1cc/0x590 [btrfs]
> [146280.252208]  [<ffffffffa0049c07>] btrfs_file_write_iter+0x287/0x510 [btrfs]
> [146280.252211]  [<ffffffff813f7076>] ? aa_path_perm+0xd6/0x170
> [146280.252214]  [<ffffffff811dfd91>] new_sync_write+0x81/0xb0
> [146280.252216]  [<ffffffff811e0537>] vfs_write+0xb7/0x1f0
> [146280.252217]  [<ffffffff81950636>] ? mutex_lock+0x16/0x37
> [146280.252219]  [<ffffffff811e1146>] SyS_write+0x46/0xb0
> [146280.252221]  [<ffffffff819529ed>] system_call_fastpath+0x16/0x1b
> 
> Here is a slightly different stacktrace:
> 
> [158880.240245] INFO: task kworker/u16:6:13974 blocked for more than 120 seconds.
> [158880.240249]       Tainted: G            E  3.19.0 #1
> [158880.240252] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [158880.240254] kworker/u16:6   D ffff88064e7b76c8     0 13974      2 0x00000000
> [158880.240259] Workqueue: writeback bdi_writeback_workfn (flush-btrfs-1)
> [158880.240260]  ffff88064e7b76c8 ffff88066f0c18e0 0000000000014100 ffff88064e7b7fd8
> [158880.240262]  0000000000014100 ffffffff8201e4a0 ffff88066f0c18e0 ffff88077c3e06e8
> [158880.240264]  ffff88075a214a00 ffff88077c3e06e8 ffff88064e7b7770 0000000000000002
> [158880.240266] Call Trace:
> [158880.240268]  [<ffffffff8194efa0>] ? bit_wait+0x50/0x50
> [158880.240270]  [<ffffffff8194e770>] io_schedule+0xa0/0x130
> [158880.240273]  [<ffffffff8194efcc>] bit_wait_io+0x2c/0x50
> [158880.240275]  [<ffffffff8194ed9b>] __wait_on_bit_lock+0x4b/0xb0
> [158880.240277]  [<ffffffff81169f2e>] __lock_page+0xae/0xb0
> [158880.240279]  [<ffffffff810b0840>] ? autoremove_wake_function+0x40/0x40
> [158880.240289]  [<ffffffffa00501bd>] lock_delalloc_pages+0x13d/0x1d0 [btrfs]
> [158880.240299]  [<ffffffffa005fc8a>] ? btrfs_map_block+0x1a/0x20 [btrfs]
> [158880.240308]  [<ffffffffa0050476>] ? find_delalloc_range.constprop.46+0xa6/0x160 [btrfs]
> [158880.240318]  [<ffffffffa0052cb3>] find_lock_delalloc_range+0x143/0x1f0 [btrfs]
> [158880.240326]  [<ffffffffa00534e0>] ? end_extent_writepage+0xa0/0xa0 [btrfs]
> [158880.240335]  [<ffffffffa0052de1>] writepage_delalloc.isra.32+0x81/0x160 [btrfs]
> [158880.240343]  [<ffffffffa0053fab>] __extent_writepage+0xbb/0x2a0 [btrfs]
> [158880.240350]  [<ffffffffa00544ca>] extent_write_cache_pages.isra.29.constprop.49+0x33a/0x3f0 [btrfs]
> [158880.240359]  [<ffffffffa0055f1d>] extent_writepages+0x4d/0x70 [btrfs]
> [158880.240368]  [<ffffffffa0039090>] ? btrfs_submit_direct+0x7a0/0x7a0 [btrfs]
> [158880.240371]  [<ffffffff8109c0a0>] ? default_wake_function+0x10/0x20
> [158880.240378]  [<ffffffffa00360a8>] btrfs_writepages+0x28/0x30 [btrfs]
> [158880.240380]  [<ffffffff81176d2e>] do_writepages+0x1e/0x40
> [158880.240383]  [<ffffffff81209400>] __writeback_single_inode+0x40/0x220
> [158880.240385]  [<ffffffff81209f13>] writeback_sb_inodes+0x263/0x430
> [158880.240387]  [<ffffffff8120a17f>] __writeback_inodes_wb+0x9f/0xd0
> [158880.240389]  [<ffffffff8120a3f3>] wb_writeback+0x243/0x2c0
> [158880.240392]  [<ffffffff8120c9b3>] bdi_writeback_workfn+0x113/0x440
> [158880.240394]  [<ffffffff810981bc>] ? finish_task_switch+0x6c/0x1a0
> [158880.240397]  [<ffffffff81088f3f>] process_one_work+0x14f/0x3f0
> [158880.240399]  [<ffffffff810896a1>] worker_thread+0x121/0x4e0
> [158880.240402]  [<ffffffff81089580>] ? rescuer_thread+0x3a0/0x3a0
> [158880.240404]  [<ffffffff8108ea72>] kthread+0xd2/0xf0
> [158880.240406]  [<ffffffff8108e9a0>] ? kthread_create_on_node+0x180/0x180
> [158880.240408]  [<ffffffff8195293c>] ret_from_fork+0x7c/0xb0
> [158880.240411]  [<ffffffff8108e9a0>] ? kthread_create_on_node+0x180/0x180
> 
> 
> Help!
> Thanks,
> Steven
> 
Hi, Steven,

Are you seeing any ENOMEM Btrfs-related errors in your dmesg? In your
previous thread you triggered an ENOMEM BUG_ON and you mentioned that your
containers often get OOM'ed.
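(For what it's worth, a quick way to check for that is to count kernel-log lines that look like btrfs allocation failures; ENOMEM is errno -12, though the exact message wording varies by kernel version, so the pattern below is only a rough filter:)

```shell
# Count kernel-log lines that look like btrfs allocation failures.
# ENOMEM is errno -12; the exact message format varies by kernel version,
# so treat this pattern as a rough filter, not an authoritative check.
hits=$(dmesg 2>/dev/null | grep -icE 'btrfs.*(enomem|error -12)')
echo "possible btrfs ENOMEM lines in dmesg: ${hits:-0}"
```

If that prints a nonzero count, pasting the matching lines here would help.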

I experimented with Btrfs in a memory-constrained cgroup and saw all
sorts of buggy behavior (https://lkml.org/lkml/2015/2/17/131), but I
haven't been able to reproduce this particular issue. This is a wild
guess, but there could be a buggy error handling path somewhere that
forgets to unlock a page.

-- 
Omar

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: Forever blocked in bit_wait with kernel 3.19
  2015-02-18  6:26 ` Omar Sandoval
@ 2015-02-18 17:49   ` Steven Schlansker
  0 siblings, 0 replies; 7+ messages in thread
From: Steven Schlansker @ 2015-02-18 17:49 UTC (permalink / raw)
  To: Omar Sandoval; +Cc: linux-btrfs


On Feb 17, 2015, at 10:26 PM, Omar Sandoval <osandov@osandov.com> wrote:

> On Thu, Feb 12, 2015 at 11:12:25AM +0000, Steven Schlansker wrote:
>> [ Please CC me on replies, I'm not on the list ]
>> [ This is a followup to http://www.spinics.net/lists/linux-btrfs/msg41496.html ]
>> 
>> Hello linux-btrfs,
>> I've been having troubles keeping my Apache Mesos / Docker slave nodes stable.  After some period of load, tasks begin to hang.  Once this happens task after task ends up waiting at the same point, never to return.  The system quickly becomes unusable and must be terminated.
> 
> Are you seeing any ENOMEM Btrfs-related errors in your dmesg? In your
> previous thread you triggered an ENOMEM BUG_ON and you mentioned that your
> containers often get OOM'ed.

Correct, the containers OOM relatively frequently.  We're working on fixing this, but accidents happen, and we're doing heavy development with memory-hungry services.

I've not observed any Btrfs complaints about memory; I'd assumed that it allocates outside of any container limits.  But I don't know how that interacts with, e.g., page cache accounting.
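(In case it's relevant: the page cache charged to a container can be read out of the memory controller's stat file. This is a cgroup-v1 layout sketch; the `docker` subdirectory is a guess for this setup, and the `dirty`/`writeback` fields may not exist on every kernel:)

```shell
# Show how much page cache / dirty / writeback memory is charged to a
# container's memory cgroup (cgroup-v1 layout; the "docker" subdirectory
# is an assumption -- substitute your container's actual cgroup path).
stat=/sys/fs/cgroup/memory/docker/memory.stat
if [ -r "$stat" ]; then
    summary=$(grep -E '^(cache|dirty|writeback|total_cache) ' "$stat" \
        || echo "no matching fields in $stat")
else
    summary="memory cgroup stat not found at $stat"
fi
printf '%s\n' "$summary"
```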

> 
> I experimented with Btrfs in a memory-constrained cgroup and saw all
> sorts of buggy behavior (https://lkml.org/lkml/2015/2/17/131), but I
> haven't been able to reproduce this particular issue. This is a wild
> guess, but there could be a buggy error handling path somewhere that
> forgets to unlock a page.

That seems like a very reasonable explanation.  Is there any particular debugging I could provide to help diagnose this?  I can't reproduce it on demand, but it's only a matter of time until it happens again...  I'd like to help fix it, but I'm not much of a filesystem developer :)
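(One thing that usually helps with reports like this, though it's just common practice rather than anything official, is a magic-SysRq dump of all blocked tasks taken while the hang is in progress; the output file name below is only an example:)

```shell
# When the hang recurs, dump stacks of all uninterruptible (D-state)
# tasks into the kernel log and save the tail of the log (needs root;
# /tmp/blocked-tasks.log is just an example destination).
if [ -w /proc/sysrq-trigger ]; then
    echo w > /proc/sysrq-trigger
    dmesg | tail -n 200 > /tmp/blocked-tasks.log
    result="saved to /tmp/blocked-tasks.log"
else
    result="cannot write /proc/sysrq-trigger (need root)"
fi
echo "$result"
```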


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2015-02-18 18:05 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-02-12 11:12 Forever blocked in bit_wait with kernel 3.19 Steven Schlansker
     [not found] ` <84643B78-D39C-430C-BE12-BBC8DFD7B69F@lksmith.net>
2015-02-12 18:50   ` Steven Schlansker
2015-02-12 18:59     ` Lucas Smith
2015-02-12 19:18       ` Lucas Smith
2015-02-13 11:19       ` Steven Schlansker
2015-02-18  6:26 ` Omar Sandoval
2015-02-18 17:49   ` Steven Schlansker
