* 40TB volume taking over 16 hours to mount, any ideas?
@ 2014-08-08 21:35 Jose Ildefonso Camargo Tolosa
  2014-08-09  3:38 ` Russell Coker
  0 siblings, 1 reply; 26+ messages in thread
From: Jose Ildefonso Camargo Tolosa @ 2014-08-08 21:35 UTC (permalink / raw)
  To: linux-btrfs

Greetings,

I have been having some issues with btrfs for the past couple of days.

Some info (this has changed as I tried multiple things):

btrfs fi show
Btrfs v3.12

uname -a
Linux server1 3.15.8-031508-generic #201407311933 SMP Thu Jul 31
23:34:33 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

The complete story:

The filesystem was created on Ubuntu 12.04, running kernel 3.11.
mount options included compress=zlib .

After having some issues with it, specifically that it would mount
itself read-only, and would then be "stuck" while trying to mount it
again, we decided to upgrade to 14.04, kernel 3.13, and 3.12 btrfs tools.

It worked just fine for a few days, until it mounted itself read-only
again.  Someone rebooted the server, trying to get it back up: it
never came back.  On console access, there were "hung task" messages,
lots of them.  Well, I rebooted with a rescue disk, removed the btrfs
filesystem entry from fstab, and booted back into the system.

After this, I tried to mount btrfs again: it would start reading at
~1MB/s from disk, then all disk activity would cease, and "mount"
would start taking 100% CPU.  Well, we left the server like that
overnight: after ~12 hours, it was just the same, and I had messages
very similar to this one in the dmesg output:

http://pastebin.com/dhPTrDSp

Then, after reading here and there, decided to try to use a newer
kernel, tried 3.15.8.  Well, it is still mounting after ~16 hours, and
I got messages like these at first:

[25490.214875] BTRFS info (device sdb1): force clearing of disk cache
[25490.214882] BTRFS info (device sdb1): enabling auto recovery
[25556.240243] BTRFS: detected SSD devices, enabling SSD mode
[25812.123804] INFO: task btrfs-transacti:31532 blocked for more than
120 seconds.
[25812.125408]       Not tainted 3.15.8-031508-generic #201407311933
[25812.126732] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[25812.128449] btrfs-transacti D 0000000000000003     0 31532      2 0x00000000
[25812.128458]  ffff880608c73dc8 0000000000000002 ffff880608c73d68
ffff880608c73fd8
[25812.128461]  0000000000014500 0000000000014500 ffff8804698b3260
ffff880867440000
[25812.128463]  ffff880608c73dd8 ffff8804658ec000 ffff880868357000
ffff880608c73e00
[25812.128465] Call Trace:
[25812.128476]  [<ffffffff817784f9>] schedule+0x29/0x70
[25812.128503]  [<ffffffffa010273d>]
btrfs_commit_transaction+0x25d/0xa70 [btrfs]
[25812.128514]  [<ffffffff810b5450>] ? __wake_up_sync+0x20/0x20
[25812.128524]  [<ffffffffa0100475>] transaction_kthread+0x1d5/0x250 [btrfs]
[25812.128535]  [<ffffffffa01002a0>] ? open_ctree+0x1ee0/0x1ee0 [btrfs]
[25812.128539]  [<ffffffff81091469>] kthread+0xc9/0xe0
[25812.128542]  [<ffffffff810913a0>] ? flush_kthread_worker+0xb0/0xb0
[25812.128545]  [<ffffffff8178567c>] ret_from_fork+0x7c/0xb0
[25812.128547]  [<ffffffff810913a0>] ? flush_kthread_worker+0xb0/0xb0

Hmm... no, this is not an SSD device; this is a hardware RAID5 array
of rotational disks.  I am not sure why btrfs thought this was an SSD.
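
One way to check what the kernel believes about the device: btrfs turns on SSD mode when the block layer reports the device as non-rotational, and hardware RAID controllers do not always report that flag the way one would expect. The sketch below reads the standard sysfs attribute /sys/block/<dev>/queue/rotational. The device name sdb comes from the log above; the helper name is_rotational and its sysfs_root parameter are made up for illustration.

```python
from pathlib import Path


def is_rotational(device, sysfs_root="/sys/block"):
    """Return True if the kernel reports the block device as rotational.

    `device` is the whole-disk name (e.g. "sdb", not "sdb1"). The flag file
    contains "1" for spinning disks and "0" for non-rotational devices.
    """
    flag = Path(sysfs_root) / device / "queue" / "rotational"
    return flag.read_text().strip() == "1"


if __name__ == "__main__":
    dev = "sdb"  # device name taken from the dmesg output in this thread
    try:
        if is_rotational(dev):
            print(f"{dev}: kernel reports a rotational device")
        else:
            print(f"{dev}: kernel reports a non-rotational device (SSD mode likely)")
    except OSError as exc:
        print(f"could not read the rotational flag for {dev}: {exc}")
```

If the flag reads 0 for an array of spinning disks, the nossd mount option should override the autodetection.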

Current mount command:

mount -o clear_cache,compress=zlib,recovery /dev/sdb1 /pot/

And it kept going until later, when I just got this:

[26893.263397] INFO: task btrfs-transacti:31532 blocked for more than
120 seconds.
[26893.342166]       Not tainted 3.15.8-031508-generic #201407311933
[26893.342167] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[26893.342170] btrfs-transacti D 0000000000000003     0 31532      2 0x00000000
[26893.342173]  ffff880608c73dc8 0000000000000002 ffff880608c73d68
ffff880608c73fd8
[26893.342174]  0000000000014500 0000000000014500 ffff8804698b3260
ffff880867440000
[26893.342175]  ffff880608c73dd8 ffff8804658ec000 ffff880868357000
ffff880608c73e00
[26893.342176] Call Trace:
[26893.342186]  [<ffffffff817784f9>] schedule+0x29/0x70
[26893.342214]  [<ffffffffa010273d>]
btrfs_commit_transaction+0x25d/0xa70 [btrfs]
[26893.342220]  [<ffffffff810b5450>] ? __wake_up_sync+0x20/0x20
[26893.342231]  [<ffffffffa0100475>] transaction_kthread+0x1d5/0x250 [btrfs]
[26893.342241]  [<ffffffffa01002a0>] ? open_ctree+0x1ee0/0x1ee0 [btrfs]
[26893.342246]  [<ffffffff81091469>] kthread+0xc9/0xe0
[26893.342249]  [<ffffffff810913a0>] ? flush_kthread_worker+0xb0/0xb0
[26893.342252]  [<ffffffff8178567c>] ret_from_fork+0x7c/0xb0
[26893.342253]  [<ffffffff810913a0>] ? flush_kthread_worker+0xb0/0xb0
[43614.686424] perf interrupt took too long (2505 > 2500), lowering
kernel.perf_event_max_sample_rate to 50000
[72858.744039] ip_tables: (C) 2000-2006 Netfilter Core Team
[85708.015922] ip_tables: (C) 2000-2006 Netfilter Core Team

The last iptables thing was a few minutes ago.  Right now, "mount" is
using 100% CPU, there is no disk activity, memory usage looks low.

perf top output looks like this:

     7.57%  [btrfs]    [k] generic_bin_search.constprop.44
     6.77%  [btrfs]    [k] btrfs_tree_read_unlock
     6.70%  [kernel]   [k] __radix_tree_lookup
     5.45%  [btrfs]    [k] map_private_extent_buffer
     5.13%  [btrfs]    [k] btrfs_comp_cpu_keys
     4.67%  [btrfs]    [k] btrfs_search_slot
     4.66%  [btrfs]    [k] comp_keys
     4.49%  [btrfs]    [k] btrfs_try_tree_read_lock
     3.96%  [btrfs]    [k] find_extent_buffer
     3.12%  [kernel]   [k] _raw_read_lock
     2.97%  [btrfs]    [k] btrfs_clear_path_blocking
     2.96%  [btrfs]    [k] btrfs_get_token_64
     2.84%  [btrfs]    [k] release_extent_buffer
     2.52%  [kernel]   [k] _raw_spin_lock
     2.21%  [btrfs]    [k] btrfs_set_lock_blocking_rw
     2.20%  [btrfs]    [k] btrfs_get_token_16
     2.12%  [kernel]   [k] memcmp
     1.97%  [btrfs]    [k] unlock_up
     1.66%  [btrfs]    [k] btrfs_tree_read_lock
     1.50%  [btrfs]    [k] btrfs_tree_read_unlock_blocking
     1.45%  [btrfs]    [k] read_block_for_search.isra.40
     1.29%  [btrfs]    [k] btrfs_release_path
     1.17%  [btrfs]    [k] btrfs_root_node
     1.16%  [btrfs]    [k] check_buffer_tree_ref
     1.06%  [btrfs]    [k] btrfs_clear_lock_blocking_rw
     0.96%  [btrfs]    [k] btrfs_get_token_8
     0.86%  [btrfs]    [k] btrfs_buffer_uptodate
     0.85%  [btrfs]    [k] verify_parent_transid
     0.81%  [btrfs]    [k] bin_search
     0.78%  [kernel]   [k] memcpy
     0.77%  [btrfs]    [k] free_extent_buffer.part.39
     0.76%  [btrfs]    [k] btrfs_get_token_32
     0.72%  [btrfs]    [k] setup_nodes_for_search
     0.67%  [btrfs]    [k] free_extent_buffer
     0.67%  [btrfs]    [k] mark_extent_buffer_accessed
     0.66%  [kernel]   [k] __kmalloc
     0.57%  [btrfs]    [k] btrfs_set_path_blocking
     0.55%  [btrfs]    [k] check_item_in_log
     0.48%  [btrfs]    [k] read_extent_buffer
     0.42%  [btrfs]    [k] memcmp_extent_buffer
     0.38%  [kernel]   [k] crc32c_intel_le_hw
     0.36%  [btrfs]    [k] btrfs_match_dir_item_name.isra.3
     0.35%  [btrfs]    [k] verify_dir_item
     0.33%  [aacraid]  [k] aac_queuecommand
     0.32%  [kernel]   [k] _raw_spin_unlock
     0.31%  [btrfs]    [k] btrfs_crc32c
     0.31%  [kernel]   [k] kfree
     0.29%  [btrfs]    [k] replay_dir_deletes
     0.29%  [btrfs]    [k] extent_buffer_uptodate
     0.27%  [kernel]   [k] mark_page_accessed
     0.26%  [btrfs]    [k] btrfs_lookup_dir_item
     0.25%  [kernel]   [k] radix_tree_lookup
     0.14%  [btrfs]    [k] btrfs_find_tree_block
     0.13%  [kernel]   [k] crypto_shash_update


I am compiling the 3.16 kernel now to give it a try, but I am not sure about
this.  I mean, this 40TB volume is holding just backups, but it took
quite some time to put the backups there.  It will take ~2 weeks to put
that data in there again, and I'd really prefer *not* to do it.

Any ideas?

Thanks!


-- 
Ildefonso Camargo
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-08 21:35 40TB volume taking over 16 hours to mount, any ideas? Jose Ildefonso Camargo Tolosa
@ 2014-08-09  3:38 ` Russell Coker
  2014-08-09 14:32   ` Andy Smith
  0 siblings, 1 reply; 26+ messages in thread
From: Russell Coker @ 2014-08-09  3:38 UTC (permalink / raw)
  To: Jose Ildefonso Camargo Tolosa; +Cc: linux-btrfs

On Fri, 8 Aug 2014 16:35:29 Jose Ildefonso Camargo Tolosa wrote:
> uname -a
> Linux server1 3.15.8-031508-generic #201407311933 SMP Thu Jul 31
> 23:34:33 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
> 
> The complete story:
> 
> The filesystem was created on Ubuntu 12.04, running kernel 3.11.
> mount options included compress=zlib .
> 
> After having some issues with it, specifically that it would mount
> itself read-only, and would then be "stuck" while trying to mount it
> again, we decided to upgrade to 14.04, kernel 3.13, and 3.12 btrfs tools.

[...]

> Then, after reading here and there, decided to try to use a newer
> kernel, tried 3.15.8.  Well, it is still mounting after ~16 hours, and
> I got messages like these at first:

I recommend trying a 3.14 kernel.  I had ongoing problems with kernels before 
3.14 which included infinite loops in kernel space.  Based on reports on this 
list I haven't been inclined to test 3.15 kernels.  But 3.14 has been working 
well for me on many systems.

Trying 3.14 can't hurt.

-- 
My Main Blog         http://etbe.coker.com.au/
My Documents Blog    http://doc.coker.com.au/



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09  3:38 ` Russell Coker
@ 2014-08-09 14:32   ` Andy Smith
  2014-08-09 14:58     ` Jose Ildefonso Camargo Tolosa
  0 siblings, 1 reply; 26+ messages in thread
From: Andy Smith @ 2014-08-09 14:32 UTC (permalink / raw)
  To: linux-btrfs

Hello,

On Sat, Aug 09, 2014 at 01:38:34PM +1000, Russell Coker wrote:
> On Fri, 8 Aug 2014 16:35:29 Jose Ildefonso Camargo Tolosa wrote:
> > Then, after reading here and there, decided to try to use a newer
> > kernel, tried 3.15.8.  Well, it is still mounting after ~16 hours, and
> > I got messages like these at first:
> 
> I recommend trying a 3.14 kernel.  I had ongoing problems with kernels before 
> 3.14 which included infinite loops in kernel space.  Based on reports on this 
> list I haven't been inclined to test 3.15 kernels.  But 3.14 has been working 
> well for me on many systems.

I'm in a similar position with a filesystem that won't mount except
read-only, but am already on 3.14 and am also wondering whether to
try a 3.16 kernel.

    https://bugzilla.kernel.org/show_bug.cgi?id=81981

Jose, maybe you could try -oro in the hope of at least getting back
to a read-only mount?

Cheers,
Andy

-- 
"I remember the first time I made love.  Perhaps it was not love exactly but I
 made it and it still works." — The League Against Tedium


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 14:32   ` Andy Smith
@ 2014-08-09 14:58     ` Jose Ildefonso Camargo Tolosa
  2014-08-09 16:06       ` Jose Ildefonso Camargo Tolosa
  0 siblings, 1 reply; 26+ messages in thread
From: Jose Ildefonso Camargo Tolosa @ 2014-08-09 14:58 UTC (permalink / raw)
  To: Andy Smith; +Cc: linux-btrfs

On Sat, Aug 9, 2014 at 9:32 AM, Andy Smith <andy@strugglers.net> wrote:
> Hello,
>
> On Sat, Aug 09, 2014 at 01:38:34PM +1000, Russell Coker wrote:
>> On Fri, 8 Aug 2014 16:35:29 Jose Ildefonso Camargo Tolosa wrote:
>> > Then, after reading here and there, decided to try to use a newer
>> > kernel, tried 3.15.8.  Well, it is still mounting after ~16 hours, and
>> > I got messages like these at first:
>>
>> I recommend trying a 3.14 kernel.  I had ongoing problems with kernels before
>> 3.14 which included infinite loops in kernel space.  Based on reports on this
>> list I haven't been inclined to test 3.15 kernels.  But 3.14 has been working
>> well for me on many systems.
>
> I'm in a similar position with a filesystem that won't mount except
> read-only, but am already on 3.14 and am also wondering whether to
> try a 3.16 kernel.
>
>     https://bugzilla.kernel.org/show_bug.cgi?id=81981
>
> Jose, maybe you could try -oro in the hope of at least getting back
> to a read-only mount?

Will try 3.14.  ro would be "good enough" for me, provided that I can
resize the filesystem: if I can do that, I can create a new one and
copy all the data (hopefully faster than moving ~11TB of data through the
network).
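
To put the ~2 week estimate in perspective, here is a back-of-the-envelope sketch of the average throughput it implies. The ~11 TB and ~2 weeks figures are from this thread; the 100 MB/s local copy rate is purely an assumed number for comparison, not a measurement, and both function names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def implied_throughput_mb_s(total_bytes, days):
    """Average rate in MB/s (decimal) needed to move total_bytes in `days`."""
    return total_bytes / (days * SECONDS_PER_DAY) / 1e6


def days_to_copy(total_bytes, mb_per_s):
    """Days needed to move total_bytes at a sustained rate of mb_per_s MB/s."""
    return total_bytes / (mb_per_s * 1e6) / SECONDS_PER_DAY


DATA_BYTES = 11e12          # ~11 TB as quoted in the thread (decimal TB assumed)
NETWORK_DAYS = 14           # ~2 weeks, as quoted in the thread
ASSUMED_LOCAL_MB_S = 100.0  # hypothetical local copy rate, for comparison only

net_rate = implied_throughput_mb_s(DATA_BYTES, NETWORK_DAYS)
local_days = days_to_copy(DATA_BYTES, ASSUMED_LOCAL_MB_S)
print(f"network copy implies roughly {net_rate:.1f} MB/s on average")
print(f"a local copy at {ASSUMED_LOCAL_MB_S:.0f} MB/s would take roughly {local_days:.1f} days")
```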


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 14:58     ` Jose Ildefonso Camargo Tolosa
@ 2014-08-09 16:06       ` Jose Ildefonso Camargo Tolosa
  2014-08-09 17:01         ` Duncan
  0 siblings, 1 reply; 26+ messages in thread
From: Jose Ildefonso Camargo Tolosa @ 2014-08-09 16:06 UTC (permalink / raw)
  To: Andy Smith; +Cc: linux-btrfs

Re-sending to list.

On Sat, Aug 9, 2014 at 9:58 AM, Jose Ildefonso Camargo Tolosa
<ildefonso.camargo@gmail.com> wrote:
> On Sat, Aug 9, 2014 at 9:32 AM, Andy Smith <andy@strugglers.net> wrote:
>> Hello,
>>
>> On Sat, Aug 09, 2014 at 01:38:34PM +1000, Russell Coker wrote:
>>> On Fri, 8 Aug 2014 16:35:29 Jose Ildefonso Camargo Tolosa wrote:
>>> > Then, after reading here and there, decided to try to use a newer
>>> > kernel, tried 3.15.8.  Well, it is still mounting after ~16 hours, and
>>> > I got messages like these at first:
>>>
>>> I recommend trying a 3.14 kernel.  I had ongoing problems with kernels before
>>> 3.14 which included infinite loops in kernel space.  Based on reports on this
>>> list I haven't been inclined to test 3.15 kernels.  But 3.14 has been working
>>> well for me on many systems.
>>
>> I'm in a similar position with a filesystem that won't mount except
>> read-only, but am already on 3.14 and am also wondering whether to
>> try a 3.16 kernel.
>>
>>     https://bugzilla.kernel.org/show_bug.cgi?id=81981
>>
>> Jose, maybe you could try -oro in the hope of at least getting back
>> to a read-only mount?
>
> Will try 3.14, ro would be "good enough" for me, provided that I can
> resize the filesystem, if I can do that, I can create a new one, and
> copy all data (hopefully faster than moving ~11TB of data through the
> network).

Or maybe 3.16? sigh.... I have them both ready, but I am not sure
which one to try.  My fear is that if I go to 3.16 (still in
development), would I be able to go back to, say, 3.14 and work with
the filesystem there?  According to documents, disk format is "stable"
now.

What do you say? 3.14 or 3.16 for my next attempt?  (I have just today:
if I can't get this FS back to life today, I will blow it away and start
over, with the ~1.5 weeks copy period ahead of me.)

-- 
Ildefonso Camargo
Command Prompt, Inc. - http://www.commandprompt.com/
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC
@cmdpromptinc - 509-416-6579


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 16:06       ` Jose Ildefonso Camargo Tolosa
@ 2014-08-09 17:01         ` Duncan
  2014-08-09 18:21           ` Marc MERLIN
  2014-08-09 18:38           ` 40TB volume taking over 16 hours to mount, any ideas? Jose Ildefonso Camargo Tolosa
  0 siblings, 2 replies; 26+ messages in thread
From: Duncan @ 2014-08-09 17:01 UTC (permalink / raw)
  To: linux-btrfs

Jose Ildefonso Camargo Tolosa posted on Sat, 09 Aug 2014 11:06:37 -0500 as
excerpted:

> 3.16 (still in development)

??

3.16 has been out for nearly a week now and we're nearing half-way thru 
the 3.17 commit-window.  Based on the kernel git I have here, Linus' 
commit officially changing the makefile entry to 3.16 was on Sunday, Aug 
3, at 15:25:02 -0700.

The last pre-3.16 commit was a merge of two timer-related fixes from the 
tip-tree at 9:58:20 -0700 that morning.

So where does your "still in development" come from?

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 17:01         ` Duncan
@ 2014-08-09 18:21           ` Marc MERLIN
  2014-08-10  4:03             ` Duncan
  2014-08-10 12:43             ` Holger Hoffstätte
  2014-08-09 18:38           ` 40TB volume taking over 16 hours to mount, any ideas? Jose Ildefonso Camargo Tolosa
  1 sibling, 2 replies; 26+ messages in thread
From: Marc MERLIN @ 2014-08-09 18:21 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Sat, Aug 09, 2014 at 05:01:24PM +0000, Duncan wrote:
> Jose Ildefonso Camargo Tolosa posted on Sat, 09 Aug 2014 11:06:37 -0500 as
> excerpted:
> 
> > 3.16 (still in development)
> 
> ??
> 
> 3.16 has been out for nearly a week now and we're nearing half-way thru 
> the 3.17 commit-window.  Based on the kernel git I have here, Linus' 
> commit officially changing the makefile entry to 3.16 was on Sunday, Aug 
> 3, at 15:25:02 -0700.
> 
> The last pre-3.16 commit was a merge of two timer-related fixes from the 
> tip-tree at 9:58:20 -0700 that morning.
> 
> So where does your "still in development" come from?

You could argue that since 3.16.0 does not have the patch for the recently
found deadlock that's been plaguing 15 and 16 (14 not as much for me),
it's not usable for some (it ran about 1 day on my laptop before
deadlocking, and maybe an hour at most on my server).

I sure hope that deadlock patch is going to be added to the 3.16.x tree,
I'm not super stoked with being stuck at 3.14.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 17:01         ` Duncan
  2014-08-09 18:21           ` Marc MERLIN
@ 2014-08-09 18:38           ` Jose Ildefonso Camargo Tolosa
  2014-08-09 21:02             ` Jose Ildefonso Camargo Tolosa
  2014-08-10  4:21             ` Duncan
  1 sibling, 2 replies; 26+ messages in thread
From: Jose Ildefonso Camargo Tolosa @ 2014-08-09 18:38 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Sat, Aug 9, 2014 at 12:01 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Jose Ildefonso Camargo Tolosa posted on Sat, 09 Aug 2014 11:06:37 -0500 as
> excerpted:
>
>> 3.16 (still in development)
>
> ??
>
> 3.16 has been out for nearly a week now and we're nearing half-way thru
> the 3.17 commit-window.  Based on the kernel git I have here, Linus'
> commit officially changing the makefile entry to 3.16 was on Sunday, Aug
> 3, at 15:25:02 -0700.
>
> The last pre-3.16 commit was a merge of two timer-related fixes from the
> tip-tree at 9:58:20 -0700 that morning.
>
> So where does your "still in development" come from?
>

Well, maybe not the right word, but here is what kernel.org says about
mainline kernels:

"Mainline tree is maintained by Linus Torvalds. It's the tree where
all new features are introduced and where all the exciting new
development happens. New mainline kernels are released every 2-3
months."

So, there you go: all new features are introduced, and where all the
exciting new development happens.

So... development is quite active on mainline kernels.


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 18:38           ` 40TB volume taking over 16 hours to mount, any ideas? Jose Ildefonso Camargo Tolosa
@ 2014-08-09 21:02             ` Jose Ildefonso Camargo Tolosa
  2014-08-10  3:58               ` Jose Ildefonso Camargo Tolosa
  2014-08-10  4:21             ` Duncan
  1 sibling, 1 reply; 26+ messages in thread
From: Jose Ildefonso Camargo Tolosa @ 2014-08-09 21:02 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

3.14.16 test is on its way, it already started with this:

[19732.769100] BTRFS: device fsid 7356e329-62ba-49fb-83cc-f6b91ac3b581
devid 1 transid 111580 /dev/sdb1
[19732.769429] BTRFS info (device sdb1): enabling auto recovery
[19732.769433] BTRFS info (device sdb1): force clearing of disk cache
[20050.137779] INFO: task btrfs-transacti:7353 blocked for more than
120 seconds.
[20050.139361]       Not tainted 3.14.16-031416-generic #201408072035
[20050.140704] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[20050.142422] btrfs-transacti D ffffffff818118e0     0  7353      2 0x00000000
[20050.142430]  ffff880450afddc8 0000000000000002 ffff880450afdd68
ffff880450afdfd8
[20050.142434]  0000000000014500 0000000000014500 ffff88046985e380
ffff8804602018e0
[20050.142437]  ffff880450afddd8 ffff8808642fc000 ffff8802aa5b8800
ffff880450afde00
[20050.142440] Call Trace:
[20050.142447]  [<ffffffff8175b0c9>] schedule+0x29/0x70
[20050.142473]  [<ffffffffa01040ed>]
btrfs_commit_transaction+0x25d/0xa00 [btrfs]
[20050.142482]  [<ffffffff810b4e10>] ? __wake_up_sync+0x20/0x20
[20050.142493]  [<ffffffffa0101e45>] transaction_kthread+0x1d5/0x250 [btrfs]
[20050.142504]  [<ffffffffa0101c70>] ? open_ctree+0x20d0/0x20d0 [btrfs]
[20050.142507]  [<ffffffff8108fd89>] kthread+0xc9/0xe0
[20050.142509]  [<ffffffff8108fcc0>] ? flush_kthread_worker+0xb0/0xb0
[20050.142513]  [<ffffffff817681bc>] ret_from_fork+0x7c/0xb0
[20050.142515]  [<ffffffff8108fcc0>] ? flush_kthread_worker+0xb0/0xb0
[20170.194168] INFO: task btrfs-transacti:7353 blocked for more than
120 seconds.
[20170.195747]       Not tainted 3.14.16-031416-generic #201408072035
[20170.197090] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[20170.198815] btrfs-transacti D ffffffff818118e0     0  7353      2 0x00000000
[20170.198820]  ffff880450afddc8 0000000000000002 ffff880450afdd68
ffff880450afdfd8
[20170.198822]  0000000000014500 0000000000014500 ffff88046985e380
ffff8804602018e0
[20170.198824]  ffff880450afddd8 ffff8808642fc000 ffff8802aa5b8800
ffff880450afde00
[20170.198824] Call Trace:
[20170.198831]  [<ffffffff8175b0c9>] schedule+0x29/0x70
[20170.198856]  [<ffffffffa01040ed>]
btrfs_commit_transaction+0x25d/0xa00 [btrfs]
[20170.198861]  [<ffffffff810b4e10>] ? __wake_up_sync+0x20/0x20
[20170.198875]  [<ffffffffa0101e45>] transaction_kthread+0x1d5/0x250 [btrfs]
[20170.198886]  [<ffffffffa0101c70>] ? open_ctree+0x20d0/0x20d0 [btrfs]
[20170.198889]  [<ffffffff8108fd89>] kthread+0xc9/0xe0
[20170.198891]  [<ffffffff8108fcc0>] ? flush_kthread_worker+0xb0/0xb0
[20170.198895]  [<ffffffff817681bc>] ret_from_fork+0x7c/0xb0
[20170.198897]  [<ffffffff8108fcc0>] ? flush_kthread_worker+0xb0/0xb0
[20290.250561] INFO: task btrfs-transacti:7353 blocked for more than
120 seconds.
[20290.252140]       Not tainted 3.14.16-031416-generic #201408072035
[20290.253483] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[20290.282212] btrfs-transacti D ffffffff818118e0     0  7353      2 0x00000000
[20290.282216]  ffff880450afddc8 0000000000000002 ffff880450afdd68
ffff880450afdfd8
[20290.282219]  0000000000014500 0000000000014500 ffff88046985e380
ffff8804602018e0
[20290.282221]  ffff880450afddd8 ffff8808642fc000 ffff8802aa5b8800
ffff880450afde00
[20290.282221] Call Trace:
[20290.282227]  [<ffffffff8175b0c9>] schedule+0x29/0x70
[20290.282253]  [<ffffffffa01040ed>]
btrfs_commit_transaction+0x25d/0xa00 [btrfs]
[20290.282262]  [<ffffffff810b4e10>] ? __wake_up_sync+0x20/0x20
[20290.282272]  [<ffffffffa0101e45>] transaction_kthread+0x1d5/0x250 [btrfs]
[20290.282283]  [<ffffffffa0101c70>] ? open_ctree+0x20d0/0x20d0 [btrfs]
[20290.282286]  [<ffffffff8108fd89>] kthread+0xc9/0xe0
[20290.282289]  [<ffffffff8108fcc0>] ? flush_kthread_worker+0xb0/0xb0
[20290.282292]  [<ffffffff817681bc>] ret_from_fork+0x7c/0xb0
[20290.282294]  [<ffffffff8108fcc0>] ? flush_kthread_worker+0xb0/0xb0


I'll allow it to run for a few hours, and then will report.

On a side-note, I ran 'btrfs check' and it returned so many errors
that they scrolled out of my console's history... unfortunately I didn't
redirect its output to a file (big mistake); I didn't think it would
be so big.  Anyway, part of the output:

(.... older output lost due to term size ....)
root 5 inode 94906683 errors 200, dir isize wrong
root 5 inode 94906716 errors 200, dir isize wrong
root 5 inode 94906730 errors 200, dir isize wrong
root 5 inode 94906735 errors 200, dir isize wrong
root 5 inode 94906758 errors 200, dir isize wrong
(....)
root 5 inode 94928259 errors 200, dir isize wrong
root 5 inode 94928286 errors 200, dir isize wrong
root 5 inode 94928311 errors 200, dir isize wrong
root 5 inode 94928321 errors 200, dir isize wrong
root 5 inode 133964681 errors 200, dir isize wrong
root 5 inode 133964684 errors 200, dir isize wrong
root 5 inode 142590710 errors 200, dir isize wrong
root 5 inode 144973646 errors 200, dir isize wrong
root 5 inode 146401067 errors 100, file extent discount
root 5 inode 146401080 errors 100, file extent discount
root 5 inode 146401096 errors 100, file extent discount
root 5 inode 146401108 errors 100, file extent discount
(.....)
root 5 inode 146436466 errors 100, file extent discount
root 5 inode 146436478 errors 100, file extent discount
root 5 inode 146436507 errors 100, file extent discount
root 5 inode 146436538 errors 100, file extent discount
root 5 inode 146436569 errors 100, file extent discount
root 5 inode 146436570 errors 100, file extent discount
root 5 inode 146436581 errors 100, file extent discount
root 5 inode 146436601 errors 100, file extent discount
root 5 inode 146436627 errors 100, file extent discount
root 5 inode 146938464 errors 100, file extent discount
root 5 inode 146953375 errors 100, file extent discount
root 5 inode 147034342 errors 100, file extent discount
found 3028618030140 bytes used err is 1
total csum bytes: 10600828544
total tree bytes: 69761568768
total fs tree bytes: 51199995904
total extent tree bytes: 6283046912
btree space waste bytes: 13467432119
file data blocks allocated: 52622775431168
 referenced 11853450313728
Btrfs v3.14.2

real    266m24.291s    <---- this is why I do not want to run it again, unless absolutely necessary.
user    40m8.369s
sys     15m37.105s
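
Since the full listing scrolled away, a summary by error type may be more useful than the raw lines on any future run. This is a minimal sketch, assuming the output of btrfs check was saved to a file or string first; the function name tally_errors is illustrative, and the error code is treated as an opaque string rather than decoded.

```python
import re
from collections import Counter

# Matches lines such as "root 5 inode 94906683 errors 200, dir isize wrong"
ERR_RE = re.compile(r"^root (\d+) inode (\d+) errors ([0-9a-f]+), (.+)$")


def tally_errors(check_output):
    """Count per-inode error lines, grouped by (error code, description)."""
    counts = Counter()
    for line in check_output.splitlines():
        match = ERR_RE.match(line.strip())
        if match:
            counts[(match.group(3), match.group(4))] += 1
    return counts


# A few lines copied from the output above; the summary lines are ignored.
sample = """\
root 5 inode 94906683 errors 200, dir isize wrong
root 5 inode 94906716 errors 200, dir isize wrong
root 5 inode 146401067 errors 100, file extent discount
found 3028618030140 bytes used err is 1
"""

for (code, text), n in tally_errors(sample).most_common():
    print(f"{n:6d}  errors {code}: {text}")
```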

Ildefonso


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 21:02             ` Jose Ildefonso Camargo Tolosa
@ 2014-08-10  3:58               ` Jose Ildefonso Camargo Tolosa
  2014-08-10  8:24                 ` Duncan
  0 siblings, 1 reply; 26+ messages in thread
From: Jose Ildefonso Camargo Tolosa @ 2014-08-10  3:58 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

And it is still going.... although the hung task message stopped long
ago (behavior similar to 3.15), it hasn't finished mounting, mount is
still taking 100% CPU, *and* I can't see any disk activity at all.
Last hung task message:

[21131.749759] INFO: task btrfs-transacti:7353 blocked for more than
120 seconds.
[21131.828755]       Not tainted 3.14.16-031416-generic #201408072035
[21131.868788] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[21131.947525] btrfs-transacti D ffffffff818118e0     0  7353      2 0x00000000
[21131.947530]  ffff880450afddc8 0000000000000002 ffff880450afdd68
ffff880450afdfd8
[21131.947535]  0000000000014500 0000000000014500 ffff88046985e380
ffff8804602018e0
[21131.947540]  ffff880450afddd8 ffff8808642fc000 ffff8802aa5b8800
ffff880450afde00
[21131.947544] Call Trace:
[21131.947551]  [<ffffffff8175b0c9>] schedule+0x29/0x70
[21131.947577]  [<ffffffffa01040ed>]
btrfs_commit_transaction+0x25d/0xa00 [btrfs]
[21131.947581]  [<ffffffff810b4e10>] ? __wake_up_sync+0x20/0x20
[21131.947591]  [<ffffffffa0101e45>] transaction_kthread+0x1d5/0x250 [btrfs]
[21131.947601]  [<ffffffffa0101c70>] ? open_ctree+0x20d0/0x20d0 [btrfs]
[21131.947604]  [<ffffffff8108fd89>] kthread+0xc9/0xe0
[21131.947606]  [<ffffffff8108fcc0>] ? flush_kthread_worker+0xb0/0xb0
[21131.947610]  [<ffffffff817681bc>] ret_from_fork+0x7c/0xb0
[21131.947612]  [<ffffffff8108fcc0>] ? flush_kthread_worker+0xb0/0xb0
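
A quick check on the timestamps may explain why the reports stopped. The sketch below pulls the hung-task timestamps out of dmesg-style text and estimates how many reports were printed between the first and the last one quoted in this thread, assuming they kept arriving once per 120-second timeout (the intermediate reports are not shown here, so that is an assumption). The function names are illustrative.

```python
import re

# Matches the first line of a hung-task report, e.g.
# "[20050.137779] INFO: task btrfs-transacti:7353 blocked for more than"
STAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\] INFO: task \S+ blocked for more than")


def hung_task_stamps(dmesg_text):
    """Return the timestamps (seconds since boot) of hung-task reports."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = STAMP_RE.match(line)
        if match:
            stamps.append(float(match.group(1)))
    return stamps


def estimated_reports(first, last, period=120.0):
    """Estimate the report count, assuming one report per `period` seconds."""
    return round((last - first) / period) + 1


# First lines of the four 3.14.16 reports quoted in this thread.
sample = """\
[20050.137779] INFO: task btrfs-transacti:7353 blocked for more than
[20170.194168] INFO: task btrfs-transacti:7353 blocked for more than
[20290.250561] INFO: task btrfs-transacti:7353 blocked for more than
[21131.749759] INFO: task btrfs-transacti:7353 blocked for more than
"""

stamps = hung_task_stamps(sample)
gaps = [round(b - a, 1) for a, b in zip(stamps, stamps[1:])]
print("gaps between quoted reports (s):", gaps)
print("estimated reports in total:", estimated_reports(stamps[0], stamps[-1]))
```

The estimate matches the kernel's default limit of 10 hung-task reports (kernel.hung_task_warnings). If that limit is what silenced the messages, their absence does not by itself mean the task became unblocked.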

Do you think I will have better luck with 3.16? or maybe it is that
this filesystem has so many errors (remember the btrfs check output)
that it will take a really long time to mount because it is trying to
correct this?

Thanks!

Ildefonso


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 18:21           ` Marc MERLIN
@ 2014-08-10  4:03             ` Duncan
  2014-08-10 12:43             ` Holger Hoffstätte
  1 sibling, 0 replies; 26+ messages in thread
From: Duncan @ 2014-08-10  4:03 UTC (permalink / raw)
  To: linux-btrfs

Marc MERLIN posted on Sat, 09 Aug 2014 11:21:13 -0700 as excerpted:

> You could argue that since 3.16.0 does not have the patch for the recently
> found deadlock that's been plaguing 15 and 16 (14 not as much for me),
> it's not usable for some (it ran about 1 day on my laptop before
> deadlocking, and maybe an hour at most on my server).
> 
> I sure hope that deadlock patch is going to be added to the 3.16.x tree,
> I'm not super stoked with being stuck at 3.14.

Well, yes.

It'll almost certainly make it to the stable series, including 3.16.x,
shortly after it ends up in the 3.17 development tree.  But the switch to
worker-threads was only with 3.15, so anything previous to that doesn't
need it (thus 3.14 working well for you; previous versions had other
bugs).  3.15 isn't a long-term-stable, and Greg KH already warned that
the just-Friday-released 3.15.9 is its penultimate release and that people
should be thinking about switching to 3.16.  So pre-3.15 the patch isn't
needed, and whether it'll make it into 3.15.10, the last 3.15-series
release, is questionable at this point.  That leaves 3.17-development or
presumably 3.16.1 or 3.16.2 as the soonest it'll possibly happen for people
not willing to cherrypick the patch from the list as soon as posted.

FWIW, 3.15 (where I didn't have time to try the development series and
only upgraded about the time it came out) and the 3.16 development series,
including the 3.16.0 release, have worked well enough for me.  But my btrfs
are all on ssd, the ones I regularly mount all being raid1-pairs, and
apparently, on my 6-core at least, the bug is hard enough to trigger on
ssd, and I don't routinely push them hard enough to have seen it.  That
would explain why I've not had problems with 3.15 and the 3.16 series up
thru the 3.16.0 release, beyond an instance that was either right about the
3.15 release or in 3.14, and might have been a one-off, as it certainly was
for me.

Tho while the problem has been pretty well traced so we know what it is, 
I'm not sure that a full patch for it has yet been posted on the list, 
has it?  I think it was nailed down too late in the week to prepare and 
pre-post test a patch before the weekend.  So I'd expect to see the patch 
on the list on Tuesday or so, just in time to make the last bit of the 
3.17 commit window (tho it's a stable-candidate fix so could go in later 
as well), but likely too late to make 3.15.10 and 3.16.1, so 3.17-rc1 or 
3.16.2 it'll likely be.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 18:38           ` 40TB volume taking over 16 hours to mount, any ideas? Jose Ildefonso Camargo Tolosa
  2014-08-09 21:02             ` Jose Ildefonso Camargo Tolosa
@ 2014-08-10  4:21             ` Duncan
  2014-08-10  4:57               ` Mitch Harder
  1 sibling, 1 reply; 26+ messages in thread
From: Duncan @ 2014-08-10  4:21 UTC (permalink / raw)
  To: linux-btrfs

Jose Ildefonso Camargo Tolosa posted on Sat, 09 Aug 2014 13:38:46 -0500 as
excerpted:

> On Sat, Aug 9, 2014 at 12:01 PM, Duncan <1i5t5.duncan@cox.net> wrote:
>> Jose Ildefonso Camargo Tolosa posted on Sat, 09 Aug 2014 11:06:37 -0500
>> as excerpted:
>>
>>> 3.16 (still in development)
>>
>> ??
>>
>> 3.16 has been out for nearly a week now and we're nearing half-way thru
>> the 3.17 commit-window.  Based on the kernel git I have here, Linus'
>> commit officially changing the makefile entry to 3.16 was on Sunday,
>> Aug 3, at 15:25:02 -0700.
>>
>> The last pre-3.16 commit was a merge of two timer-related fixes from
>> the tip-tree at 9:58:20 -0700 that morning.
>>
>> So where does your "still in development" come from?
>>
>>
> Well, maybe not the right word, but here is what kernel.org says about
> mainline kernels:
> 
> "Mainline tree is maintained by Linus Torvalds. It's the tree where all
> new features are introduced and where all the exciting new development
> happens. New mainline kernels are released every 2-3 months."
> 
> So, there you go: all new features are introduced, and where all the
> exciting new development happens.
> 
> So... development is quite active on mainline kernels.

But 3.16.0 is out, and the real active development is in the commit 
window pre-rc1.  A kernel doesn't really /start/ settling down until rc3 
or so, and isn't reasonably stable until rc5 or so.  (Rc5 is a little 
late to start testing and reporting bugs to have fixed by release, tho.  
It's really best to start testing around rc3 or so, at which point any 
real bad data-eating-risk bugs should be either fixed or at least 
published, so the risk is dramatically lower than it would be during the 
commit window itself, for instance.)  But from rc5 on thru rc7 or 8 and 
release, unless you're one of the ones still waiting on a bug found 
earlier to be fixed, it's generally quite stable and boring.

So by the time of actual .0 release, it really is quite stable, and no 
longer a development kernel.  Sure, Greg KH's stable series kernel 
releases stabilize it further, but that's exactly what they are, stable 
series, not development series.  There's really no development going into 
it generally from rc1 on, tho occasionally something that needs to come 
after everything else is slipped in in the first couple days after rc1, 
but still well before rc2.  The .0 release signifies the end of the 
post-development stabilization period, such that .0 really is no longer a 
development kernel at all, even if there are a few more weekly 
stable-series updates before support ceases if it's not a 
long-term-stable candidate.  (There are about 10 of those; with the 
Friday-released 3.15.9, 3.15.10 was announced to be the last one for 
3.15.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-10  4:21             ` Duncan
@ 2014-08-10  4:57               ` Mitch Harder
  2014-08-10  7:21                 ` Duncan
  0 siblings, 1 reply; 26+ messages in thread
From: Mitch Harder @ 2014-08-10  4:57 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Sat, Aug 9, 2014 at 11:21 PM, Duncan <1i5t5.duncan@cox.net> wrote:

> ....  But from rc5 on thru rc7 or 8 and
> release, unless you're one of the ones still waiting on a bug found
> earlier to be fixed, it's generally quite stable and boring.
>
> So by the time of actual .0 release, it really is quite stable, and no
> longer development kernel.  Sure, Greg KH's stable series kernel releases
> stabilize it further, but that's exactly what they are, stable series,
> not development series, and there's really no development going into it
> generally from rc1 on, tho occasionally something that needs to come
> after everything else is slipped in in the first couple days after rc1,
> but still well before rc2, and the .0 release signifies the end of the
> post development stabilization period such that .0 really is no longer a
> development kernel at all, even if there are a few more weekly stable-
> series updates (about 10, 3.15.10 was announced to be the last one for
> 3.15, with the Friday-released 3.15.9) before support ceases if it's not
> a long-term-stable candidate.
>

I can't say I've observed that to be the case with Btrfs.  I know
there is a core group of developers working very hard on testing the
Btrfs updates in the _rc kernels, but once that .0 kernel hits the
streets, the extra exposure to all the various combinations of
hardware and options has been known to discover new issues.  I think
this is nearly unavoidable given the pace of Btrfs development.


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-10  4:57               ` Mitch Harder
@ 2014-08-10  7:21                 ` Duncan
  0 siblings, 0 replies; 26+ messages in thread
From: Duncan @ 2014-08-10  7:21 UTC (permalink / raw)
  To: linux-btrfs

Mitch Harder posted on Sat, 09 Aug 2014 23:57:19 -0500 as excerpted:

> On Sat, Aug 9, 2014 at 11:21 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> 
>> So by the time of actual .0 release, [the kernel] really is quite
>> stable, and no longer development kernel.
>>
> I can't say I've observed that to be the case with Btrfs.  I know there
> is a core group of developers working very hard on testing the Btrfs
> updates in the _rc kernels, but once that .0 kernel hits the streets,
> the extra exposure to all the various combinations of hardware and
> options has been known to discover new issues.  I think this is nearly
> unavoidable given the pace of Btrfs development.

That's because, despite the (IMO premature) recent removal of all the 
warnings to the contrary, btrfs itself isn't stable yet.  I'd argue that 
a 3.x.0 kernel is in general more stable than any btrfs to date, tho in 
both cases there are certainly corner-cases that are markedly more 
unstable (if they run at all) than the general case.

Which means at this point it's a rather dramatic stability inversion to 
be afraid of 3.x.0 kernels while all the while running btrfs on the same 
systems.

(It is worth noting, however, that say a temperature sensor driver or a 
camera driver could get away with the 
working-for-most-people-most-of-the-time level of stability that is btrfs 
at this point, and be considered reasonably stable.  But people tend to 
be rather more 
conservative when it's their data, not just a temperature sample or a 
camera shot here or there, going missing, and filesystems therefore have 
a rather higher threshold definition to hit for really being stable.  And 
that's as it /should/ be, because it /is/ people's data in the balance.)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-10  3:58               ` Jose Ildefonso Camargo Tolosa
@ 2014-08-10  8:24                 ` Duncan
  2014-08-10  8:50                   ` Timofey Titovets
  2014-08-10 16:25                   ` Chris Murphy
  0 siblings, 2 replies; 26+ messages in thread
From: Duncan @ 2014-08-10  8:24 UTC (permalink / raw)
  To: linux-btrfs

Jose Ildefonso Camargo Tolosa posted on Sat, 09 Aug 2014 22:58:37 -0500 as
excerpted:

> Do you think I will have better luck with 3.16? or maybe it is that this
> filesystem has so many errors (remember the btrfs check output) that it
> will take a really long time to mount because it is trying to correct
> this?

As a user I'd give up on that mount.

There are two critical patches in the pipeline ATM, that should hopefully 
hit 3.17-rc1 next weekend.  One is already posted (see the "Fix csum 
tree corruption" patch first posted just under 12 hours ago as I type 
this), but that's a longer term fix.  The other was traced down late last 
week but I don't believe a proper patch has been posted yet.  That's the 
one you likely need here.  Of course you could cherrypick it when posted.

Tho either way I think it's likely that the filesystem is toast and 
you'll end up doing a mkfs on it, hopefully with those patches helping to 
prevent a repeat.

What I'd try at this point is btrfs restore, tho you'll need somewhere 
else to put the restored data, and you'll have to redo file ownership and 
permissions as that's not restored, only the data files.

Or restore from backup, your choice, but you said it was remote and 
you're looking at over a week's worth of downloading.

Either way, the question then comes up of what to use when you do a new 
mkfs.  My personal feeling?  Btrfs isn't yet fully stable, and there's a 
very real possibility that one may have to restore from backup, so one 
should be prepared for that.  Given the size of the data store you're 
working with and the remote nature of that backup, with access over 
limited-speed pipes, I wonder if btrfs is really an appropriate choice 
for you at this point.  If you believe the features of btrfs and the 
chance to work with something so leading/bleeding edge are worth the 
current pain and are prepared to redo that restore again should it be 
necessary, then yes, btrfs is a good choice.  OTOH, if you want something 
that reliably "just works" at this point, consider a more mature 
filesystem.  It may not have btrfs' bells and whistles, but a boring 
"just works, reliably", might be what you want.

I guess xfs is the standard recommendation for big-data sizes and it is 
said to be long past the "better have a UPS" days, or of course the 
default ext4.  Personally I've had real good luck with reiserfs (since 
data=ordered by default at least, the early data=writeback days were 
where it got its bad rep), but it's better adapted to smaller files, 
while I'd guess with 40 TB, your files are likely big as well.

You can of course try btrfs again in a year or so, when it should have 
matured quite a bit.  I actually did that after my first try at btrfs, 
leaving for a time then coming back, and was impressed at how much it had 
matured in the mean time.  Additionally, my use-case was different as the 
first time I tried it I was still on spinning rust; now all my btrfs are 
on SSD, and I still use reiserfs for my spinning rust -- tho I've nowhere 
near the double-digit TB scale you're doing.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-10  8:24                 ` Duncan
@ 2014-08-10  8:50                   ` Timofey Titovets
  2014-08-10 10:16                     ` Duncan
  2014-08-10 16:25                   ` Chris Murphy
  1 sibling, 1 reply; 26+ messages in thread
From: Timofey Titovets @ 2014-08-10  8:50 UTC (permalink / raw)
  To: ildefonso.camargo; +Cc: linux-btrfs

Jose, I'll add my two cents.
I know that you want to back up the data from the RAID over the network,
and that you have only 11 TB of data on the 40 TB fs.
As far as I know, you can safely resize a btrfs fs, without btrfs fi resize,
something like this:
$ btrfs fi df /
Data, single: total=81.00GiB, used=60.33GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.50GiB, used=507.48MiB
GlobalReserve, single: total=176.00MiB, used=0.00B

- I can cut the partition to 81 GiB + 1.5 GiB + ~256 MiB = ~83 GiB, plus
1 GiB for safety = 84 GiB.
My data is not lost and btrfs still works after this.
Afterwards, you can use the free space on the disk to create a new fs and
back up the data to it without problems.
Developers, please correct me if I'm wrong.
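
The arithmetic above can be reproduced mechanically.  What follows is only a
minimal sketch, assuming the `btrfs fi df` text format shown in this message;
the names parse_fi_df and shrink_target_gib are hypothetical helpers, not part
of any btrfs tool.  It follows the message in simply summing the reported
"total" values, and it does not double the DUP chunks, which likely occupy
twice their reported size on the device, so treat the result as a lower bound.

```python
import math
import re

# Sample output copied from the message above.
FI_DF_OUTPUT = """\
Data, single: total=81.00GiB, used=60.33GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=1.50GiB, used=507.48MiB
GlobalReserve, single: total=176.00MiB, used=0.00B
"""

# Binary unit multipliers, expressed in GiB.
UNIT_IN_GIB = {
    "B": 1 / 1024**3,
    "KiB": 1 / 1024**2,
    "MiB": 1 / 1024,
    "GiB": 1.0,
    "TiB": 1024.0,
}

LINE_RE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<total_unit>[A-Za-z]+), "
    r"used=(?P<used>[\d.]+)(?P<used_unit>[A-Za-z]+)$"
)


def parse_fi_df(text):
    """Return {kind: (total_gib, used_gib)} for each recognised line."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that is not a chunk-type line
        total = float(match["total"]) * UNIT_IN_GIB[match["total_unit"]]
        used = float(match["used"]) * UNIT_IN_GIB[match["used_unit"]]
        result[match["kind"]] = (total, used)
    return result


def shrink_target_gib(chunks, safety_gib=1):
    """Sum the allocated totals, round up to a whole GiB, add a margin."""
    allocated = sum(total for total, _used in chunks.values())
    return math.ceil(allocated) + safety_gib


chunks = parse_fi_df(FI_DF_OUTPUT)
print(shrink_target_gib(chunks))
```

With the sample output the allocated totals sum to a little under 83 GiB, so
the 1 GiB margin used in the message gives the same 84 GiB figure.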


-- 
Best regards,
Timofey.


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-10  8:50                   ` Timofey Titovets
@ 2014-08-10 10:16                     ` Duncan
  0 siblings, 0 replies; 26+ messages in thread
From: Duncan @ 2014-08-10 10:16 UTC (permalink / raw)
  To: linux-btrfs

Timofey Titovets posted on Sun, 10 Aug 2014 11:50:39 +0300 as excerpted:

> Jose, I'll add my two cents.
> I know that you want to back up the data from the RAID over the network,
> and that you have only 11 TB of data on the 40 TB fs.  As far as I know,
> you can safely resize a btrfs fs, without btrfs fi resize, something like
> this:
> $ btrfs fi df /
> Data, single: total=81.00GiB, used=60.33GiB
> System, DUP: total=32.00MiB, used=16.00KiB
> Metadata, DUP: total=1.50GiB, used=507.48MiB
> GlobalReserve, single: total=176.00MiB, used=0.00B
> 
> - I can cut the partition to 81 GiB + 1.5 GiB + ~256 MiB = ~83 GiB, plus
> 1 GiB for safety = 84 GiB.
> My data is not lost and btrfs still works after this.
> Afterwards, you can use the free space on the disk to create a new fs and
> back up the data to it without problems.
> Developers, please correct me if I'm wrong.

You're correct, but two points, plus another less related one:

1) Most critical for the OP, he has to get the filesystem mounted and 
working first, and that's proving troublesome (tho with the patches I 
mentioned it might be doable).

2) While in theory you /could/ cut your filesystem even a bit further 
than that after a balance to trim down data usage (trimming it to ~61 GB 
allocated data area from the 81 GB current), given that btrfs can 
allocate chunks but requires a balance to deallocate, and data and 
metadata chunks can only be allocated from unallocated, not switched from 
one to the other directly, it's good to keep a reasonable safety margin 
of unallocated space.

The 1 GiB of unallocated space you propose (since you didn't mention a 
balance) is cutting it rather close, particularly since data chunks are 
normally 1 GiB.  On a filesystem around that size I'd recommend keeping 
10 GiB or so free at least, so maybe 95 GiB minimum in the example, or 
about 75 GiB minimum if you did a data-balance first, cutting data chunk 
allocation to 61 GiB and total allocated to ~63 GiB.

On a bit over 10 TiB of usage, however, I'd suggest keeping perhaps half 
a TiB spare/unallocated, so on 11 TiB data usage and perhaps half a TiB 
of metadata usage, resizing the filesystem no smaller than say 12 TiB, 
but 15 TiB would be somewhat more comfortable.

3) But that's a lot of data in a single filesystem basket.  Balancing or 
btrfs checking simply takes a LONG time at that size, let alone backup 
and restore of the entire thing.  So I'd suggest breaking it up into 
perhaps TiB-sized independent filesystems as well, if possible.  If say 
only half of them need to be mounted at a time, and half of those can be 
mounted read-only, then that's "only" 2-3 TiB of data to 
balance/check/restore if something goes wrong.  The rest with any luck 
should be unaffected as it wasn't mounted or was mounted read-only, and 
2-3 TiB will take a lot less time to process than 10+ TiB, for sure.
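
The sizing in point 2 comes down to "allocated space plus a margin of 
unallocated space".  A minimal sketch of that arithmetic, using only the 
round figures quoted in this message (min_resize_target is a hypothetical 
helper name, and the units are whatever the caller passes in):

```python
def min_resize_target(allocated, spare):
    """Smallest size to shrink to: allocated space plus unallocated headroom."""
    return allocated + spare


# Small example, in GiB: ~83 allocated now, ~63 after a data balance,
# with the suggested ~10 GiB kept unallocated in both cases.
print(min_resize_target(83, 10))   # before a balance
print(min_resize_target(63, 10))   # after a data balance

# Large example, in TiB: 11 data + ~0.5 metadata, with ~0.5 kept spare.
print(min_resize_target(11 + 0.5, 0.5))
```

The results (93 GiB, 73 GiB and 12 TiB) are the raw sums; the 95 GiB and 
75 GiB figures above are the same numbers rounded up a little.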

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-09 18:21           ` Marc MERLIN
  2014-08-10  4:03             ` Duncan
@ 2014-08-10 12:43             ` Holger Hoffstätte
  2014-08-10 14:39               ` Fixing the btrfs deadlocks Marc MERLIN
  1 sibling, 1 reply; 26+ messages in thread
From: Holger Hoffstätte @ 2014-08-10 12:43 UTC (permalink / raw)
  To: linux-btrfs

On Sat, 09 Aug 2014 11:21:13 -0700, Marc MERLIN wrote:

> I sure hope that deadlock patch is going to be added to the 3.16.x tree,
> I'm not super stoked with being stuck at 3.14.

Let me try to re-stoke your enthusiasm a bit :)

If you are comfortable with patching your own kernel you can take
a look at my custom patch queue, which I just put up at:
https://github.com/hhoffstaette/kernel-patches/tree/master/3.14
You can take only what you need, i.e. btrfs-*.

These are meant to be ironed over latest stable in order, and mostly
address "corner cases" that seem to happen infrequently, but still
are bad enough to leave a bad taste.

Unfortunately the btrfs code has changed significantly with 3.15+,
so later patches often no longer apply at all. However the ones that do
have been working well for me. I don't just blindly add them, but rather
at least try to understand what they do/change.

I sent out an email for comments about backporting this select set of
patches to 3.14, but got no replies. :(

-h



* Re: Fixing the btrfs deadlocks
  2014-08-10 12:43             ` Holger Hoffstätte
@ 2014-08-10 14:39               ` Marc MERLIN
  2014-08-10 15:42                 ` Holger Hoffstätte
  0 siblings, 1 reply; 26+ messages in thread
From: Marc MERLIN @ 2014-08-10 14:39 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: linux-btrfs

On Sun, Aug 10, 2014 at 12:43:31PM +0000, Holger Hoffstätte wrote:
> On Sat, 09 Aug 2014 11:21:13 -0700, Marc MERLIN wrote:
> 
> > I sure hope that deadlock patch is going to be added to the 3.16.x tree,
> > I'm not super stoked with being stuck at 3.14.
> 
> Let me try to re-stoke your enthusiasm a bit :)
> 
> If you are comfortable with patching your own kernel you can take
> a look at my custom patch queue, which I just put up at:
> https://github.com/hhoffstaette/kernel-patches/tree/master/3.14
> You can take only what you need, i.e. btrfs-*.
> 
> These are meant to be ironed over latest stable in order, and mostly
> address "corner cases" that seem to happen infrequently, but still
> are bad enough to leave a bad taste.
> 
> Unfortunately the btrfs code has changed significantly with 3.15+,
> so later patches often no longer apply at all. However the ones that do
> have been working well for me. I don't just blindly add them, but rather
> at least try to understand what they do/change.
> 
> I sent out an email for comments about backporting this select set of
> patches to 3.14, but got no replies. :(

My apologies if I missed some Emails, but I'm a bit confused.
The deadlocks happen reliably with 3.15+, but those patches are marked as
being for 3.14 in your URL, but then you say you didn't backport them to
3.14.

You lost me :)

That said, if you'd like me to try a set of patches that applies to
3.15.latest or 3.16 to see if they stop my frequent btrfs deadlocks, I'd be
happy to.
If the patches are meant for 3.14, that's not very helpful since 3.14 is
reasonably stable for me in comparison.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/  


* Re: Fixing the btrfs deadlocks
  2014-08-10 14:39               ` Fixing the btrfs deadlocks Marc MERLIN
@ 2014-08-10 15:42                 ` Holger Hoffstätte
  2014-08-10 16:36                   ` Marc MERLIN
  0 siblings, 1 reply; 26+ messages in thread
From: Holger Hoffstätte @ 2014-08-10 15:42 UTC (permalink / raw)
  To: linux-btrfs

On Sun, 10 Aug 2014 07:39:00 -0700, Marc MERLIN wrote:

> My apologies if I missed some Emails, but I'm a bit confused.
> The deadlocks happen reliably with 3.15+, but those patches are marked as
> being for 3.14 in your URL, but then you say you didn't backport them to
> 3.14.

sigh :)

My patch queue is meant for 3.14 only. The notorious hangs in 3.15+
are a different issue, likely caused (as we have seen) by the workqueue
changes; I was only interested in keeping & improving 3.14.x, precisely
because 3.15 is still borked and short-lived, and 3.16 has exciting new,
completely btrfs-unrelated problems for which I really don't have much
patience at the moment.

> That said, if you'd like me to try a set of patches that applies to
> 3.15.latest or 3.16 to see if they stop my frequent btrfs deadlocks, I'd be
> happy to.

I meant you could try my patch queue if you intend to use 3.14.x for a
longer period of time since it's a longterm kernel. If and when The Great
Hang has been fixed I might look at 3.16+ again, but thanks to the quite
unhelpful general btrfs backport policy I'm not holding my breath.
"Simply use the latest kernel" is laughably impractical for many reasons,
and for the majority of people it's just easier to not use btrfs at all -
which helps nobody.

The - admittedly poorly worded - "backporting" referred to getting a set
of identified patches (aka my queue) into 3.14-longterm via Greg KH; there
is nothing to do except apply them.

> If the patches are meant for 3.14, that's not very helpful since 3.14 is
> reasonably stable for me in comparison.

..which is precisely what I said and why I'm using it. :)
That does not mean problems identified post-3.14 cannot or should not
be patched if they are being addressed in other trees and apply easily.

-h



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-10  8:24                 ` Duncan
  2014-08-10  8:50                   ` Timofey Titovets
@ 2014-08-10 16:25                   ` Chris Murphy
  2014-08-11 21:33                     ` Jose Ildefonso Camargo Tolosa
  1 sibling, 1 reply; 26+ messages in thread
From: Chris Murphy @ 2014-08-10 16:25 UTC (permalink / raw)
  To: Btrfs BTRFS


On Aug 10, 2014, at 2:24 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> 
> Either way, the question then comes up of what to use when you do a new 
> mkfs.  My personal feeling?  Btrfs isn't yet fully stable, and there's a 
> very real possibility that one may have to restore from backup, so one 
> should be prepared for that.

Jose said early on that he is prepared for this, but nevertheless a week is a rather long time for a restore.

>  Given the size of the data store you're 
> working with and the remote nature of that backup, with access over 
> limited-speed pipes, I wonder if btrfs is really an appropriate choice 
> for you at this point.

We don't know if this backup is extra icing on the cake. Yes it's a week of restore, but it still might be entirely dispensable; e.g. 1 of 2+ glusterfs georep slaves and at least one of the others is XFS based.

> I guess xfs is the standard recommendation for big-data sizes and it is 
> said to be long past the "better have a UPS" days, or of course the 
> default ext4.  

XFS is now the default filesystem for RHEL 7 and Fedora 21 Server, likely also for Fedora 21 Cloud. This includes /boot.

> 
> You can of course try btrfs again in a year or so, when it should have 
> matured quite a bit.  I actually did that after my first try at btrfs, 
> leaving for a time then coming back, and was impressed at how much it had 
> matured in the mean time.

I'm looking back to Aug 2011 when Fedora 16 was slated to have Btrfs by default.


Chris Murphy



* Re: Fixing the btrfs deadlocks
  2014-08-10 15:42                 ` Holger Hoffstätte
@ 2014-08-10 16:36                   ` Marc MERLIN
  0 siblings, 0 replies; 26+ messages in thread
From: Marc MERLIN @ 2014-08-10 16:36 UTC (permalink / raw)
  To: Holger Hoffstätte; +Cc: linux-btrfs

On Sun, Aug 10, 2014 at 03:42:09PM +0000, Holger Hoffstätte wrote:
> On Sun, 10 Aug 2014 07:39:00 -0700, Marc MERLIN wrote:
> 
> > My apologies if I missed some Emails, but I'm a bit confused.
> > The deadlocks happen reliably with 3.15+, but those patches are marked as
> > being for 3.14 in your URL, but then you say you didn't backport them to
> > 3.14.
> 
> sigh :)
> 
> My patch queue is meant for 3.14 only. The notorious hangs in 3.15+
> are a different issue, as we have seen likely caused by the workqueue
> changes; I was only interested in keeping & improving 3.14.x, precisely
> because 3.15 is still borked and short-lived, and 3.16 has exciting new,
> completely btrfs-unrelated problems for which I really don't have much
> patience at the moment.
 
Aaah, gotcha, and to be clear, 3.16 has the same hang problems so far
(or maybe even worse).

> I meant you could try my patch queue if you intend to use 3.14.x for a
> longer period of time since it's a longterm kernel. If and when The Great

Ok, I understand you now.

> "Simply use the latest kernel" is laughably impractical for many reasons,
> and for the majority of people it's just easier to not use btrfs at all -
> which helps nobody.
 
I totally agree with you on that one.

> ..which is precisely what I said and why I'm using it. :)
> That does not mean problems identified post-3.14 cannot or should not
> be patched if they are being addressed in other trees and apply easily.

Ok, I'm following you now. 
So, I don't really have time to deal with potential recoveries and
restores if anything goes wrong in the next 3 weeks, and since 3.14
works well enough for me (the hangs are rare enough that I can deal with
them), I'm unfortunately going to skip on your offer.
I do appreciate the work you did and the rationale behind it very much
actually, I'm just not the right "customer" for this at the moment.

Hopefully this new thread will bring up your work again for others to
see and hopefully it will be useful to some.

Thanks
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-10 16:25                   ` Chris Murphy
@ 2014-08-11 21:33                     ` Jose Ildefonso Camargo Tolosa
  2014-08-12  4:15                       ` Duncan
  0 siblings, 1 reply; 26+ messages in thread
From: Jose Ildefonso Camargo Tolosa @ 2014-08-11 21:33 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Btrfs BTRFS

As I hate when a thread is left "hanging", you deserve to know what
happened in the end, you likely already guessed, but anyway: I nuked
the filesystem, and started over.

After some internal discussion in the company, we decided to move to
ZFS for now.  However, we will keep an eye on btrfs, and will likely
deploy it to some smaller system for further testing.

Thank you all for your help!

Sincerely,

Ildefonso


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-11 21:33                     ` Jose Ildefonso Camargo Tolosa
@ 2014-08-12  4:15                       ` Duncan
  2014-08-12 14:24                         ` Marc MERLIN
  0 siblings, 1 reply; 26+ messages in thread
From: Duncan @ 2014-08-12  4:15 UTC (permalink / raw)
  To: linux-btrfs

Jose Ildefonso Camargo Tolosa posted on Mon, 11 Aug 2014 16:33:36 -0500 as
excerpted:

> As I hate when a thread is left "hanging", you deserve to know what
> happened in the end, you likely already guessed, but anyway: I nuked the
> filesystem, and started over.
> 
> After some internal discussion in the company, we decided to move to ZFS
> for now.  However, we will keep an eye on btrfs, and will likely deploy
> it to some smaller system for further testing.
> 
> Thank you all for your help!

Thank you too. =:^)

Sounds like a sane decision for the time being.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-12  4:15                       ` Duncan
@ 2014-08-12 14:24                         ` Marc MERLIN
  2014-08-13  2:02                           ` Jose Ildefonso Camargo Tolosa
  0 siblings, 1 reply; 26+ messages in thread
From: Marc MERLIN @ 2014-08-12 14:24 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

On Tue, Aug 12, 2014 at 04:15:26AM +0000, Duncan wrote:
> Jose Ildefonso Camargo Tolosa posted on Mon, 11 Aug 2014 16:33:36 -0500 as
> excerpted:
> 
> > As I hate when a thread is left "hanging", you deserve to know what
> > happened in the end, you likely already guessed, but anyway: I nuked the
> > filesystem, and started over.
> > 
> > After some internal discussion in the company, we decided to move to ZFS
> > for now.  However, we will keep an eye on btrfs, and will likely deploy
> > it to some smaller system for further testing.
> > 
> > Thank you all for your help!
> 
> Thank you too. =:^)
> 
> Sounds like a sane decision for the time being.

Just remember that ZFS is fine for internal use, but that you can never
ever ship any product based on ZFS due to its licensing.

Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                         | PGP 1024R/763BE901


* Re: 40TB volume taking over 16 hours to mount, any ideas?
  2014-08-12 14:24                         ` Marc MERLIN
@ 2014-08-13  2:02                           ` Jose Ildefonso Camargo Tolosa
  0 siblings, 0 replies; 26+ messages in thread
From: Jose Ildefonso Camargo Tolosa @ 2014-08-13  2:02 UTC (permalink / raw)
  To: Marc MERLIN; +Cc: Duncan, Btrfs BTRFS

On Tue, Aug 12, 2014 at 9:24 AM, Marc MERLIN <marc@merlins.org> wrote:
> On Tue, Aug 12, 2014 at 04:15:26AM +0000, Duncan wrote:
>> Jose Ildefonso Camargo Tolosa posted on Mon, 11 Aug 2014 16:33:36 -0500 as
>> excerpted:
>>
>> > As I hate when a thread is left "hanging", you deserve to know what
>> > happened in the end, you likely already guessed, but anyway: I nuked the
>> > filesystem, and started over.
>> >
>> > After some internal discussion in the company, we decided to move to ZFS
>> > for now.  However, we will keep an eye on btrfs, and will likely deploy
>> > it to some smaller system for further testing.
>> >
>> > Thank you all for your help!
>>
>> Thank you too. =:^)
>>
>> Sounds like a sane decision for the time being.
>
> Just remember that ZFS is fine for internal use, but that you can never
> ever ship any product based on ZFS due to its licensing.
>

Yeah, that's one of the reasons why we won't lose sight of btrfs, and
will try to keep at least one system with it in order to help testing
and stabilizing it.

And that's true for ZFS on Linux, but not for ZFS with FreeBSD (think:
FreeNAS).  However, I do not want to use FreeBSD for now.

Ildefonso


Thread overview: 26+ messages
2014-08-08 21:35 40TB volume taking over 16 hours to mount, any ideas? Jose Ildefonso Camargo Tolosa
2014-08-09  3:38 ` Russell Coker
2014-08-09 14:32   ` Andy Smith
2014-08-09 14:58     ` Jose Ildefonso Camargo Tolosa
2014-08-09 16:06       ` Jose Ildefonso Camargo Tolosa
2014-08-09 17:01         ` Duncan
2014-08-09 18:21           ` Marc MERLIN
2014-08-10  4:03             ` Duncan
2014-08-10 12:43             ` Holger Hoffstätte
2014-08-10 14:39               ` Fixing the btrfs deadlocks Marc MERLIN
2014-08-10 15:42                 ` Holger Hoffstätte
2014-08-10 16:36                   ` Marc MERLIN
2014-08-09 18:38           ` 40TB volume taking over 16 hours to mount, any ideas? Jose Ildefonso Camargo Tolosa
2014-08-09 21:02             ` Jose Ildefonso Camargo Tolosa
2014-08-10  3:58               ` Jose Ildefonso Camargo Tolosa
2014-08-10  8:24                 ` Duncan
2014-08-10  8:50                   ` Timofey Titovets
2014-08-10 10:16                     ` Duncan
2014-08-10 16:25                   ` Chris Murphy
2014-08-11 21:33                     ` Jose Ildefonso Camargo Tolosa
2014-08-12  4:15                       ` Duncan
2014-08-12 14:24                         ` Marc MERLIN
2014-08-13  2:02                           ` Jose Ildefonso Camargo Tolosa
2014-08-10  4:21             ` Duncan
2014-08-10  4:57               ` Mitch Harder
2014-08-10  7:21                 ` Duncan
