* raid6, disks of different sizes, ENOSPC errors despite having plenty of space
@ 2014-04-23 21:04 Sergey Ivanyuk
  2014-04-23 22:53 ` Hugo Mills
  2014-04-24 11:33 ` Duncan
  0 siblings, 2 replies; 4+ messages in thread
From: Sergey Ivanyuk @ 2014-04-23 21:04 UTC (permalink / raw)
  To: linux-btrfs

Hi,

I have a filesystem that I've converted to raid6 from raid1, on 4 drives (I
have another copy of the data):

        Total devices 4 FS bytes used 924.64GiB
        devid    1 size 1.82TiB used 474.00GiB path /dev/sdd
        devid    2 size 465.76GiB used 465.76GiB path /dev/sda
        devid    3 size 465.76GiB used 465.76GiB path /dev/sdb
        devid    4 size 465.76GiB used 465.73GiB path /dev/sdc

Data, RAID6: total=924.00GiB, used=923.42GiB
System, RAID1: total=32.00MiB, used=208.00KiB
Metadata, RAID1: total=1.70GiB, used=1.28GiB
Metadata, DUP: total=384.00MiB, used=252.13MiB
unknown, single: total=512.00MiB, used=0.00


Recent btrfs-progs built from source, kernel 3.15.0-rc2 on armv7l. Despite
having plenty of space left on the larger drive, attempting to copy more
data onto the filesystem results in a kworker process pegged at 100% CPU
for a very long time (10s of minutes), at which point the writes proceed
for some time, and the process repeats until the eventual "No space left on
device" error. Balancing fails with the same error, even if attempting to
convert back to raid1.

I realize that this likely has something to do with the disparity between
device sizes, and per the wiki a fixed-width stripe may help, though I'm
not sure if it's possible to change the stripe width in my situation, since
I can't rebalance. Is there anything I can do to get this filesystem back
to writable state?

Also, here's a stack trace for the stuck kworker process, which appears to
be a bug since it does this for a very long time:

Exception stack(0xab4699c8 to 0xab469a10)
99c0:                   aec7c870 00000000 00000000 aec7c841 08000000 aec7c870
99e0: ab469ad0 bd51e880 00003000 00000000 0006c000 00000000 00000005 ab469a10
9a00: 80299c8c 80310098 200e0013 ffffffff
[<80011e80>] (__irq_svc) from [<80310098>] (rb_next+0x14/0x5c)
[<80310098>] (rb_next) from [<80299c8c>] (btrfs_find_space_for_alloc+0x138/0x344)
[<80299c8c>] (btrfs_find_space_for_alloc) from [<80240020>] (find_free_extent+0x378/0xabc)
[<80240020>] (find_free_extent) from [<80240840>] (btrfs_reserve_extent+0xdc/0x164)
[<80240840>] (btrfs_reserve_extent) from [<8025aef4>] (cow_file_range+0x17c/0x5bc)
[<8025aef4>] (cow_file_range) from [<8025c1e0>] (run_delalloc_range+0x34c/0x380)
[<8025c1e0>] (run_delalloc_range) from [<80274d6c>] (__extent_writepage+0x708/0x940)
[<80274d6c>] (__extent_writepage) from [<802754b4>] (extent_writepages+0x238/0x368)
[<802754b4>] (extent_writepages) from [<8009b190>] (do_writepages+0x24/0x38)
[<8009b190>] (do_writepages) from [<800ef59c>] (__writeback_single_inode+0x28/0x110)
[<800ef59c>] (__writeback_single_inode) from [<800f04c8>] (writeback_sb_inodes+0x184/0x38c)
[<800f04c8>] (writeback_sb_inodes) from [<800f0740>] (__writeback_inodes_wb+0x70/0xac)
[<800f0740>] (__writeback_inodes_wb) from [<800f0978>] (wb_writeback+0x1fc/0x20c)
[<800f0978>] (wb_writeback) from [<800f0b78>] (bdi_writeback_workfn+0x144/0x338)
[<800f0b78>] (bdi_writeback_workfn) from [<80037cfc>] (process_one_work+0x110/0x368)
[<80037cfc>] (process_one_work) from [<800383c8>] (worker_thread+0x138/0x3e8)
[<800383c8>] (worker_thread) from [<8003de90>] (kthread+0xcc/0xe8)
[<8003de90>] (kthread) from [<8000e238>] (ret_from_fork+0x14/0x3c)

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: raid6, disks of different sizes, ENOSPC errors despite having plenty of space
  2014-04-23 21:04 raid6, disks of different sizes, ENOSPC errors despite having plenty of space Sergey Ivanyuk
@ 2014-04-23 22:53 ` Hugo Mills
  2014-04-24 17:16   ` Sergey Ivanyuk
  2014-04-24 11:33 ` Duncan
  1 sibling, 1 reply; 4+ messages in thread
From: Hugo Mills @ 2014-04-23 22:53 UTC (permalink / raw)
  To: Sergey Ivanyuk; +Cc: linux-btrfs


On Wed, Apr 23, 2014 at 05:04:10PM -0400, Sergey Ivanyuk wrote:
> Hi,
> 
> I have a filesystem that I've converted to raid6 from raid1, on 4 drives (I
> have another copy of the data):
> 
>         Total devices 4 FS bytes used 924.64GiB
>         devid    1 size 1.82TiB used 474.00GiB path /dev/sdd
>         devid    2 size 465.76GiB used 465.76GiB path /dev/sda
>         devid    3 size 465.76GiB used 465.76GiB path /dev/sdb
>         devid    4 size 465.76GiB used 465.73GiB path /dev/sdc
> 
> Data, RAID6: total=924.00GiB, used=923.42GiB
> System, RAID1: total=32.00MiB, used=208.00KiB
> Metadata, RAID1: total=1.70GiB, used=1.28GiB
> Metadata, DUP: total=384.00MiB, used=252.13MiB
> unknown, single: total=512.00MiB, used=0.00
> 
> 
> Recent btrfs-progs built from source, kernel 3.15.0-rc2 on armv7l. Despite
> having plenty of space left on the larger drive, attempting to copy more
> data onto the filesystem results in a kworker process pegged at 100% CPU
> for a very long time (10s of minutes), at which point the writes proceed
> for some time, and the process repeats until the eventual "No space left on
> device" error. Balancing fails with the same error, even if attempting to
> convert back to raid1.
> 
> I realize that this likely has something to do with the disparity between
> device sizes, and per the wiki a fixed-width stripe may help, though I'm
> not sure if it's possible to change the stripe width in my situation, since
> I can't rebalance. Is there anything I can do to get this filesystem back
> to writable state?

   With those device sizes, yes, you're going to have limits on the
available data you can store -- with RAID-6, it'll be 465.76*(4-2) =
931.52 GiB (less metadata space), so your conclusion above is indeed
correct.

   We don't have the fixed-width stripe feature implemented yet, which
probably explains why you can't use it. :) You can play with an
approximation of the consequences, once the feature is there, at
http://carfax.org.uk/btrfs-usage/ . Without that feature, though,
there's not much you can do to improve the situation. What might help
in converting back to RAID-1 is adding a small device to the FS
temporarily before doing the conversion, and then removing it again
afterwards.
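   For a rough feel of where the 931.52 GiB figure comes from, here's a
toy allocator (my own Python sketch, not btrfs code) that mimics the
greedy chunk allocation the calculator above models: each raid6 chunk
is striped across every device that still has room, two stripes per
chunk being parity, and allocation stops once fewer than four devices
have free space. Sizes are rounded down to whole GiB, so it reports
930 rather than the exact 931.52:

```python
def raid6_usable_gib(device_sizes_gib, chunk_gib=1):
    """Approximate usable raid6 data space by greedily allocating
    fixed-size chunks across all devices with free space left."""
    free = list(device_sizes_gib)
    data = 0
    while True:
        # devices that can still hold one more stripe
        elig = [i for i, f in enumerate(free) if f >= chunk_gib]
        if len(elig) < 4:              # raid6 needs at least 4 devices per chunk
            break
        for i in elig:                 # one stripe on each eligible device
            free[i] -= chunk_gib
        data += (len(elig) - 2) * chunk_gib   # two stripes are parity
    return data

# One 1.82 TiB drive plus three 465.76 GiB drives, rounded to GiB:
print(raid6_usable_gib([1863, 465, 465, 465]))   # → 930
```

Once the three small drives fill up, the remaining ~1.4 TiB on the big
drive is unusable for raid6, which is exactly the wall you hit.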

> Also, here's a stack trace for the stuck kworker process, which appears to
> be a bug since it does this for a very long time:

   This is probably something different.

   Hugo.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
     --- Computer Science is not about computers,  any more than ---     
                     astronomy is about telescopes.                      



* Re: raid6, disks of different sizes, ENOSPC errors despite having plenty of space
  2014-04-23 21:04 raid6, disks of different sizes, ENOSPC errors despite having plenty of space Sergey Ivanyuk
  2014-04-23 22:53 ` Hugo Mills
@ 2014-04-24 11:33 ` Duncan
  1 sibling, 0 replies; 4+ messages in thread
From: Duncan @ 2014-04-24 11:33 UTC (permalink / raw)
  To: linux-btrfs

Sergey Ivanyuk posted on Wed, 23 Apr 2014 17:04:10 -0400 as excerpted:

> I have a filesystem that I've converted to raid6 from raid1, on 4 drives
> (I have another copy of the data):
> 
> Total devices 4 FS bytes used 924.64GiB
> devid    1 size 1.82TiB used 474.00GiB path /dev/sdd
> devid    2 size 465.76GiB used 465.76GiB path /dev/sda
> devid    3 size 465.76GiB used 465.76GiB path /dev/sdb
> devid    4 size 465.76GiB used 465.73GiB path /dev/sdc

That tells your story right there.  Btrfs raid6 mode requires a minimum 
of four devices with unallocated space in order to allocate new chunks.  
You have four devices, but three of them are full (465.76 of 465.76 GiB 
allocated, no further space left to allocate), so as soon as you run 
out of already-allocated data or metadata space, nothing more can be 
written: allocating another raid6 chunk would require room on all four 
devices.

And here's your current allocation:

> Data, RAID6: total=924.00GiB, used=923.42GiB
> System, RAID1: total=32.00MiB, used=208.00KiB
> Metadata, RAID1: total=1.70GiB, used=1.28GiB
> Metadata, DUP: total=384.00MiB, used=252.13MiB
> unknown, single: total=512.00MiB, used=0.00

OK, only data is raid6, and that data is essentially full too (close 
enough that you might squeeze in a few small files).

Metadata is raid1 (which requires free space on two devices to allocate 
more) plus some dup, presumably left over from the original 
single-device setup, since dup is not normally possible on a 
multi-device filesystem.  The raid1 metadata has nearly half a gig 
free, but btrfs reserves about 200 MiB of that as unusable, leaving you 
roughly 200 MiB of metadata headroom.

So metadata isn't full ATM, but it's getting close.

Meanwhile, last I knew, btrfs raid5/6 support wasn't yet complete in 
any case.  Normal runtime works fine, but recovery from a lost device 
isn't fully implemented yet.  So in terms of recovery, raid6 is 
currently just an inefficient raid0: two devices' worth of capacity go 
to parity, and the parity is actually being written, but it can't yet 
be used to restore anything.  Once recovery is implemented you'll get a 
"free" upgrade to real raid6 reliability; until then you effectively 
have raid0 reliability.

That isn't a good position to be in, since with raid0 the loss of a 
single device effectively kills the entire filesystem.  So unless 
you're purely playing around and don't care about possible data loss, 
or about the speed and capacity you give up compared to normal raid0, 
raid6 isn't a good idea at this point anyway.

As Hugo says, adding another device temporarily may give you enough room 
to balance-convert the raid6 back to raid1, or to raid0 or single or 
whatever.  That's what I'd do for now.

Meanwhile, it's worth noting that with the unevenly sized devices you 
have, single mode is the only way you'll get full usage.  Raid1 mode 
shouldn't be too far off, though: you'd be able to store about 
465*3=1395 GiB, limited by the three small devices, since the large 
device is bigger than the other three put together -- leaving a bit 
over 400 GiB of it unusable.  Effectively, btrfs raid1 does 
pair-mirroring, and you'd be putting one copy of each pair on the 
larger device with the other copy alternating between the other three 
devices.  Add one more ~465 GiB device and you'd basically fill them 
all.
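The pair-mirroring behavior described above can be sketched the same 
way (a toy Python approximation of mine, not btrfs code): each raid1 
chunk is placed on the two devices with the most free space, so the 
big drive ends up holding one copy of nearly everything while the 
three small drives alternate holding the other:

```python
def raid1_usable_gib(device_sizes_gib, chunk_gib=1):
    """Approximate usable raid1 data space: each chunk is mirrored
    on the two devices with the most free space."""
    free = list(device_sizes_gib)
    data = 0
    while True:
        elig = [i for i, f in enumerate(free) if f >= chunk_gib]
        if len(elig) < 2:              # raid1 needs two devices per chunk
            break
        # put the two copies on the devices with the most free space
        elig.sort(key=lambda i: free[i], reverse=True)
        for i in elig[:2]:
            free[i] -= chunk_gib
        data += chunk_gib              # only one copy counts as data
    return data

# One 1.82 TiB drive plus three ~465 GiB drives, rounded to GiB:
print(raid1_usable_gib([1863, 465, 465, 465]))   # → 1395
```

Allocation stops when the three small drives are exhausted, matching 
the 465*3=1395 GiB figure: at that point only the big drive has free 
space, and raid1 can't place both copies on one device.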

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: raid6, disks of different sizes, ENOSPC errors despite having plenty of space
  2014-04-23 22:53 ` Hugo Mills
@ 2014-04-24 17:16   ` Sergey Ivanyuk
  0 siblings, 0 replies; 4+ messages in thread
From: Sergey Ivanyuk @ 2014-04-24 17:16 UTC (permalink / raw)
  To: Hugo Mills, Sergey Ivanyuk, linux-btrfs

Thanks, this makes sense. I freed up some space and the re-balance
back to raid1 is now running (I had to run 'btrfs balance -dusage=5'
before some free space actually became available).

Filed the other issue as https://bugzilla.kernel.org/show_bug.cgi?id=74761 .

On Wed, Apr 23, 2014 at 6:53 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> On Wed, Apr 23, 2014 at 05:04:10PM -0400, Sergey Ivanyuk wrote:
>> Hi,
>>
>> I have a filesystem that I've converted to raid6 from raid1, on 4 drives (I
>> have another copy of the data):
>>
>>         Total devices 4 FS bytes used 924.64GiB
>>         devid    1 size 1.82TiB used 474.00GiB path /dev/sdd
>>         devid    2 size 465.76GiB used 465.76GiB path /dev/sda
>>         devid    3 size 465.76GiB used 465.76GiB path /dev/sdb
>>         devid    4 size 465.76GiB used 465.73GiB path /dev/sdc
>>
>> Data, RAID6: total=924.00GiB, used=923.42GiB
>> System, RAID1: total=32.00MiB, used=208.00KiB
>> Metadata, RAID1: total=1.70GiB, used=1.28GiB
>> Metadata, DUP: total=384.00MiB, used=252.13MiB
>> unknown, single: total=512.00MiB, used=0.00
>>
>>
>> Recent btrfs-progs built from source, kernel 3.15.0-rc2 on armv7l. Despite
>> having plenty of space left on the larger drive, attempting to copy more
>> data onto the filesystem results in a kworker process pegged at 100% CPU
>> for a very long time (10s of minutes), at which point the writes proceed
>> for some time, and the process repeats until the eventual "No space left on
>> device" error. Balancing fails with the same error, even if attempting to
>> convert back to raid1.
>>
>> I realize that this likely has something to do with the disparity between
>> device sizes, and per the wiki a fixed-width stripe may help, though I'm
>> not sure if it's possible to change the stripe width in my situation, since
>> I can't rebalance. Is there anything I can do to get this filesystem back
>> to writable state?
>
>    With those device sizes, yes, you're going to have limits on the
> available data you can store -- with RAID-6, it'll be 465.76*(4-2) =
> 931.52 GiB (less metadata space), so your conclusion above is indeed
> correct.
>
>    We don't have the fixed-width stripe feature implemented yet, which
> probably explains why you can't use it. :) You can play with an
> approximation of the consequences, once the feature is there, at
> http://carfax.org.uk/btrfs-usage/ . Without that feature, though,
> there's not much you can do to improve the situation. What might help
> in converting back to RAID-1 is adding a small device to the FS
> temporarily before doing the conversion, and then removing it again
> afterwards.
>
>> Also, here's a stack trace for the stuck kworker process, which appears to
>> be a bug since it does this for a very long time:
>
>    This is probably something different.
>
>    Hugo.
>
> --
> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
>   PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
>      --- Computer Science is not about computers,  any more than ---
>                      astronomy is about telescopes.

