* btrfs convert running out of space
@ 2015-01-19 23:45 Gareth Pye
  2015-01-20  0:13 ` Gareth Pye
  0 siblings, 1 reply; 21+ messages in thread
From: Gareth Pye @ 2015-01-19 23:45 UTC (permalink / raw)
  To: linux-btrfs

Hi,

I'm attempting to convert a btrfs filesystem from raid10 to raid1.
Things had been going well through a couple of pauses and resumes, but
last night it errored with:
ERROR: error during balancing '/data' - No space left on device

Which is strange because there is around 1.4T spare on the drives.
df:
/dev/drbd0      5.5T  4.6T  1.4T  77% /data

btrfs fi df:
Data, RAID10: total=1.34TiB, used=1.34TiB
Data, RAID1: total=3.80TiB, used=3.20TiB
System, RAID1: total=32.00MiB, used=720.00KiB
Metadata, RAID1: total=13.00GiB, used=9.70GiB
GlobalReserve, single: total=512.00MiB, used=204.00KiB

btrfs fi show:
Label: none  uuid: b2986e1a-0891-4779-960c-e01f7534c6eb
        Total devices 6 FS bytes used 4.55TiB
        devid    1 size 1.81TiB used 1.72TiB path /dev/drbd0
        devid    2 size 1.81TiB used 1.72TiB path /dev/drbd1
        devid    3 size 1.81TiB used 1.72TiB path /dev/drbd2
        devid    4 size 1.81TiB used 1.72TiB path /dev/drbd3
        devid    5 size 1.81TiB used 1.72TiB path /dev/drbd4
        devid    6 size 1.81TiB used 1.72TiB path /dev/drbd5

The above numbers are from after a quick bit of testing. When the
error occurred, the RAID1 total was much larger and the device used
totals were 1.81TiB. So I ran a balance with -dusage=2 and all the
numbers went back to where I expected them: a RAID1 total of 3.21TiB
and appropriate device usage numbers. With the system looking healthy
again I checked my btrfs tools version (3.12), updated it to current
git (3.18.1, matching my kernel version), and tried the convert to
raid1 again (this time with -dsoft), but it quickly got back to the
~600G of empty allocation above, at which point I canceled it.
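
For what it's worth, summing the devid lines from "btrfs fi show"
makes the squeeze visible: almost all raw space was allocated when the
error hit. A rough awk sketch (assumes the TiB-formatted output
above):

```shell
# Sum per-device size/used from 'btrfs fi show' output.
# awk's numeric coercion drops the trailing "TiB" when adding.
dev_totals() {
  awk '/devid/ { size += $4; used += $6 }
       END { printf "size=%.2fTiB used=%.2fTiB unallocated=%.2fTiB\n",
             size, used, size - used }'
}
# usage: btrfs fi show /data | dev_totals
```

With the numbers above that gives roughly 0.54TiB unallocated across
six devices, i.e. under 100GiB per device, which is presumably why a
balance that temporarily needs fresh chunks can hit ENOSPC even though
df reports 1.4T free.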

# uname -a
Linux emile 3.18.1-031801-generic #201412170637 SMP Wed Dec 17
11:38:50 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
# btrfs --version
Btrfs v3.12

dmesg doesn't tell me much; this is the end of it:
[1295952.558506] BTRFS info (device drbd5): relocating block group
13734193922048 flags 65
[1295971.813271] BTRFS info (device drbd5): relocating block group
13716980498432 flags 65
[1295976.492826] BTRFS info (device drbd5): relocating block group
13713759272960 flags 65
[1295976.921302] BTRFS info (device drbd5): relocating block group
13710538047488 flags 65
[1295977.593500] BTRFS info (device drbd5): relocating block group
13707316822016 flags 65
[1295988.490751] BTRFS info (device drbd5): relocating block group
13704095596544 flags 65
[1295999.193131] BTRFS info (device drbd5): relocating block group
13613800620032 flags 65
[1296003.036323] BTRFS info (device drbd5): relocating block group
13578367139840 flags 65
[1296009.333859] BTRFS info (device drbd5): relocating block group
13539712434176 flags 65
[1296041.246938] BTRFS info (device drbd5): relocating block group
13513942630400 flags 65
[1296056.891600] BTRFS info (device drbd5): relocating block group
13488172826624 flags 65
[1296071.386463] BTRFS info (device drbd5): relocating block group
13472066699264 flags 65
[1296074.577288] BTRFS info (device drbd5): relocating block group
13468845473792 flags 65
[1296105.783088] BTRFS info (device drbd5): relocating block group
13465624248320 flags 65
[1296114.910226] BTRFS info (device drbd5): relocating block group
13462403022848 flags 65
[1296115.398699] BTRFS info (device drbd5): relocating block group
13459181797376 flags 65
[1296115.798719] BTRFS info (device drbd5): relocating block group
13455960571904 flags 65
[1296123.664726] BTRFS info (device drbd5): relocating block group
13452739346432 flags 65
[1296124.262510] BTRFS info (device drbd5): relocating block group
13449518120960 flags 65
[1296124.787219] BTRFS info (device drbd5): relocating block group
13446296895488 flags 65
[1296125.290209] BTRFS info (device drbd5): relocating block group
13443075670016 flags 65
[1296125.820547] BTRFS info (device drbd5): relocating block group
13439854444544 flags 65
[1296126.306939] BTRFS info (device drbd5): relocating block group
13436633219072 flags 65
[1296126.831993] BTRFS info (device drbd5): relocating block group
13433411993600 flags 65
[1296127.331577] BTRFS info (device drbd5): relocating block group
13430190768128 flags 65
[1296127.914643] BTRFS info (device drbd5): relocating block group
13426969542656 flags 65
[1296128.462360] BTRFS info (device drbd5): relocating block group
13423748317184 flags 65
[1296129.290787] BTRFS info (device drbd5): relocating block group
13420527091712 flags 65
[1296129.927392] BTRFS info (device drbd5): relocating block group
13417305866240 flags 65
[1296136.878558] BTRFS info (device drbd5): relocating block group
13414084640768 flags 65
[1296144.932052] BTRFS info (device drbd5): relocating block group
13282014396416 flags 65
[1296153.141743] BTRFS info (device drbd5): relocating block group
13222925041664 flags 65
[1296153.830643] BTRFS info (device drbd5): relocating block group
13219703816192 flags 65
[1296154.519497] BTRFS info (device drbd5): relocating block group
13216482590720 flags 65
[1296163.842627] BTRFS info (device drbd5): 203 enospc errors during balance

Before that there are just lots of broadly similar "relocating block
group" messages.

Any ideas on what is going on here?

-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"


* Re: btrfs convert running out of space
  2015-01-19 23:45 btrfs convert running out of space Gareth Pye
@ 2015-01-20  0:13 ` Gareth Pye
  2015-01-20  5:39   ` Lakshmi_Narayanan_Du
  2015-01-20  7:38   ` Chris Murphy
  0 siblings, 2 replies; 21+ messages in thread
From: Gareth Pye @ 2015-01-20  0:13 UTC (permalink / raw)
  To: linux-btrfs

I just tried a slightly different tack: after doing another
-dusage=2 pass I did the following:

# btrfs balance start -v -dconvert=raid1 -dsoft -dusage=96 /data
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x302): converting, target=16, soft is on, usage=96
Done, had to relocate 0 out of 3763 chunks
# btrfs balance start -v -dconvert=raid1 -dsoft -dusage=99 /data
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x302): converting, target=16, soft is on, usage=99
ERROR: error during balancing '/data' - No space left on device
There may be more info in syslog - try dmesg | tail
# dmesg | tail
[1301598.556845] BTRFS info (device drbd5): relocating block group
19366003343360 flags 17
[1301601.300990] BTRFS info (device drbd5): relocating block group
19364929601536 flags 17
[1301606.043675] BTRFS info (device drbd5): relocating block group
19363855859712 flags 17
[1301609.564754] BTRFS info (device drbd5): relocating block group
19362782117888 flags 17
[1301612.453678] BTRFS info (device drbd5): relocating block group
19361708376064 flags 17
[1301616.911777] BTRFS info (device drbd5): relocating block group
19360634634240 flags 17
[1301901.823345] BTRFS info (device drbd5): relocating block group
15298300215296 flags 65
[1301904.206732] BTRFS info (device drbd5): relocating block group
15285415313408 flags 65
[1301904.675298] BTRFS info (device drbd5): relocating block group
14946985312256 flags 65
[1301954.658780] BTRFS info (device drbd5): 3 enospc errors during balance
# btrfs balance start -v -dusage=2 /data
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x2): balancing, usage=2
Done, had to relocate 9 out of 3772 chunks

That looks to me like converting 3 chunks wrote 9 chunks with less
than 2% usage (plus, presumably, 3 mostly full ones). That sounds
like a bug.
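
That allocated-but-empty space is exactly the gap between total= and
used= in "btrfs fi df". A throwaway sketch for watching that gap
during a balance (assumes the TiB-formatted output quoted earlier;
awk's numeric coercion drops the trailing "TiB"):

```shell
# Print total-used ("slack") for each Data line of 'btrfs fi df' output.
data_slack() {
  awk -F'[=,]' '/^Data/ { printf "%s: slack=%.2fTiB\n", $1, $3 - $5 }'
}
# usage: btrfs fi df /data | data_slack
```

Against the fi df output from the first message this prints 0.00TiB of
slack for the RAID10 line and 0.60TiB for the RAID1 line — the ~600G
of empty allocation mentioned there.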



-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"


* RE: btrfs convert running out of space
  2015-01-20  0:13 ` Gareth Pye
@ 2015-01-20  5:39   ` Lakshmi_Narayanan_Du
  2015-01-20  7:38   ` Chris Murphy
  1 sibling, 0 replies; 21+ messages in thread
From: Lakshmi_Narayanan_Du @ 2015-01-20  5:39 UTC (permalink / raw)
  To: gareth, linux-btrfs; +Cc: Lakshmi_Narayanan_Du


Hi,

I'm a newbie to filesystems and have just started learning the
concepts of btrfs, so correct me if I'm wrong. I tried to reproduce
the issue but could not hit it. Please see the snippet of the logs
below and let me know whether I'm missing any context or steps needed
to trigger the bug:

btrfs balance start -v -dconvert=raid1 -dsoft -dusage=96 /mnt/
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x302): converting, target=16, soft is on, usage=96
Done, had to relocate 0 out of 21 chunks

btrfs balance start -v -dconvert=raid1 -dsoft -dusage=99 /mnt/
Dumping filters: flags 0x1, state 0x0, force is off
  DATA (flags 0x302): converting, target=16, soft is on, usage=99
Done, had to relocate 0 out of 21 chunks

dmesg | tail
[689272.797639] BTRFS info (device sdb2): relocating block group 28349300736 flags 1
[689274.250880] BTRFS info (device sdb2): found 54 extents
[689276.095172] BTRFS info (device sdb2): found 54 extents
uname -a

Linux linux-dmhj 3.18.0-4-default #1 SMP Thu Dec 18 19:38:06 EST 2014 x86_64 x86_64 x86_64 GNU/Linux

Thanks,
Lakshmi



* Re: btrfs convert running out of space
  2015-01-20  0:13 ` Gareth Pye
  2015-01-20  5:39   ` Lakshmi_Narayanan_Du
@ 2015-01-20  7:38   ` Chris Murphy
  2015-01-20 21:25     ` Gareth Pye
  1 sibling, 1 reply; 21+ messages in thread
From: Chris Murphy @ 2015-01-20  7:38 UTC (permalink / raw)
  Cc: linux-btrfs

On Mon, Jan 19, 2015 at 5:13 PM, Gareth Pye <gareth@cerberos.id.au> wrote:
> I just tried from a slightly different tack, after doing another
> -dusage=2 pass I did the following:
> # btrfs balance start -v -dconvert=raid1 -dsoft -dusage=96 /data
> Dumping filters: flags 0x1, state 0x0, force is off
>   DATA (flags 0x302): converting, target=16, soft is on, usage=96
> Done, had to relocate 0 out of 3763 chunks
> # btrfs balance start -v -dconvert=raid1 -dsoft -dusage=99 /data
> Dumping filters: flags 0x1, state 0x0, force is off
>   DATA (flags 0x302): converting, target=16, soft is on, usage=99
> ERROR: error during balancing '/data' - No space left on device


I guess I don't really understand the purpose of combining dconvert
and dusage. Sure, it should either work or give an error, so there may
still be a bug here. But I'm just not following why they'd be used
together; it seems superfluous. What happens when they're tried
separately? I'm not sure which one is instigating the problem.

Also, I think you should upgrade to btrfs-progs 3.18, seeing as 3.12
is old, and if there's a bug in that version of progs against your
kernel, it's not going to get fixed.



-- 
Chris Murphy


* Re: btrfs convert running out of space
  2015-01-20  7:38   ` Chris Murphy
@ 2015-01-20 21:25     ` Gareth Pye
  2015-01-20 21:41       ` Chris Murphy
  0 siblings, 1 reply; 21+ messages in thread
From: Gareth Pye @ 2015-01-20 21:25 UTC (permalink / raw)
  To: Chris Murphy; +Cc: linux-btrfs

Yeah, I have updated btrfs-progs to 3.18. While it's plausible that
the bug was created by using 3.12, none of the behavior has changed
now that I'm using 3.18.

I was experimenting with -dusage values to try to process the block
groups in a different order and see if that made any difference. It
did let me get through more of the file system before erroring, but
now it errors on the first block group it tries.

Using "btrfs balance start -v -dusage=2 /data" cleans up all the empty
block groups that "btrfs balance start -v
-dconvert=raid1,soft,limit=10 /data" creates. I'm using limit=10 to
speed up testing; I have tried without it, and it just takes longer to
complete while the RAID1 total skyrockets and the RAID1 used doesn't
move.


-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"


* Re: btrfs convert running out of space
  2015-01-20 21:25     ` Gareth Pye
@ 2015-01-20 21:41       ` Chris Murphy
  2015-01-20 21:49         ` Gareth Pye
  2015-01-20 23:33         ` Hugo Mills
  0 siblings, 2 replies; 21+ messages in thread
From: Chris Murphy @ 2015-01-20 21:41 UTC (permalink / raw)
  To: linux-btrfs

On Tue, Jan 20, 2015 at 2:25 PM, Gareth Pye <gareth@cerberos.id.au> wrote:
> Yeah, I have updated btrfs-progs to 3.18. While it is plausible that
> the bug was created by using 3.12, none of the behavior has changed
> now I'm using 3.18.
>
> I was experimenting with -dusage values to try and process the blocks
> in a different order to see if that made any difference. It did let me
> get through more of the file system before erroring but now it errors
> on the first block it tries.
>
> Using "btrfs balance start -v -dusage=2 /data" cleans up all the empty
> block groups that "btrfs balance start -v
> -dconvert=raid1,soft,limit=10 /data" creates. I'm using limit=10 to
> speed up testing, I have tried without it and it just takes longer to
> complete and the whole time the RAID1 total sky rockets while the
> RAID1 used doesn't move.

Sounds like during the conversion, no longer needed raid1 chunks
aren't quickly deallocated so they can be used as raid10 chunks.
There's been some work on this in the 3.19 kernel, it might be worth
testing.

I'm not sure of the significance of the change from flags 17 to flags
65 right before the enospc errors. The spacing between flags 17 chunks
is exactly 1GB, whereas the spacing between the values reported for
flags 65 varies a lot; one gap is 12GB.


-- 
Chris Murphy


* Re: btrfs convert running out of space
  2015-01-20 21:41       ` Chris Murphy
@ 2015-01-20 21:49         ` Gareth Pye
  2015-01-20 22:53           ` Chris Murphy
  2015-01-20 23:33         ` Hugo Mills
  1 sibling, 1 reply; 21+ messages in thread
From: Gareth Pye @ 2015-01-20 21:49 UTC (permalink / raw)
  To: Chris Murphy; +Cc: linux-btrfs

The conversion is going the other way (raid10->raid1), but I expect
the analysis is going to be the same. I'll wait on 3.19 kernel then
(or probably 3.19.1) as this system is slightly more production than
use of btrfs would suggest.

The flags 17 messages are from the non-converting balance that cleans
up the empty blocks; the flags 65 messages are from the RAID10->RAID1
balance.
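
Decoded, the flags line up with that: they're a bitmask of block group
type plus replication profile. A small sketch using the bit values as
I read them in the kernel's btrfs headers (worth verifying against
your kernel source):

```shell
# Decode a btrfs block group "flags" value into type/profile names.
# Bit values follow the btrfs on-disk format headers; verify for your kernel.
decode_flags() {
  f=$1; out=""
  for pair in 1:DATA 2:SYSTEM 4:METADATA 8:RAID0 16:RAID1 32:DUP 64:RAID10; do
    bit=${pair%%:*}; name=${pair#*:}
    if [ $((f & bit)) -ne 0 ]; then out="$out $name"; fi
  done
  echo "$f =$out"
}
decode_flags 17   # the cleanup balance
decode_flags 65   # the RAID10 source chunks
```

This also matches the "target=16" in the balance dumps: 16 is the
RAID1 profile bit.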


-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-20 21:49         ` Gareth Pye
@ 2015-01-20 22:53           ` Chris Murphy
  2015-01-20 23:04             ` Gareth Pye
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Murphy @ 2015-01-20 22:53 UTC (permalink / raw)
  Cc: linux-btrfs

On Tue, Jan 20, 2015 at 2:49 PM, Gareth Pye <gareth@cerberos.id.au> wrote:
> The conversion is going the other way (raid10->raid1), but I expect
> the analysis is going to be the same. I'll wait on 3.19 kernel then
> (or probably 3.19.1) as this system is slightly more production than
> use of btrfs would suggest.
>
> The flags 17 messages are from the non-converting balance to clear up
> the empty blocks, the flags 65 messages are from the RAID10->RAID1
> balance

Makes sense.

Are there any particularly large files? Files bigger than 1GB? Are any
of those nocow (xattr +C)? I'm just throwing spaghetti to see what
might be different with this volume than others which have
successfully converted between raid10 and raid1. The file system was
created with btrfs-progs 3.12? Defaults? (Other than data raid 10.)
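
In case it helps anyone following along, a rough sketch of how to
answer both questions (the /data mount point is just the example from
this thread; run as root if permissions require it):

```shell
# List files larger than 1GiB under a path, biggest first.
find_big_files() {
  find "$1" -xdev -type f -size +1G -printf '%s\t%p\n' 2>/dev/null | sort -rn
}

# Usage on the volume in question:
#   find_big_files /data
# And to check the big files for the nocow attribute ('C' in lsattr output):
#   find /data -xdev -type f -size +1G -exec lsattr -d {} + | awk '$1 ~ /C/'
```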

It looks to be a large file, but it might be worth grabbing a
btrfs-image per the wiki in case the fs behavior changes while doing
anything else to it (even normal operation might free up whatever's
stuck and then the problem isn't reproducible).

Another option, if you have the space, is to create two more drbd
devices of the same size (as each other, they don't have to be the
same size as what's already in the array), and add them to this
volume, and then attempt to complete the conversion (with soft option
as you have been).

Also, those three block values reported with flags 65 can be plugged
into btrfs-debug-tree or btrfs inspect-internal logical-resolve to
find out what file is involved, and maybe there's something about them
that's instigating the problem.
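
For anyone wanting to script that lookup, a sketch (this assumes the
kernel's "relocating block group <addr> flags <N>" message format; the
actual resolve step needs the mounted filesystem and root):

```shell
# Extract the block-group logical addresses from balance messages in a log.
addrs_from_log() {
  awk '/relocating block group/ {
    for (i = 1; i <= NF; i++)
      if ($i == "group") print $(i + 1)   # the address follows the word "group"
  }' "$@"
}

# Then, for each address, on the mounted fs:
#   btrfs inspect-internal logical-resolve -v <addr> /data
```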

I would consider the file system suspect at this point, just being
conservative about it. The reason is that it's in the middle of a
conversion to raid1, so anything newly allocated will be raid1, and
anything old is raid10 and being in this state long term I think is
sufficiently non-deterministic in that who knows what could happen in
three weeks. It might be OK. So if you have the space to create a new
raid1 volume, and btrfs send old to new, it gets you where you
ultimately want to be much sooner. And then you can throw 3.19rc5 at
the problemed fs and see how it behaves.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-20 22:53           ` Chris Murphy
@ 2015-01-20 23:04             ` Gareth Pye
  2015-01-21  4:03               ` Chris Murphy
  0 siblings, 1 reply; 21+ messages in thread
From: Gareth Pye @ 2015-01-20 23:04 UTC (permalink / raw)
  To: Chris Murphy; +Cc: linux-btrfs

Yeah, we don't have that much space spare :(

File system has been going strong from when it was created with early
RAID5 code, then converted to RAID10 with kernel 3.12.

There aren't any nocow files to my knowledge but there are plenty of
files larger than a gig on the file system. The first few results from
logical-resolve have been for files in the 1G~2G range, so that could
be some sticky spaghetti.

On Wed, Jan 21, 2015 at 9:53 AM, Chris Murphy <lists@colorremedies.com> wrote:
> On Tue, Jan 20, 2015 at 2:49 PM, Gareth Pye <gareth@cerberos.id.au> wrote:
>> The conversion is going the other way (raid10->raid1), but I expect
>> the analysis is going to be the same. I'll wait on 3.19 kernel then
>> (or probably 3.19.1) as this system is slightly more production than
>> use of btrfs would suggest.
>>
>> The flags 17 messages are from the non-converting balance to clear up
>> the empty blocks, the flags 65 messages are from the RAID10->RAID1
>> balance
>
> Makes sense.
>
> Are there any particularly large files? Files bigger than 1GB? Are any
> of those nocow (xattr +C)? I'm just throwing spaghetti to see what
> might be different with this volume than others which have
> successfully converted between raid10 and raid1. The file system was
> created with btrfs-progs 3.12? Defaults? (Other than data raid 10.)
>
> It looks to be a large file, but it might be worth grabbing a
> btrfs-image per the wiki in case the fs behavior changes while doing
> anything else to it (even normal operation might free up whatever's
> stuck and then the problem isn't reproducible).
>
> Another option, if you have the space, is to create two more drbd
> devices of the same size (as each other, they don't have to be the
> same size as what's already in the array), and add them to this
> volume, and then attempt to complete the conversion (with soft option
> as you have been).
>
> Also, those three block values reported with flags 65 can be plugged
> into btrfs-debug-tree or btrfs inspect-internal to find out what file
> is involved, and maybe there's something about them that's instigating
> the problem.
>
> I would consider the file system suspect at this point, just being
> conservative about it. The reason is that it's in the middle of a
> conversion to raid1, so anything newly allocated will be raid1, and
> anything old is raid10 and being in this state long term I think is
> sufficiently non-deterministic in that who knows what could happen in
> three weeks. It might be OK. So if you have the space to create a new
> raid1 volume, and btrfs send old to new, it gets you where you
> ultimately want to be much sooner. And then you can throw 3.19rc5 at
> the problemed fs and see how it behaves.
>
> --
> Chris Murphy



-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-20 21:41       ` Chris Murphy
  2015-01-20 21:49         ` Gareth Pye
@ 2015-01-20 23:33         ` Hugo Mills
  1 sibling, 0 replies; 21+ messages in thread
From: Hugo Mills @ 2015-01-20 23:33 UTC (permalink / raw)
  To: Chris Murphy; +Cc: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1799 bytes --]

On Tue, Jan 20, 2015 at 02:41:13PM -0700, Chris Murphy wrote:
> On Tue, Jan 20, 2015 at 2:25 PM, Gareth Pye <gareth@cerberos.id.au> wrote:
> > Yeah, I have updated btrfs-progs to 3.18. While it is plausible that
> > the bug was created by using 3.12, none of the behavior has changed
> > now I'm using 3.18.
> >
> > I was experimenting with -dusage values to try and process the blocks
> > in a different order to see if that made any difference. It did let me
> > get through more of the file system before erroring but now it errors
> > on the first block it tries.
> >
> > Using "btrfs balance start -v -dusage=2 /data" cleans up all the empty
> > block groups that "btrfs balance start -v
> > -dconvert=raid1,soft,limit=10 /data" creates. I'm using limit=10 to
> > speed up testing, I have tried without it and it just takes longer to
> > complete and the whole time the RAID1 total sky rockets while the
> > RAID1 used doesn't move.
> 
> Sounds like during the conversion, no longer needed raid1 chunks
> aren't quickly deallocated so they can be used as raid10 chunks.
> There's been some work on this in the 3.19 kernel, it might be worth
> testing.
> 
> I'm not sure of the significance of the change from flags 17 to flags
> 65 right before the enospc errors. The spacing between flags 17 chunks
> is exactly 1GB whereas the spacing between the values reported for
> flags 65 varies a lot, one is a 12GB gap.

   flags 17 is RAID-1 data. 65 is RAID-10 data. If you're converting,
I think that's the type *before* it gets converted. The values for
block group type are in the ctree.h header (kernel and userspace),
BTRFS_BLOCK_GROUP_*.

   Hugo.

-- 
Hugo Mills             | Putting U back in Honor, Valor, and Trth.
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: 65E74AC0          |

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-20 23:04             ` Gareth Pye
@ 2015-01-21  4:03               ` Chris Murphy
  2015-01-22 21:58                 ` Gareth Pye
  0 siblings, 1 reply; 21+ messages in thread
From: Chris Murphy @ 2015-01-21  4:03 UTC (permalink / raw)
  To: Gareth Pye; +Cc: linux-btrfs

On Tue, Jan 20, 2015 at 4:04 PM, Gareth Pye <gareth@cerberos.id.au> wrote:
> Yeah, we don't have that much space spare :(
>
> File system has been going strong from when it was created with early
> RAID5 code, then converted to RAID10 with kernel 3.12.
>
> There aren't any nocow files to my knowledge but there are plenty of
> files larger than a gig on the file system. The first few results from
> logical-resolve have been for files in the 1G~2G range, so that could
> be some sticky spaghetti.

Are any of those big files in a snapshot? The snapshotting may be
pinning a bunch of large extents, so even if it seems like the volume
has enough space, it might actually be running out of space. All I can
think of is progressively removing the files that are implicated in
the conversion failure. That could mean just deleting older snapshots
that you probably don't need, progressively getting to the point where
you migrate those files off this fs to another one, and then delete
them (all instances in all subvol/snapshots) and just keep trying.

Is a btrfs check happy? Or does it complain about anything?

I've had quite good luck just adding a drive (two drives for raid1/10
volumes) to an existing btrfs volume, they don't have to be drbd, they
can be local block devices, either physical drives or LV's. I've even
done this with flash drives (kinda scary and slow but it worked).

I'd still suggest contingency planning in case this volume becomes
temperamental and you have no choice but to migrate it elsewhere.
Better to do it on your timetable than the filesystem's.

-- 
Chris Murphy

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-21  4:03               ` Chris Murphy
@ 2015-01-22 21:58                 ` Gareth Pye
  2015-01-22 21:58                   ` Gareth Pye
  2015-01-23  4:34                   ` Duncan
  0 siblings, 2 replies; 21+ messages in thread
From: Gareth Pye @ 2015-01-22 21:58 UTC (permalink / raw)
  To: Chris Murphy; +Cc: linux-btrfs

What are the chances that splitting all the large files up into
sub-gig pieces, finishing the convert, then recombining them all will
work?
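
If anyone does try that route, the mechanics are simple enough; a
sketch (chunk size and names are just illustrative):

```shell
# Split a file into sub-gig pieces in place, removing the original.
split_file() {  # split_file <file> <chunk-size, e.g. 512M>
  split -b "$2" -d "$1" "$1.part." && rm "$1"
}

# Recombine the numbered pieces back into the original file.
recombine() {   # recombine <file>
  cat "$1".part.* > "$1" && rm "$1".part.*
}
```

GNU split's -d numeric suffixes keep lexical order equal to numeric
order, so the cat glob reassembles the pieces in the right sequence.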

On Wed, Jan 21, 2015 at 3:03 PM, Chris Murphy <lists@colorremedies.com> wrote:
> On Tue, Jan 20, 2015 at 4:04 PM, Gareth Pye <gareth@cerberos.id.au> wrote:
>> Yeah, we don't have that much space spare :(
>>
>> File system has been going strong from when it was created with early
>> RAID5 code, then converted to RAID10 with kernel 3.12.
>>
>> There aren't any nocow files to my knowledge but there are plenty of
>> files larger than a gig on the file system. The first few results from
>> logical-resolve have been for files in the 1G~2G range, so that could
>> be some sticky spaghetti.
>
> Are any of those big files in a snapshot? The snapshotting may be
> pinning a bunch of large extents, so even if it seems like the volume
> has enough space, it might actually be running out of space. All I can
> think of is progressively removing the files that are implicated in
> the conversion failure. That could mean just deleting older snapshots
> that you probably don't need, progressively getting to the point where
> you migrate those files off this fs to another one, and then delete
> them (all instances in all subvol/snapshots) and just keep trying.
>
> Is a btrfs check happy? Or does it complain about anything?
>
> I've had quite good luck just adding a drive (two drives for raid1/10
> volumes) to an existing btrfs volume, they don't have to be drbd, they
> can be local block devices, either physical drives or LV's. I've even
> done this with flash drives (kinda scary and slow but it worked).
>
> I'd still suggest contingency planning in case this volume becomes
> temperamental and you have no choice but to migrate it elsewhere.
> Better to do it on your timetable than the filesystem's.
>
> --
> Chris Murphy



-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-22 21:58                 ` Gareth Pye
@ 2015-01-22 21:58                   ` Gareth Pye
  2015-01-23  4:34                   ` Duncan
  1 sibling, 0 replies; 21+ messages in thread
From: Gareth Pye @ 2015-01-22 21:58 UTC (permalink / raw)
  To: Chris Murphy; +Cc: linux-btrfs

PS: the only snapshots are of apt-mirror, which doesn't have large files.

On Fri, Jan 23, 2015 at 8:58 AM, Gareth Pye <gareth@cerberos.id.au> wrote:
> What are the chances that splitting all the large files up into sub
> gig pieces, finish convert, then recombine them all will work?
>
> On Wed, Jan 21, 2015 at 3:03 PM, Chris Murphy <lists@colorremedies.com> wrote:
>> On Tue, Jan 20, 2015 at 4:04 PM, Gareth Pye <gareth@cerberos.id.au> wrote:
>>> Yeah, we don't have that much space spare :(
>>>
>>> File system has been going strong from when it was created with early
>>> RAID5 code, then converted to RAID10 with kernel 3.12.
>>>
>>> There aren't any nocow files to my knowledge but there are plenty of
>>> files larger than a gig on the file system. The first few results from
>>> logical-resolve have been for files in the 1G~2G range, so that could
>>> be some sticky spaghetti.
>>
>> Are any of those big files in a snapshot? The snapshotting may be
>> pinning a bunch of large extents, so even if it seems like the volume
>> has enough space, it might actually be running out of space. All I can
>> think of is progressively removing the files that are implicated in
>> the conversion failure. That could mean just deleting older snapshots
>> that you probably don't need, progressively getting to the point where
>> you migrate those files off this fs to another one, and then delete
>> them (all instances in all subvol/snapshots) and just keep trying.
>>
>> Is a btrfs check happy? Or does it complain about anything?
>>
>> I've had quite good luck just adding a drive (two drives for raid1/10
>> volumes) to an existing btrfs volume, they don't have to be drbd, they
>> can be local block devices, either physical drives or LV's. I've even
>> done this with flash drives (kinda scary and slow but it worked).
>>
>> I'd still suggest contingency planning in case this volume becomes
>> temperamental and you have no choice but to migrate it elsewhere.
>> Better to do it on your timetable than the filesystem's.
>>
>> --
>> Chris Murphy
>
>
>
> --
> Gareth Pye
> Level 2 MTG Judge, Melbourne, Australia
> "Dear God, I would like to file a bug report"



-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-22 21:58                 ` Gareth Pye
  2015-01-22 21:58                   ` Gareth Pye
@ 2015-01-23  4:34                   ` Duncan
  2015-01-23  7:54                     ` Marc Joliet
  1 sibling, 1 reply; 21+ messages in thread
From: Duncan @ 2015-01-23  4:34 UTC (permalink / raw)
  To: linux-btrfs

Gareth Pye posted on Fri, 23 Jan 2015 08:58:08 +1100 as excerpted:

> What are the chances that splitting all the large files up into sub gig
> pieces, finish convert, then recombine them all will work?

[Further context removed due to the hassle of trying to sort the
top-posting into proper order to reply in /proper/ context.]

A likely easier alternative would be to temporarily move those files off 
the filesystem in question.

Option 1: Do that (thumb drives work well for this if you're not talking 
/terabytes/ and don't have a spare hard drive handy), finish the convert, 
and move them back.

Option 2: Since new files should be created using the desired target mode 
(raid1 IIRC), you may actually be able to move them off and immediately 
back on, so they appear as new files and thus get created in the desired 
mode.  Of course the success here depends on how many you have to move 
vs. the amount of free space available that will be used when you do so, 
but with enough space, it should "just work".

Note that with this method, if the files are small enough to entirely fit 
one-at-a-time or a-few-at-a-time in memory (I have 16 gig RAM, for 
instance, and don't tend to use more than a gig or two for apps, so could 
in theory do 12-14 gig at a time for this), you can even use a tmpfs as 
the temporary storage before moving them back to the target filesystem.  
That should be pretty fast since the one side is all memory.
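
Mechanically, option 2 is just a round trip through staging storage; a
sketch (the tmpfs mount point and size are illustrative):

```shell
# Move a file off the filesystem and straight back, so it is rewritten
# as brand-new extents (and, mid-convert, in the new target profile).
roundtrip() {  # roundtrip <file> <staging-dir>
  local base dir
  base=$(basename "$1")
  dir=$(dirname "$1")
  mv "$1" "$2/$base" && sync && mv "$2/$base" "$dir/"
}

# With a tmpfs staging area big enough for the file(s):
#   mount -t tmpfs -o size=14g tmpfs /mnt/stage
#   roundtrip /data/some/bigfile /mnt/stage
#   umount /mnt/stage
```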

This is actually the solution recommended when a btrfs from ext* 
conversion didn't have the recommended defrag and balance done afterward, 
or when even after a defrag, the balance fails due to overly large
ext* extents, compared to the 1-gig extents that btrfs normally works
with.  Move all the gig-plus files off the filesystem, thus eliminating
the old overly large extents, and back, so the files get created with 
native btrfs sized extents.  The solution is known to work for that, so 
assuming a similar issue here, it should work here as well.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-23  4:34                   ` Duncan
@ 2015-01-23  7:54                     ` Marc Joliet
  2015-01-23  8:46                       ` Duncan
  0 siblings, 1 reply; 21+ messages in thread
From: Marc Joliet @ 2015-01-23  7:54 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 1540 bytes --]

Am Fri, 23 Jan 2015 04:34:19 +0000 (UTC)
schrieb Duncan <1i5t5.duncan@cox.net>:

> Gareth Pye posted on Fri, 23 Jan 2015 08:58:08 +1100 as excerpted:
> 
> > What are the chances that splitting all the large files up into sub gig
> > pieces, finish convert, then recombine them all will work?
> 
[...]
> Option 2: Since new files should be created using the desired target mode 
> (raid1 IIRC), you may actually be able to move them off and immediately 
> back on, so they appear as new files and thus get created in the desired 
> mode.  Of course the success here depends on how many you have to move 
> vs. the amount of free space available that will be used when you do so, 
> but with enough space, it should "just work".
>
> Note that with this method, if the files are small enough to entirely fit 
> one-at-a-time or a-few-at-a-time in memory (I have 16 gig RAM, for 
> instance, and don't tend to use more than a gig or two for apps, so could 
> in theory do 12-14 gig at a time for this), you can even use a tmpfs as 
> the temporary storage before moving them back to the target filesystem.  
> That should be pretty fast since the one side is all memory.

With current coreutils, wouldn't that also work if he moves the files to
another (temporary) subvolume? (And with future coreutils, by copying the files
without using reflinks and then removing the originals.)

[...]
-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-23  7:54                     ` Marc Joliet
@ 2015-01-23  8:46                       ` Duncan
  2015-01-25 15:23                         ` Marc Joliet
  0 siblings, 1 reply; 21+ messages in thread
From: Duncan @ 2015-01-23  8:46 UTC (permalink / raw)
  To: linux-btrfs

Marc Joliet posted on Fri, 23 Jan 2015 08:54:41 +0100 as excerpted:

> Am Fri, 23 Jan 2015 04:34:19 +0000 (UTC)
> schrieb Duncan <1i5t5.duncan@cox.net>:
> 
>> Gareth Pye posted on Fri, 23 Jan 2015 08:58:08 +1100 as excerpted:
>> 
>> > What are the chances that splitting all the large files up into sub
>> > gig pieces, finish convert, then recombine them all will work?
>> 
> [...]
>> Option 2: Since new files should be created using the desired target
>> mode (raid1 IIRC), you may actually be able to move them off and
>> immediately back on, so they appear as new files and thus get created
>> in the desired mode.
> 
> With current coreutils, wouldn't that also work if he moves the files to
> another (temporary) subvolume? (And with future coreutils, by copying
> the files without using reflinks and then removing the originals.)

If done correctly, yes.

However, "off the filesystem" is far simpler to explain over email or the 
like, and is much less ambiguous in terms of "OK, but did you do it 
'correctly'" if it doesn't end up helping.  If it doesn't work, it 
doesn't work.  If "move to a different subvolume under specific 
conditions in terms of reflinking and the like" doesn't work, there's 
always the question of whether it /really/ didn't work, or if somehow the 
instructions weren't clear enough and thus failure was simply the result 
of a failure to fully meet the technical requirements.

Of course if I was doing it myself, and if I was absolutely sure of the 
technical details in terms of what command I had to use to be /sure/ it 
didn't simply reflink and thus defeat the whole exercise, I'd likely use 
the shortcut.  But in reality, if it didn't work I'd be second-guessing 
myself and would probably move everything entirely off and back on to be 
sure, and knowing that, I'd probably do it the /sure/ way in the first 
place, avoiding the chance of having to redo it to prove to myself that 
I'd done it correctly.

Of course, having demonstrated to myself that it worked, if I ever had 
the problem again, I might try the shortcut, just to demonstrate to my 
own satisfaction the full theory that the effect of the shortcut was the 
same as the effect of doing it the longer and more fool-proof way.  But 
of course I'd rather not have the opportunity to try that second-half 
proof. =:^)

Make sense? =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-23  8:46                       ` Duncan
@ 2015-01-25 15:23                         ` Marc Joliet
  2015-01-27  3:24                           ` Gareth Pye
  0 siblings, 1 reply; 21+ messages in thread
From: Marc Joliet @ 2015-01-25 15:23 UTC (permalink / raw)
  To: linux-btrfs

[-- Attachment #1: Type: text/plain, Size: 3158 bytes --]

Am Fri, 23 Jan 2015 08:46:23 +0000 (UTC)
schrieb Duncan <1i5t5.duncan@cox.net>:

> Marc Joliet posted on Fri, 23 Jan 2015 08:54:41 +0100 as excerpted:
> 
> > Am Fri, 23 Jan 2015 04:34:19 +0000 (UTC)
> > schrieb Duncan <1i5t5.duncan@cox.net>:
> > 
> >> Gareth Pye posted on Fri, 23 Jan 2015 08:58:08 +1100 as excerpted:
> >> 
> >> > What are the chances that splitting all the large files up into sub
> >> > gig pieces, finish convert, then recombine them all will work?
> >> 
> > [...]
> >> Option 2: Since new files should be created using the desired target
> >> mode (raid1 IIRC), you may actually be able to move them off and
> >> immediately back on, so they appear as new files and thus get created
> >> in the desired mode.
> > 
> > With current coreutils, wouldn't that also work if he moves the files to
> > another (temporary) subvolume? (And with future coreutils, by copying
> > the files without using reflinks and then removing the originals.)
> 
> If done correctly, yes.
> 
> However, "off the filesystem" is far simpler to explain over email or the 
> like, and is much less ambiguous in terms of "OK, but did you do it 
> 'correctly'" if it doesn't end up helping.  If it doesn't work, it 
> doesn't work.  If "move to a different subvolume under specific 
> conditions in terms of reflinking and the like" doesn't work, there's 
> always the question of whether it /really/ didn't work, or if somehow the 
> instructions weren't clear enough and thus failure was simply the result 
> of a failure to fully meet the technical requirements.
> 
> Of course if I was doing it myself, and if I was absolutely sure of the 
> technical details in terms of what command I had to use to be /sure/ it 
> didn't simply reflink and thus defeat the whole exercise, I'd likely use 
> the shortcut.  But in reality, if it didn't work I'd be second-guessing 
> myself and would probably move everything entirely off and back on to be 
> sure, and knowing that, I'd probably do it the /sure/ way in the first 
> place, avoiding the chance of having to redo it to prove to myself that 
> I'd done it correctly.
> 
> Of course, having demonstrated to myself that it worked, if I ever had 
> the problem again, I might try the shortcut, just to demonstrate to my 
> own satisfaction the full theory that the effect of the shortcut was the 
> same as the effect of doing it the longer and more fool-proof way.  But 
> of course I'd rather not have the opportunity to try that second-half 
> proof. =:^)
> 
> Make sense? =:^)

I was going to argue that my suggestion was hardly difficult to get right, but
then I read that cp defaults to --reflink=always and that it is not possible to
turn off reflinks (i.e., there is no --reflink=never).

So I would then have to consider alternatives like dd, and, well, you
are right, I suppose :).

(Of course, with the *current* version of coreutils, the simple "mv somefile
tmp_subvol/; mv tmp_subvol/somefile ." will still work.)

-- 
Marc Joliet
--
"People who think they know everything really annoy those of us who know we
don't" - Bjarne Stroustrup

[-- Attachment #2: Digitale Signatur von OpenPGP --]
[-- Type: application/pgp-signature, Size: 819 bytes --]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-25 15:23                         ` Marc Joliet
@ 2015-01-27  3:24                           ` Gareth Pye
  2015-01-27  6:20                             ` Duncan
  0 siblings, 1 reply; 21+ messages in thread
From: Gareth Pye @ 2015-01-27  3:24 UTC (permalink / raw)
  To: linux-btrfs

Have gone with the move-stuff-off-then-finish-convert plan. The
convert has now finished and I'm 60% of the way through moving all the
big files back on.

Thanks for the help guys.

On Mon, Jan 26, 2015 at 2:23 AM, Marc Joliet <marcec@gmx.de> wrote:
> Am Fri, 23 Jan 2015 08:46:23 +0000 (UTC)
> schrieb Duncan <1i5t5.duncan@cox.net>:
>
>> Marc Joliet posted on Fri, 23 Jan 2015 08:54:41 +0100 as excerpted:
>>
>> > Am Fri, 23 Jan 2015 04:34:19 +0000 (UTC)
>> > schrieb Duncan <1i5t5.duncan@cox.net>:
>> >
>> >> Gareth Pye posted on Fri, 23 Jan 2015 08:58:08 +1100 as excerpted:
>> >>
>> >> > What are the chances that splitting all the large files up into sub
>> >> > gig pieces, finish convert, then recombine them all will work?
>> >>
>> > [...]
>> >> Option 2: Since new files should be created using the desired target
>> >> mode (raid1 IIRC), you may actually be able to move them off and
>> >> immediately back on, so they appear as new files and thus get created
>> >> in the desired mode.
>> >
>> > With current coreutils, wouldn't that also work if he moves the files to
>> > another (temporary) subvolume? (And with future coreutils, by copying
>> > the files without using reflinks and then removing the originals.)
>>
>> If done correctly, yes.
>>
>> However, "off the filesystem" is far simpler to explain over email or the
>> like, and is much less ambiguous in terms of "OK, but did you do it
>> 'correctly'" if it doesn't end up helping.  If it doesn't work, it
>> doesn't work.  If "move to a different subvolume under specific
>> conditions in terms of reflinking and the like" doesn't work, there's
>> always the question of whether it /really/ didn't work, or if somehow the
>> instructions weren't clear enough and thus failure was simply the result
>> of a failure to fully meet the technical requirements.
>>
>> Of course if I was doing it myself, and if I was absolutely sure of the
>> technical details in terms of what command I had to use to be /sure/ it
>> didn't simply reflink and thus defeat the whole exercise, I'd likely use
>> the shortcut.  But in reality, if it didn't work I'd be second-guessing
>> myself and would probably move everything entirely off and back on to be
>> sure, and knowing that, I'd probably do it the /sure/ way in the first
>> place, avoiding the chance of having to redo it to prove to myself that
>> I'd done it correctly.
>>
>> Of course, having demonstrated to myself that it worked, if I ever had
>> the problem again, I might try the shortcut, just to demonstrate to my
>> own satisfaction the full theory that the effect of the shortcut was the
>> same as the effect of doing it the longer and more fool-proof way.  But
>> of course I'd rather not have the opportunity to try that second-half
>> proof. =:^)
>>
>> Make sense? =:^)
>
> I was going to argue that my suggestion was hardly difficult to get right, but
> then I read that cp defaults to --reflink=always and that it is not possible to
> turn off reflinks (i.e., there is no --reflink=never).
>
> So then would have to consider alternatives like dd, and, well, you are right,
> I suppose :) .
>
> (Of course, with the *current* version of coreutils, the simple "mv somefile
> tmp_subvol/; mv tmp_subvol/somefile ." will still work.)
>
> --
> Marc Joliet
> --
> "People who think they know everything really annoy those of us who know we
> don't" - Bjarne Stroustrup



-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: btrfs convert running out of space
  2015-01-27  3:24                           ` Gareth Pye
@ 2015-01-27  6:20                             ` Duncan
  2015-01-27 21:53                               ` Gareth Pye
  0 siblings, 1 reply; 21+ messages in thread
From: Duncan @ 2015-01-27  6:20 UTC (permalink / raw)
  To: linux-btrfs

Gareth Pye posted on Tue, 27 Jan 2015 14:24:03 +1100 as excerpted:

> Have gone with the move stuff off then finish convert plan. Convert has
> now finished and I'm 60% of the way through moving all the big files
> back on.
> 
> Thanks for the help guys.

Glad the big-file-move-off seems to have worked for you, and thanks for 
confirming that moving them off did indeed solve your conversion 
blockers.  Evidently btrfs still has a few rough spots to iron out when 
it comes to those big files. =:^(

Please confirm when all big files are moved back on, too, just to be sure 
there's nothing unexpected on that side, but based on the conversion-from-
ext* reports, I expect it to be smooth sailing.  Btrfs really does have a 
problem with large files existing in large enough extents that it can't 
really handle them properly, and once they are off the filesystem 
temporarily, generally even moved off and back on, it does seem to break 
up the log jam (well, in this case huge-extent-jam =:^) and you're back 
in business! =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* Re: btrfs convert running out of space
  2015-01-27  6:20                             ` Duncan
@ 2015-01-27 21:53                               ` Gareth Pye
  2015-01-28  0:18                                 ` Duncan
  0 siblings, 1 reply; 21+ messages in thread
From: Gareth Pye @ 2015-01-27 21:53 UTC (permalink / raw)
  To: Duncan; +Cc: linux-btrfs

Yeah, copying them all back on has gone event-free; now running some
balance passes to clear up the 310G of slack allocation.
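For anyone following along, those filtered balance passes look roughly like
this (a sketch only: /data is the mount point from earlier in the thread, and
the usage thresholds here are illustrative, not necessarily the exact ones
used):

```shell
# Rewrite only data block groups that are at most 25% full; their
# contents get packed into fewer chunks and the freed chunks are
# returned to the unallocated pool.
btrfs balance start -dusage=25 /data

# If "btrfs fi df" still shows a large gap between total and used,
# raise the threshold and run another pass.
btrfs balance start -dusage=50 /data

# Check progress (or cancel) from another terminal.
btrfs balance status /data
```

Low thresholds first keeps each pass cheap, since nearly-empty chunks are
quick to relocate.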

I did realize that something I'd claimed earlier wasn't true: there
were 3 files larger than a gig in the apt mirror snapshots, so large
files in snapshots could have been contributing to the issue. (If I'd
realized that before moving all the large files off, I'd have tried
moving just those few off.)
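Spotting those few over-a-gig candidates up front is a one-liner with find.
A sketch, run against a throwaway temp dir rather than the real /data so it
is safe to try anywhere (file names and sizes are purely illustrative):

```shell
# Demo in a temp dir: create two sparse files, one over 1 GiB and one
# well under, then list only the files larger than 1 GiB -- the same
# predicate you would point at /data.
dir=$(mktemp -d)
truncate -s 2G "$dir/big.img"      # sparse, so no real disk space used
truncate -s 200M "$dir/small.img"
large=$(find "$dir" -type f -size +1G)
echo "$large"
rm -rf "$dir"
```

Only big.img is printed; on a real filesystem you would add -xdev to keep
find from crossing into other mounts.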

Thank you all for your help. This filesystem's possibilities do excite
me; the future is bright.

On Tue, Jan 27, 2015 at 5:20 PM, Duncan <1i5t5.duncan@cox.net> wrote:
> Gareth Pye posted on Tue, 27 Jan 2015 14:24:03 +1100 as excerpted:
>
>> Have gone with the move stuff off then finish convert plan. Convert has
>> now finished and I'm 60% of the way through moving all the big files
>> back on.
>>
>> Thanks for the help guys.
>
> Glad the big-file-move-off seems to have worked for you, and thanks for
> confirming that moving them off did indeed solve your conversion
> blockers.  Evidently btrfs still has a few rough spots to iron out when
> it comes to those big files. =:^(
>
> Please confirm when all big files are moved back on, too, just to be sure
> there's nothing unexpected on that side, but based on the conversion-from-
> ext* reports, I expect it to be smooth sailing.  Btrfs really does seem
> to have a problem with large files stored in extents so large that it
> can't handle them properly, but once they are off the filesystem
> temporarily (generally even just moved off and back on), that does seem
> to break up the log jam (well, in this case the huge-extent jam =:^) and
> you're back in business! =:^)
>
> --
> Duncan - List replies preferred.   No HTML msgs.
> "Every nonfree program has a lord, a master --
> and if you use the program, he is your master."  Richard Stallman
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



-- 
Gareth Pye
Level 2 MTG Judge, Melbourne, Australia
"Dear God, I would like to file a bug report"


* Re: btrfs convert running out of space
  2015-01-27 21:53                               ` Gareth Pye
@ 2015-01-28  0:18                                 ` Duncan
  0 siblings, 0 replies; 21+ messages in thread
From: Duncan @ 2015-01-28  0:18 UTC (permalink / raw)
  To: linux-btrfs

Gareth Pye posted on Wed, 28 Jan 2015 08:53:01 +1100 as excerpted:

> Thank you all for your help. This filesystem's possibilities do excite
> me; the future is bright.

And a final thank you.  Glad we were able to help you do what you
wanted/needed to do. Sometimes we never know if it worked, and getting 
that final confirmation is a reward of its own, plus the solution is 
clearer the next time someone has a problem.  =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



end of thread, other threads:[~2015-01-28  0:18 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-01-19 23:45 btrfs convert running out of space Gareth Pye
2015-01-20  0:13 ` Gareth Pye
2015-01-20  5:39   ` Lakshmi_Narayanan_Du
2015-01-20  7:38   ` Chris Murphy
2015-01-20 21:25     ` Gareth Pye
2015-01-20 21:41       ` Chris Murphy
2015-01-20 21:49         ` Gareth Pye
2015-01-20 22:53           ` Chris Murphy
2015-01-20 23:04             ` Gareth Pye
2015-01-21  4:03               ` Chris Murphy
2015-01-22 21:58                 ` Gareth Pye
2015-01-22 21:58                   ` Gareth Pye
2015-01-23  4:34                   ` Duncan
2015-01-23  7:54                     ` Marc Joliet
2015-01-23  8:46                       ` Duncan
2015-01-25 15:23                         ` Marc Joliet
2015-01-27  3:24                           ` Gareth Pye
2015-01-27  6:20                             ` Duncan
2015-01-27 21:53                               ` Gareth Pye
2015-01-28  0:18                                 ` Duncan
2015-01-20 23:33         ` Hugo Mills
