* recovery problem raid5
@ 2016-03-18 17:41 Marcin Solecki
  2016-03-18 18:02 ` Hugo Mills
  0 siblings, 1 reply; 11+ messages in thread
From: Marcin Solecki @ 2016-03-18 17:41 UTC (permalink / raw)
  To: linux-btrfs

Hello all,
I am close to giving up on this problem of restoring my data.


# uname -a
Linux jarvis.home 4.5.0-1.el7.elrepo.x86_64

# btrfs --version
btrfs-progs v3.19.1

# btrfs fi show
warning, device 4 is missing
bytenr mismatch, want=21020672, have=21217280
Couldn't read chunk root
Label: none  uuid: 27ef2638-b50a-4243-80ed-40c3733ec11d
         Total devices 4 FS bytes used 2.50TiB
         devid    1 size 931.51GiB used 899.71GiB path /dev/sdd
         devid    2 size 931.51GiB used 899.69GiB path /dev/sdb
         devid    3 size 931.51GiB used 899.69GiB path /dev/sdc
         *** Some devices missing

# mount  -o recovery /dev/sda /srv/
mount: wrong fs type, bad option, bad superblock on /dev/sda,
        missing codepage or helper program, or other error

        In some cases useful info is found in syslog - try
        dmesg | tail or so.

dmesg afterwards:
[ 4886.521315] BTRFS info (device sdc): enabling auto recovery
[ 4886.521320] BTRFS info (device sdc): disk space caching is enabled
[ 4886.522853] BTRFS: failed to read chunk tree on sdc
[ 4886.528789] BTRFS: open_ctree failed

# btrfs check --repair /dev/sda
enabling repair mode
warning, device 4 is missing
bytenr mismatch, want=21020672, have=21217280
Couldn't read chunk root
Couldn't open file system

# btrfs rescue chunk-recover -v /dev/sda
All Devices:
         Device: id = 3, name = /dev/sdc
         Device: id = 2, name = /dev/sdb
         Device: id = 1, name = /dev/sda

[ 5164.468272] btrfs[3653]: segfault at 7f454014172e ip 0000000000423479 sp 00007f4482cec880 error 4 in btrfs[400000+83000]
[ 5168.928317] btrfs[3657]: segfault at 7fd18c14172e ip 0000000000423479 sp 00007fd0d5858880 error 4 in btrfs[400000+83000]
[ 5173.812457] btrfs[3662]: segfault at 7fd76c14172e ip 0000000000423479 sp 00007fd6b0e59880 error 4 in btrfs[400000+83000]

# btrfs rescue super-recover -v /dev/sda
All Devices:
         Device: id = 3, name = /dev/sdc
         Device: id = 2, name = /dev/sdb
         Device: id = 1, name = /dev/sda

Before Recovering:
         [All good supers]:
                 device name = /dev/sdc
                 superblock bytenr = 65536

                 device name = /dev/sdc
                 superblock bytenr = 67108864

                 device name = /dev/sdc
                 superblock bytenr = 274877906944

                 device name = /dev/sdb
                 superblock bytenr = 65536

                 device name = /dev/sdb
                 superblock bytenr = 67108864

                 device name = /dev/sdb
                 superblock bytenr = 274877906944

                 device name = /dev/sda
                 superblock bytenr = 65536

                 device name = /dev/sda
                 superblock bytenr = 67108864

                 device name = /dev/sda
                 superblock bytenr = 274877906944

         [All bad supers]:

All supers are valid, no need to recover

# btrfs-show-super /dev/sda
superblock: bytenr=65536, device=/dev/sda
---------------------------------------------------------
csum                    0x61b509bb [match]
bytenr                  65536
flags                   0x1
magic                   _BHRfS_M [match]
fsid                    27ef2638-b50a-4243-80ed-40c3733ec11d
label
generation              69462
root                    1648640000
sys_array_size          290
chunk_root_generation   48545
root_level              1
chunk_root              21020672
chunk_root_level        1
log_root                0
log_root_transid        0
log_root_level          0
total_bytes             4000819544064
bytes_used              2743528714240
sectorsize              4096
nodesize                16384
leafsize                16384
stripesize              4096
root_dir                6
num_devices             4
compat_flags            0x0
compat_ro_flags         0x0
incompat_flags          0xe1
                         ( MIXED_BACKREF |
                           BIG_METADATA |
                           EXTENDED_IREF |
                           RAID56 )
csum_type               0
csum_size               4
cache_generation        69462
uuid_tree_generation    69462
dev_item.uuid           70f4650c-e01d-4613-bd7a-a6834c1c44bb
dev_item.fsid           27ef2638-b50a-4243-80ed-40c3733ec11d [match]
dev_item.type           0
dev_item.total_bytes    1000204886016
dev_item.bytes_used     966057263104
dev_item.io_align       4096
dev_item.io_width       4096
dev_item.sector_size    4096
dev_item.devid          1
dev_item.dev_group      0
dev_item.seek_speed     0
dev_item.bandwidth      0
dev_item.generation     0

Thanks for any help.

-- 


[-- Attachment #2: dmesg.log.txt --]
[-- Type: text/plain, Size: 3382 bytes --]

[   88.423604] BTRFS warning (device sdc): devid 4 uuid c24f39f8-c73c-4a17-bbef-cb8988adcbf7 is missing
[   88.455607] BTRFS info (device sdc): bdev (null) errs: wr 921, rd 164889, flush 0, corrupt 0, gen 0
[   88.800609] BTRFS error (device sdc): bad tree block start 0 1619525632
[   88.800648] BTRFS: Failed to read block groups: -5
[   88.809756] BTRFS: open_ctree failed
[  143.792547] BTRFS info (device sdc): allowing degraded mounts
[  143.792552] BTRFS info (device sdc): disk space caching is enabled
[  143.802719] BTRFS info (device sdc): bdev (null) errs: wr 921, rd 164889, flush 0, corrupt 0, gen 0
[  143.874370] BTRFS error (device sdc): bad tree block start 0 1619525632
[  143.874410] BTRFS: Failed to read block groups: -5
[  143.883571] BTRFS: open_ctree failed
[  402.564374] BTRFS info (device sdc): enabling auto recovery
[  402.564379] BTRFS info (device sdc): disk space caching is enabled
[  402.565551] BTRFS: failed to read chunk tree on sdc
[  402.570197] BTRFS: open_ctree failed
[  604.273452] btrfs[2536]: segfault at 7f7fe014172e ip 0000000000423479 sp 00007f7f21358880 error 4 in btrfs[400000+83000]
[ 1101.315762] BTRFS info (device sdc): enabling auto recovery
[ 1101.315766] BTRFS info (device sdc): disk space caching is enabled
[ 1101.316980] BTRFS: failed to read chunk tree on sdc
[ 1101.323006] BTRFS: open_ctree failed
[ 1124.008691] BTRFS info (device sdc): enabling auto recovery
[ 1124.008695] BTRFS info (device sdc): disk space caching is enabled
[ 1124.009859] BTRFS: failed to read chunk tree on sdc
[ 1124.013344] BTRFS: open_ctree failed
[ 1151.424614] btrfs[2666]: segfault at 7f8ff414172e ip 0000000000423479 sp 00007f8f3bcf3880 error 4 in btrfs[400000+83000]
[ 1191.661205] btrfs[2677]: segfault at 7f060014172e ip 0000000000423479 sp 00007f0545789880 error 4 in btrfs[400000+83000]
[ 2368.120700] BTRFS info (device sdc): enabling auto recovery
[ 2368.120704] BTRFS info (device sdc): disk space caching is enabled
[ 2368.121924] BTRFS: failed to read chunk tree on sdc
[ 2368.126591] BTRFS: open_ctree failed
[ 2370.738712] BTRFS info (device sdc): enabling auto recovery
[ 2370.738717] BTRFS info (device sdc): disk space caching is enabled
[ 2370.740003] BTRFS: failed to read chunk tree on sdc
[ 2370.744624] BTRFS: open_ctree failed
[ 2373.002807] BTRFS info (device sdc): enabling auto recovery
[ 2373.002811] BTRFS info (device sdc): disk space caching is enabled
[ 2373.004001] BTRFS: failed to read chunk tree on sdc
[ 2373.007651] BTRFS: open_ctree failed
[ 2513.800271] BTRFS info (device sdc): enabling auto recovery
[ 2513.800276] BTRFS info (device sdc): disk space caching is enabled
[ 2513.802181] BTRFS: failed to read chunk tree on sdc
[ 2513.808614] BTRFS: open_ctree failed
[ 2972.448037] btrfs[3152]: segfault at 7f454c14172e ip 0000000000423479 sp 00007f4494305880 error 4 in btrfs[400000+83000]
[ 3123.647370] BTRFS info (device sdb): enabling auto recovery
[ 3123.647375] BTRFS info (device sdb): disk space caching is enabled
[ 3123.648233] BTRFS: failed to read chunk root on sdb
[ 3123.655235] BTRFS: open_ctree failed
[ 3131.574580] BTRFS info (device sdb): enabling auto recovery
[ 3131.574585] BTRFS info (device sdb): disk space caching is enabled
[ 3131.575360] BTRFS: failed to read chunk root on sdb
[ 3131.582234] BTRFS: open_ctree failed

* Re: recovery problem raid5
  2016-03-18 17:41 recovery problem raid5 Marcin Solecki
@ 2016-03-18 18:02 ` Hugo Mills
  2016-03-18 18:08   ` Marcin Solecki
                     ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Hugo Mills @ 2016-03-18 18:02 UTC (permalink / raw)
  To: Marcin Solecki; +Cc: linux-btrfs

   The main thing you haven't tried here is mount -o degraded, which
is the thing to do if you have a missing device in your array.
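
   As a minimal sketch of that (not something tried above; read-only is
added here just to avoid any further writes while diagnosing):

# mount -o degraded,ro /dev/sda /srv
# dmesg | tail

   Any of the present member devices should do as the mount argument;
the kernel locates the rest of the array itself.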

   Also, that kernel's not really all that good for a parity RAID
array -- it's the very first one that had the scrub and replace
implementation, so it's rather less stable with parity RAID than the
later 4.x kernels. That's probably not the issue here, though.

   Hugo.

-- 
Hugo Mills             | You can't expect a boy to be vicious until he's gone
hugo@... carfax.org.uk | to a good school.
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                                                  Saki

* Re: recovery problem raid5
  2016-03-18 18:02 ` Hugo Mills
@ 2016-03-18 18:08   ` Marcin Solecki
  2016-03-18 23:31   ` Chris Murphy
  2016-03-19  0:39   ` Duncan
  2 siblings, 0 replies; 11+ messages in thread
From: Marcin Solecki @ 2016-03-18 18:08 UTC (permalink / raw)
  To: Hugo Mills, linux-btrfs

I tried mounting with -o degraded, but it has the same effect as recovery:

[ 7133.926778] BTRFS info (device sdc): allowing degraded mounts
[ 7133.926783] BTRFS info (device sdc): disk space caching is enabled
[ 7133.932140] BTRFS info (device sdc): bdev (null) errs: wr 921, rd 164889, flush 0, corrupt 0, gen 0
[ 7133.993146] BTRFS error (device sdc): bad tree block start 0 1619525632
[ 7133.993185] BTRFS: Failed to read block groups: -5
[ 7134.002111] BTRFS: open_ctree failed

Do you suggest going back to the stable CentOS kernel (3.10)?
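
For completeness, one combination not in the attempts above would be a
read-only mount with both the degraded and recovery options at once (a
sketch only; whether it gets any further here is unknown):

# mount -o ro,degraded,recovery /dev/sda /srv
# dmesg | tail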


On 2016-03-18 at 19:02, Hugo Mills wrote:
>     The main thing you haven't tried here is mount -o degraded, which
> is the thing to do if you have a missing device in your array.
>
>     Also, that kernel's not really all that good for a parity RAID
> array -- it's the very first one that had the scrub and replace
> implementation, so it's rather less stable with parity RAID than the
> later 4.x kernels. That's probably not the issue here, though.
>
>     Hugo.
>

-- 
Regards, Marcin Solecki


* Re: recovery problem raid5
  2016-03-18 18:02 ` Hugo Mills
  2016-03-18 18:08   ` Marcin Solecki
@ 2016-03-18 23:31   ` Chris Murphy
  2016-03-18 23:34     ` Hugo Mills
  2016-03-18 23:40     ` Chris Murphy
  2016-03-19  0:39   ` Duncan
  2 siblings, 2 replies; 11+ messages in thread
From: Chris Murphy @ 2016-03-18 23:31 UTC (permalink / raw)
  To: Hugo Mills, Marcin Solecki, Btrfs BTRFS

On Fri, Mar 18, 2016 at 12:02 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
>    The main thing you haven't tried here is mount -o degraded, which
> is the thing to do if you have a missing device in your array.
>
>    Also, that kernel's not really all that good for a parity RAID
> array -- it's the very first one that had the scrub and replace
> implementation, so it's rather less stable with parity RAID than the
> later 4.x kernels. That's probably not the issue here, though.


It's a 4.5.0 kernel with 3.19 progs. I'd update the progs even though
that's unlikely to be the problem.
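
In case it helps, a sketch of building current btrfs-progs rather than
waiting for a distro package (assuming the usual autotools build and
that the build dependencies are already installed):

# git clone git://git.kernel.org/pub/scm/linux/kernel/git/kdave/btrfs-progs.git
# cd btrfs-progs
# ./autogen.sh && ./configure && make
# ./btrfs --version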



-- 
Chris Murphy

* Re: recovery problem raid5
  2016-03-18 23:31   ` Chris Murphy
@ 2016-03-18 23:34     ` Hugo Mills
  2016-03-18 23:40     ` Chris Murphy
  1 sibling, 0 replies; 11+ messages in thread
From: Hugo Mills @ 2016-03-18 23:34 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Marcin Solecki, Btrfs BTRFS

On Fri, Mar 18, 2016 at 05:31:51PM -0600, Chris Murphy wrote:
> On Fri, Mar 18, 2016 at 12:02 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
> >    The main thing you haven't tried here is mount -o degraded, which
> > is the thing to do if you have a missing device in your array.
> >
> >    Also, that kernel's not really all that good for a parity RAID
> > array -- it's the very first one that had the scrub and replace
> > implementation, so it's rather less stable with parity RAID than the
> > later 4.x kernels. That's probably not the issue here, though.
> 
> 
> It's a 4.5.0 kernel with 3.19 progs. I'd update the progs even though
> that's unlikely to be the problem.

   Oh, my mistake. I misread the 3.19 as the kernel version.

   Marcin: Stick with the 4.5 kernel.

   Hugo.

-- 
Hugo Mills             | Yes, this is an example of something that becomes
hugo@... carfax.org.uk | less explosive as a one-to-one cocrystal with TNT.
http://carfax.org.uk/  | (Hexanitrohexaazaisowurtzitane)
PGP: E2AB1DE4          |                                            Derek Lowe

* Re: recovery problem raid5
  2016-03-18 23:31   ` Chris Murphy
  2016-03-18 23:34     ` Hugo Mills
@ 2016-03-18 23:40     ` Chris Murphy
  2016-03-19  8:21       ` Marcin Solecki
  1 sibling, 1 reply; 11+ messages in thread
From: Chris Murphy @ 2016-03-18 23:40 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Marcin Solecki, Btrfs BTRFS

On Fri, Mar 18, 2016 at 5:31 PM, Chris Murphy <lists@colorremedies.com> wrote:
> On Fri, Mar 18, 2016 at 12:02 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
>>    The main thing you haven't tried here is mount -o degraded, which
>> is the thing to do if you have a missing device in your array.
>>
>>    Also, that kernel's not really all that good for a parity RAID
>> array -- it's the very first one that had the scrub and replace
>> implementation, so it's rather less stable with parity RAID than the
>> later 4.x kernels. That's probably not the issue here, though.
>
>
> It's a 4.5.0 kernel with 3.19 progs. I'd update the progs even though

And actually I'm wrong because it's possible progs 4.4.1 might help
fix things. But really the problem is that -o degraded isn't working for
the volume with a single missing device and I can't tell you why. It
might be a bug, but it might be that progs 3.19 --repair wasn't a good
idea to do on a volume with one missing device.

I'm really skeptical of any sort of repair being allowed without a
scary warning and requiring a force flag on volumes that are degraded.
I know this is possible with ext4 and XFS, but that's only because
they have no idea when the underlying raid is degraded.


-- 
Chris Murphy

* Re: recovery problem raid5
  2016-03-18 18:02 ` Hugo Mills
  2016-03-18 18:08   ` Marcin Solecki
  2016-03-18 23:31   ` Chris Murphy
@ 2016-03-19  0:39   ` Duncan
  2 siblings, 0 replies; 11+ messages in thread
From: Duncan @ 2016-03-19  0:39 UTC (permalink / raw)
  To: linux-btrfs

Hugo Mills posted on Fri, 18 Mar 2016 18:02:07 +0000 as excerpted:

> Also, that kernel's not really all that good for a parity RAID
> array -- it's the very first one that had the scrub and replace
> implementation, so it's rather less stable with parity RAID than the
> later 4.x kernels. That's probably not the issue here, though.
> 
> On Fri, Mar 18, 2016 at 06:41:32PM +0100, Marcin Solecki wrote:
>> 
>> # uname -a Linux jarvis.home 4.5.0-1.el7.elrepo.x86_64
>> 
>> # btrfs --version btrfs-progs v3.19.1

Umm... Hugo, look again.  He's running the current 4.5 kernel.  It's the 
btrfs-progs version that's v3.19.1 and thus old.

Marcin: Your kernel is current and should be fine.  It's the older 
kernels that have the most problems, and 3.19 was the first one with 
parity-raid scrub and replace, so Hugo obviously just read your post 
wrong.

Other than that, I'd normally trust Hugo's recommendations over any I 
might offer, and mount -o degraded is indeed precisely what's supposed to 
be used in a missing device situation, so I agree with him there.

You might consider upgrading userspace (btrfs-progs), as Hugo does have a 
point, even if he made it about the wrong thing.  While a 3.19-era 
userspace should work, given that full parity-raid support was very new 
at that point, a current 4.4.1 userspace may well have a few more 
bugfixes, tho I've not tracked the parity-raid support specifically 
enough to know for sure whether it has any that apply here.  But once he 
figures out you were talking about 3.19.1 userspace, not a 3.19.1 kernel, 
Hugo will probably know better which, if any, parity-raid-specific 
changes have been made to btrfs-progs since 3.19, as well.

(Of course, the wiki has a short user-targeted description of what 
changed in each btrfs-progs release as well, and I could look it up there 
if I wanted to, but so can you, and I'm using the older raid1, not parity-
raid, so I don't have the direct personal interest in that info that you 
might, so...)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman


* Re: recovery problem raid5
  2016-03-18 23:40     ` Chris Murphy
@ 2016-03-19  8:21       ` Marcin Solecki
  0 siblings, 0 replies; 11+ messages in thread
From: Marcin Solecki @ 2016-03-19  8:21 UTC (permalink / raw)
  To: Chris Murphy; +Cc: Hugo Mills, Btrfs BTRFS



On 2016-03-19 at 00:40, Chris Murphy wrote:
> On Fri, Mar 18, 2016 at 5:31 PM, Chris Murphy <lists@colorremedies.com> wrote:
>> On Fri, Mar 18, 2016 at 12:02 PM, Hugo Mills <hugo@carfax.org.uk> wrote:
>>>     The main thing you haven't tried here is mount -o degraded, which
>>> is the thing to do if you have a missing device in your array.
>>>
>>>     Also, that kernel's not really all that good for a parity RAID
>>> array -- it's the very first one that had the scrub and replace
>>> implementation, so it's rather less stable with parity RAID than the
>>> later 4.x kernels. That's probably not the issue here, though.
>>
>> It's a 4.5.0 kernel with 3.19 progs. I'd update the progs even though
> And actually I'm wrong because it's possible progs 4.4.1 might help
> fix things. But really the problem is that -o degraded isn't working for
> the volume with a single missing device and I can't tell you why. It
> might be a bug, but it might be that progs 3.19 --repair wasn't a good
> idea to do on a volume with one missing device.
>
> I'm really skeptical of any sorts of repairs being allowed without a
> scary warning and requiring a force flag on volumes that are degraded.
> I know this is possible with ext4 and XFS, but that's only because
> they have no idea when the underlying raid is degraded.
>
>
I tried on Fedora 23, with a 4.x kernel and btrfs-progs 4.x (I don't 
remember the exact versions). I had already resigned myself to losing 
the data, but I want to try anything you suggest.

-- 
Regards, Marcin Solecki


* Re: recovery problem raid5
  2016-04-30  1:25 ` Duncan
@ 2016-05-03  9:48   ` Pierre-Matthieu anglade
  0 siblings, 0 replies; 11+ messages in thread
From: Pierre-Matthieu anglade @ 2016-05-03  9:48 UTC (permalink / raw)
  Cc: linux-btrfs

On Sat, Apr 30, 2016 at 1:25 AM, Duncan <1i5t5.duncan@cox.net> wrote:
> Pierre-Matthieu anglade posted on Fri, 29 Apr 2016 11:24:12 +0000 as
> excerpted:

> So while btrfs in general, being still not yet fully stable, isn't yet
> really recommended unless you're using data you can afford to lose,
> either because it's backed up, or because it really is data you can
> afford to lose, for raid56 that's *DEFINITELY* the case, because (as
> you've nicely demonstrated) there are known bugs that can affect raid56
> recovery from degraded, to the point it's known that btrfs raid56 can't
> always be relied upon, so you *better* either have backups and be
> prepared to use them, or simply not put anything on the btrfs raid56 that
> you're not willing to lose in the first place.
>
> That's the general picture.  Btrfs raid56 is strongly negatively-
> recommended for anything but testing usage, at this point, as there are
> still known bugs that can affect degraded recovery.

Thank you for having made the picture clearer. Fortunately my case was
just a test one. From the information I've been able to gather on the
web, I was wondering 1) whether such a bug (or a misstep by the system
administrator that goes undetected by the software) would be of
interest to people developing/using btrfs, and 2) whether there is some
way to dig a little deeper into my problem, since my goal in testing
btrfs is also to gain some knowledge about it.

>> # btrfs fi show
>> warning, device 1 is missing
>> warning, device 1 is missing
>> warning devid 1 not found already
>> bytenr mismatch, want=125903568896, have=125903437824
>> Couldn't read tree root Label: none
>>  uuid: 26220e12-d6bd-48b2-89bc-e5df29062484
>>     Total devices 4 FS bytes used 162.48GiB
>>     devid    2 size 2.71TiB used 64.38GiB path /dev/sdb2
>>     devid    3 size 2.71TiB used 64.91GiB path /dev/sdc2
>>     devid    4 size 2.71TiB used 64.91GiB path /dev/sdd2
>>     *** Some devices missing
>
> Unfortunately you can't get it if the filesystem won't mount, but a btrfs
> fi usage (newer, should work with 4.4) or btrfs fi df (should work with
> pretty much any btrfs-tools, going back a very long way, but needs to be
> combined with btrfs fi show output as well to interpret) would have been
> very helpful, here.  Nothing you can do about it when you can't mount,
> but if you had saved the output before the first device removal/replace
> and again before the second, that would have been useful information to
> have.

Here they are:

# btrfs fi usage /mnt
WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
Overall:
    Device size:          10.86TiB
    Device allocated:             0.00B
    Device unallocated:          10.86TiB
    Device missing:             0.00B
    Used:                 0.00B
    Free (estimated):             0.00B    (min: 8.00EiB)
    Data ratio:                  0.00
    Metadata ratio:              0.00
    Global reserve:          80.00MiB    (used: 0.00B)

Data,RAID5: Size:183.00GiB, Used:162.26GiB
   /dev/sda2      61.00GiB
   /dev/sdb2      61.00GiB
   /dev/sdc2      61.00GiB
   /dev/sdd2      61.00GiB

Metadata,RAID5: Size:2.03GiB, Used:228.56MiB
   /dev/sda2     864.00MiB
   /dev/sdb2     352.00MiB
   /dev/sdc2     864.00MiB
   /dev/sdd2     864.00MiB

System,RAID5: Size:64.00MiB, Used:16.00KiB
   /dev/sda2      32.00MiB
   /dev/sdc2      32.00MiB
   /dev/sdd2      32.00MiB

Unallocated:
   /dev/sda2       2.65TiB
   /dev/sdb2       2.65TiB
   /dev/sdc2       2.65TiB
   /dev/sdd2       2.65TiB

# btrfs fi df /mnt
Data, RAID5: total=183.00GiB, used=162.26GiB
System, RAID5: total=64.00MiB, used=16.00KiB
Metadata, RAID5: total=2.03GiB, used=228.56MiB
GlobalReserve, single: total=80.00MiB, used=0.00B


#  btrfs fi show
Label: none  uuid: 26220e12-d6bd-48b2-89bc-e5df29062484
    Total devices 4 FS bytes used 162.48GiB
    devid    1 size 2.71TiB used 61.88GiB path /dev/sda2
    devid    2 size 2.71TiB used 61.34GiB path /dev/sdb2
    devid    3 size 2.71TiB used 61.88GiB path /dev/sdc2
    devid    4 size 2.71TiB used 61.88GiB path /dev/sdd2



>
> Presumably you used btrfs device add and then btrfs balance to do the
> convert.  Do you perhaps remember the balance command you used?

Fortunately the full log (except for the parts done with a live CD) is
still there. I've tried to filter out irrelevant btrfs commands, while
still keeping track of my jagged setup trajectory:

   17  mkfs.btrfs /dev/sdb2
   18  mkfs.btrfs /dev/sdc2
   19  mkfs.btrfs /dev/sdd2
   29  btrfs device add /dev/sdb2 /dev/sc2 /dev/sdd2 /dev/sda2
   30  btrfs device add /dev/sdb2 /dev/sc2 /dev/sdd2 /
   31  btrfs device add /dev/sdb2 /dev/sdc2 /dev/sdd2 /
   32  btrfs device add /dev/sdb2 /dev/sdc2 /dev/sdd2 / -f
   34  btrfs balance start -dconvert=raid5 -mconvert=raid5 /
   41  btrfs fi balance start -dconvert=raid5 -mconvert=raid5  /
   43  btrfs fi balance start -dconvert=single -mconvert=single  /
   44  btrfs fi balance start -dconvert=single -mconvert=single
   45  btrfs fi balance start -dconvert=single -mconvert=single  /dev/sda2
   46  btrfs fi balance start -dconvert=single -mconvert=single  /dev/sdb2
   47  btrfs fi balance start -f  -dconvert=single -mconvert=single  /
   50  btrfs fi balance start -f  -dconvert=single -mconvert=single  /
   57  btrfs fi balance start -f  -dconvert=raid5 -mconvert=raid5  /
  217  btrfs fi balance /
  240  btrfs-find-root
  241  btrfs-find-root  -a
  242  btrfs-find-root  -a /
  243  btrfs-find-root  /dev/sda2
  312  btrfs check /
  313  btrfs check
  321  btrfs-find-root /dev/sda2
  322  btrfs-find-root /dev/sdb2
  323  btrfs scrub status /
  325  btrfs check  /

>
> Or more precisely, were you sure to balance-convert both data AND
> metadata to raid5?

I am. But given the previous output, the question may be: are you? I
wonder. Among the naughty things I did, maybe some were really harmful?

>
> Summary to ensure I'm getting it right:
>
> a) You had a working btrfs raid5
> b) You replaced one drive, which _appeared_ to work fine.

smartctl was OK, and so was every btrfs checking tool I had at hand.

> c) Reboot. (So it can't be a simple problem of btrfs getting confused
> with the device changes in memory)

Definitely yes.

> d) You tried to replace a second and things fell apart.

Before trying a second replacement, I did the complete rebuild,
reboot, and checked the btrfs file system.

>
> Unfortunately, an as yet not fully traced bug with exactly this sort of
> serial replace is actually one of the known bugs they're still
> investigating.  It's one of at least two known bugs that are severe
> enough to keep raid56 mode from stabilizing to the general level of the
> rest of btrfs and to continue to force that strongly negative-
> recommendation on anything but testing usage with data that can be safely
> lost, either because it's fully backed up or because it really is trivial
> testing data the loss of which is no big deal.

So if this is an already-known bug, one of my motivations for posting
here vanishes: I guess the information about this buggy setup is of no
interest to the developers. Am I right?

>
> Btrfs fi usage after the first replace may or may not have displayed a
> problem.  Similarly, btrfs scrub may or may not have detected and/or
> fixed a problem.  And again with btrfs check.  The problem right now is
> that while we have lots of reports of the serial replace bug, we don't
> have enough people confirmably doing these things after the first replace
> and reporting the results to know if they detect and possibly fix the
> issue, allowing the second replace to work fine if fixed, or not.

If you think I can help in any way, please tell me.


In the meantime, I think I'll keep btrfs, but until Linux 5.x reaches
the shelves I may be more conservative and use the raid1 or raid10
modes, following your advice. The very nice point with btrfs is its
flexibility: I'll be able to switch to another mode later on, likely
without any new installation. Nice.

Again, thank you very much for your reply and for the welcome here.

-- 
Pierre-Matthieu Anglade

* Re: recovery problem raid5
  2016-04-29 11:24 Pierre-Matthieu anglade
@ 2016-04-30  1:25 ` Duncan
  2016-05-03  9:48   ` Pierre-Matthieu anglade
  0 siblings, 1 reply; 11+ messages in thread
From: Duncan @ 2016-04-30  1:25 UTC (permalink / raw)
  To: linux-btrfs

Pierre-Matthieu anglade posted on Fri, 29 Apr 2016 11:24:12 +0000 as
excerpted:

> Setting up and then testing a system I've stumbled upon something that
> looks exactly similar to the behaviour depicted by Marcin Solecki here
> https://www.spinics.net/lists/linux-btrfs/msg53119.html.
> 
> Maybe unlike Marcin I still have all my disks working nicely. So the RAID
> array is OK, and the system running on it is OK. But if I remove one of
> the drives and try to mount in degraded mode, mounting the filesystem,
> and then recovering fails.
> 
> More precisely, the situation is the following :
> # uname -a
> Linux ubuntu 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18
> 18:33:37 UTC 2016 x86_64 x86_64 x86_64 GNU/Linu
> 
> # btrfs --version btrfs-progs v4.4

4.4 kernel and progs.  You are to be commended. =:^)

Unfortunately too many people report way old versions here, apparently 
not taking into account that btrfs in general is still stabilizing, not 
fully stable and mature, and that as a result what they're running is 
many kernels and fixed bugs ago.

And FWIW, btrfs parity-raid, aka raid56 mode, is newer still, and while 
nominally complete for a year with the release of 4.4 (original nominal 
completion in 3.19), still remains less stable than redundancy-raid, aka 
raid1 or raid10 modes.  In fact, there's still known bugs in raid56 mode 
in the current 4.5, and presumably in the upcoming 4.6 as well, as I've 
not seen discussion indicating they've actually fully traced the bugs and 
been able to fix them just yet.

So while btrfs in general, being still not yet fully stable, isn't yet 
really recommended unless you're using data you can afford to lose, 
either because it's backed up, or because it really is data you can 
afford to lose, for raid56 that's *DEFINITELY* the case, because (as 
you've nicely demonstrated) there are known bugs that can affect raid56 
recovery from degraded, to the point it's known that btrfs raid56 can't 
always be relied upon, so you *better* either have backups and be 
prepared to use them, or simply not put anything on the btrfs raid56 that 
you're not willing to lose in the first place.

That's the general picture.  Btrfs raid56 is strongly negatively-
recommended for anything but testing usage, at this point, as there are 
still known bugs that can affect degraded recovery.

There's a bit more specific suggestions and detail below.

> # btrfs fi show
> warning, device 1 is missing
> warning, device 1 is missing
> warning devid 1 not found already
> bytenr mismatch, want=125903568896, have=125903437824
> Couldn't read tree root Label: none
>  uuid: 26220e12-d6bd-48b2-89bc-e5df29062484
>     Total devices 4 FS bytes used 162.48GiB
>     devid    2 size 2.71TiB used 64.38GiB path /dev/sdb2
>     devid    3 size 2.71TiB used 64.91GiB path /dev/sdc2
>     devid    4 size 2.71TiB used 64.91GiB path /dev/sdd2
>     *** Some devices missing

Unfortunately you can't get it if the filesystem won't mount, but a btrfs 
fi usage (newer, should work with 4.4) or btrfs fi df (should work with 
pretty much any btrfs-tools, going back a very long way, but needs to be 
combined with btrfs fi show output as well to interpret) would have been 
very helpful, here.  Nothing you can do about it when you can't mount, 
but if you had saved the output before the first device removal/replace 
and again before the second, that would have been useful information to 
have.
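
For anyone following the same path, the trio worth capturing before and 
after every device change is just the following (a sketch; /mnt stands 
in for wherever the filesystem is mounted):

# btrfs fi show /mnt
# btrfs fi df /mnt
# btrfs fi usage /mnt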

> # mount -o degraded /dev/sdb2 /mnt
> mount: /dev/sdb2: can't read superblock
> 
> # dmesg |tail
> [12852.044823] BTRFS info (device sdd2): allowing degraded mounts
> [12852.044829] BTRFS info (device sdd2): disk space caching is enabled
> [12852.044831] BTRFS: has skinny extents
> [12852.073746] BTRFS error (device sdd2): bad tree block
> start 196608 125257826304
> [12852.121589] BTRFS: open_ctree failed

FWIW, tho you may already have known/gathered this, open ctree failed is 
the generic btrfs mount failure message.  The bad tree block error does 
tell you what block failed to read, but that's more an aid to developer 
debugging than help at the machine admin level.
 
> ----------------
> In case it may help I came there the following way :
> 1) *I've installed ubuntu on a single btrfs partition.
> * Then I have added 3 other partitions
> * convert the whole thing to a raid5 array
> * play with the system and shut-down

Presumably you used btrfs device add and then btrfs balance to do the 
convert.  Do you perhaps remember the balance command you used?

Or more precisely, were you sure to balance-convert both data AND 
metadata to raid5?

Here's where the output of btrfs fi df and/or btrfs fi usage would have 
helped, since that would have displayed exactly what chunk formats were 
actually being used.
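
For reference, the add-then-convert sequence being asked about normally 
amounts to the two commands below; the point of the question is that 
both -dconvert and -mconvert have to be given, or the metadata stays in 
its old profile (a sketch; device names are illustrative, and -f on the 
add is only needed if the added devices carry an old filesystem 
signature):

# btrfs device add /dev/sdb2 /dev/sdc2 /dev/sdd2 /
# btrfs balance start -dconvert=raid5 -mconvert=raid5 /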

> 2) * Removed drive sdb and replaced it with a new drive
> * restored the whole thing (using a livecd, and btrfs replace)
> * reboot
> * checked that the system is still working
> * shut-down

> 3) *removed drive sda and replaced it with a new one
> * tried to perform the exact same operations I did when replacing sdb.
> * It fails with some messages (not quite sure they were the same as
> above).
> * shutdown

> 4) * put back sda
> * check that I don't get any error message with my btrfs raid

> 5. So I'm sure nothings looks like being corrupted
> * shut-down

> 5) * tried again step 3.
> * get the messages shown above.
> 
> I guess I can still put back my drive sda and get my btrfs working.
> I'd be quite grateful for any comment or help.
> I'm wondering if in my case the problem is not comming from the fact the
> tree root (or something of that kind living only on sda) has not been
> replicated when setting up the raid array ?

Summary to ensure I'm getting it right:

a) You had a working btrfs raid5
b) You replaced one drive, which _appeared_ to work fine.
c) Reboot. (So it can't be a simple problem of btrfs getting confused 
with the device changes in memory)
d) You tried to replace a second drive the same way (commands sketched 
just below) and things fell apart.
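
Where by "replace" I mean the usual btrfs replace procedure, i.e. 
roughly the following, with the devid, device names and mountpoint all 
purely illustrative (the degraded mount is only needed if the old 
device is already gone):

# mount -o degraded /dev/sdc2 /mnt
# btrfs replace start 1 /dev/sdb2 /mnt
# btrfs replace status /mnt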

Unfortunately, an as-yet not fully traced bug with exactly this sort of 
serial replace is one of the known bugs the devs are still 
investigating.  It's one of at least two known bugs severe enough to 
keep raid56 mode from stabilizing to the general level of the rest of 
btrfs, and to continue to force that strongly negative recommendation 
against anything but testing usage with data that can safely be lost, 
either because it's fully backed up or because it really is trivial 
testing data whose loss is no big deal.

Btrfs fi usage after the first replace may or may not have shown a 
problem.  Similarly, btrfs scrub may or may not have detected and/or 
fixed a problem, and the same goes for btrfs check.  The problem right 
now is that while we have lots of reports of the serial-replace bug, we 
don't have enough people verifiably running those checks after the 
first replace and reporting the results, so we don't know whether they 
detect (and possibly fix) the issue and thus let the second replace 
work.
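
For anyone wanting to provide that data point, the checks in question 
after the first replace would be roughly the following (device name 
and mountpoint illustrative; btrfs check without --repair is read-only 
and wants the filesystem unmounted):

# btrfs scrub start -Bd /mnt
# btrfs filesystem usage /mnt
# umount /mnt
# btrfs check /dev/sdb2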

In terms of a fix, I'm not a dev, just a btrfs user (raid1 and dup 
modes), and I'm not sure of the current status based on list 
discussion.  But I do know it has been reported by enough independent 
sources to be considered a known bug, so the devs are looking into it, 
and that it's considered bad enough to keep btrfs parity-raid from 
being regarded as anywhere close to the stability of btrfs in general 
until a fix is merged.

I'd suggest waiting until at least 4.8 (better, 4.9) before 
reconsidering it for your own use, however, as it doesn't look like the 
fixes will make 4.6, and even if they hit 4.7, a couple of releases 
without any critical bugs before considering it usable won't hurt.

Recommended alternatives?  Btrfs raid1 and raid10 modes are considered 
to be at the same stability level as btrfs in general, and I use btrfs 
raid1 myself.  Because btrfs redundant-raid modes are all exactly 
two-copy, four devices (assuming same size) will give you two devices' 
worth of usable space in either raid1 or raid10 mode.  That's down from 
the three devices' worth you'd get with raid5, but unlike btrfs raid5, 
btrfs raid1 and raid10 are actually usably stable and generally 
recoverable from single-device loss, tho with btrfs itself still 
considered stabilizing rather than fully stable and mature, backups are 
still strongly recommended.
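
Creating such a filesystem from scratch would look something like the 
following (device names purely illustrative); an existing filesystem 
can instead be balance-converted as above, just with raid1 or raid10 
as the target profile:

# mkfs.btrfs -d raid10 -m raid10 /dev/sda2 /dev/sdb2 /dev/sdc2 /dev/sdd2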

Of course there's also mdraid and dmraid, on top of which you can run 
btrfs as well as other filesystems, but neither of those raid 
alternatives does the routine data-integrity checking that btrfs does 
(when it's working correctly, of course).  Btrfs, seeing only a single 
device, will still do those checks and detect damage, but it won't be 
able to actually fix it the way it can in btrfs raid1/10 (and in raid56 
mode, when that's working).  Unless you use btrfs dup mode on the 
single-device upper layer, of course, but in that case it would be more 
efficient to use btrfs raid1 on the lower-layer devices directly.

Another possible alternative is btrfs raid1 on a pair of mdraid0s (or 
dmraid, if you prefer).  This still gets you the data integrity and 
repair at the btrfs raid1 level, while the underlying md/dm raid0s 
speed things up a bit compared to the not-yet-optimized btrfs raid10.
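
A minimal sketch of that layout, again with hypothetical device names:

# mdadm --create /dev/md0 --level=0 --raid-devices=2 /dev/sda2 /dev/sdb2
# mdadm --create /dev/md1 --level=0 --raid-devices=2 /dev/sdc2 /dev/sdd2
# mkfs.btrfs -d raid1 -m raid1 /dev/md0 /dev/md1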

Of course you can, and may in fact wish to, return to older and more 
mature filesystems like ext4, or the reiserfs I use here, possibly on 
top of md/dmraid, but neither of them does the normal-mode checksumming 
and verification that btrfs does, and the raid layer only uses its 
redundancy or parity in recovery situations.

And of course there's zfs, most directly comparable to btrfs' feature 
set and much more mature, but with hardware and licensing issues.  
Hardware-wise, on Linux it wants relatively large amounts of RAM, and 
ECC RAM at that, compared to btrfs.  (Its data-integrity verification 
depends far more on error-free memory than btrfs' does; without ECC 
RAM, a memory error can corrupt zfs, where btrfs would simply report 
an error.  So ECC RAM is very strongly recommended, and AFAIK no 
guarantees are made about running it without ECC RAM.)  But for zfs on 
Linux, if your existing hardware lacks ECC-memory capability, it's 
almost certainly cheaper to simply get another couple of drives, if 
you really need that third drive's worth of space, and do btrfs raid1 
or raid10, than to switch to ECC-capable hardware.

As for the zfs licensing issues, you may or may not care; apparently 
Ubuntu considers them minor enough to ship zfs now, but I'll just say 
they make zfs a non-option for me.

Of course you can always switch to one of the BSDs with zfs support if 
you're more comfortable with that than with running zfs on Linux.

But regardless of all the above, zfs remains the most directly 
btrfs-comparable filesystem solution out there that is actually stable 
and mature, so if that's your priority above everything else, you'll 
probably find a way to run it.

(FWIW, the other severe known raid56 bug involves extremely slow 
balances (sometimes, not always, which complicates tracing it) when 
restriping to more or fewer devices, as one might do instead of 
replacing a failed device or simply to change the number of devices in 
the array.  Completion can take weeks, long enough that the chance of 
a device dying during the balance is non-trivial, so while the process 
technically works, in practice it's not actually usable.  Given that, 
much like the bug you came across, this sort of operation is one of 
the traditional uses of parity raid, being so slow as to be 
practically unusable makes this bug a blocker for btrfs raid56 
stability and for the ability to recommend it for use.  Both these 
bugs will need to be fixed, with no others at the same level showing 
up, before btrfs raid56 mode can be properly recommended for anything 
but testing use.)
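
For concreteness, the restripe in question is just a device add (or 
delete) followed by a rebalance, roughly as below with hypothetical 
names; it's the balance step that can run for weeks in the affected 
cases:

# btrfs device add /dev/sde2 /mnt
# btrfs balance start /mnt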

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman



* recovery problem raid5
@ 2016-04-29 11:24 Pierre-Matthieu anglade
  2016-04-30  1:25 ` Duncan
  0 siblings, 1 reply; 11+ messages in thread
From: Pierre-Matthieu anglade @ 2016-04-29 11:24 UTC (permalink / raw)
  To: linux-btrfs

Hello all,

Setting up and then testing a system, I've stumbled upon something
that looks exactly like the behaviour described by Marcin Solecki here:
https://www.spinics.net/lists/linux-btrfs/msg53119.html

Perhaps unlike Marcin, I still have all my disks working nicely, so
the raid array is OK and the system running on it is OK.  But if I
remove one of the drives and try to mount in degraded mode, mounting
the filesystem (and therefore recovering) fails.

More precisely, the situation is the following:
# uname -a
Linux ubuntu 4.4.0-21-generic #37-Ubuntu SMP Mon Apr 18 18:33:37 UTC
2016 x86_64 x86_64 x86_64 GNU/Linux

# btrfs --version
btrfs-progs v4.4

# btrfs fi show
warning, device 1 is missing
warning, device 1 is missing
warning devid 1 not found already
bytenr mismatch, want=125903568896, have=125903437824
Couldn't read tree root
Label: none  uuid: 26220e12-d6bd-48b2-89bc-e5df29062484
    Total devices 4 FS bytes used 162.48GiB
    devid    2 size 2.71TiB used 64.38GiB path /dev/sdb2
    devid    3 size 2.71TiB used 64.91GiB path /dev/sdc2
    devid    4 size 2.71TiB used 64.91GiB path /dev/sdd2
    *** Some devices missing

# mount -o degraded /dev/sdb2 /mnt
mount: /dev/sdb2: can't read superblock

# dmesg |tail
[12852.044823] BTRFS info (device sdd2): allowing degraded mounts
[12852.044829] BTRFS info (device sdd2): disk space caching is enabled
[12852.044831] BTRFS: has skinny extents
[12852.073746] BTRFS error (device sdd2): bad tree block start 196608
125257826304
[12852.121589] BTRFS: open_ctree failed

----------------
In case it may help I came there the following way :
1) *I've installed ubuntu on a single btrfs partition.
* Then I have added 3 other partitions
* convert the whole thing to a raid5 array
* play with the system and shut-down
2) * Removed drive sdb and replaced it with a new drive
* restored the whole thing (using a livecd, and btrfs replace)
* reboot
* checked that the system is still working
* shut-down
3) *removed drive sda and replaced it with a new one
* tried to perform the exact same operations I did when replacing sdb.
* It fails with some messages (not quite sure they were the same as above).
* shutdown
4) * put back sda
* check that I don't get any error message with my btrfs raid 5. So
I'm sure nothing looks like being corrupted
* shut-down
5) * tried again step 3.
* get the messages shown above.

I guess I can still put back my drive sda and get my btrfs working.
I'd be quite grateful for any comment or help.
I'm wondering if in my case the problem is not coming from the fact
that the tree root (or something of that kind living only on sda) has
not been replicated when setting up the raid array?

Best regards,


-- 
Pierre-Matthieu Anglade


Thread overview: 11+ messages
2016-03-18 17:41 recovery problem raid5 Marcin Solecki
2016-03-18 18:02 ` Hugo Mills
2016-03-18 18:08   ` Marcin Solecki
2016-03-18 23:31   ` Chris Murphy
2016-03-18 23:34     ` Hugo Mills
2016-03-18 23:40     ` Chris Murphy
2016-03-19  8:21       ` Marcin Solecki
2016-03-19  0:39   ` Duncan
2016-04-29 11:24 Pierre-Matthieu anglade
2016-04-30  1:25 ` Duncan
2016-05-03  9:48   ` Pierre-Matthieu anglade
