* Recovering from hard disk failure in a pool
From: Axelle @ 2014-02-14 10:35 UTC
To: linux-btrfs

Hi,

I've just encountered a hard disk crash in one of my btrfs pools.

sudo btrfs filesystem show
failed to open /dev/sr0: No medium found
Label: none  uuid: 545e95c6-d347-4a8c-8a49-38b9f9cb9add
        Total devices 3 FS bytes used 112.70GB
        devid 1 size 100.61GB used 89.26GB path /dev/sdc6
        devid 2 size 93.13GB used 84.00GB path /dev/sdc1
        *** Some devices missing

The device which is missing is /dev/sdb. I have replaced it with a new
hard disk. How do I add it back to the volume and fix the missing
device? The pool is expected to mount at /samples (it is not mounted
yet).

I tried this, which fails:

sudo btrfs device add /dev/sdb /samples
ERROR: error adding the device '/dev/sdb' - Inappropriate ioctl for device

Why isn't this working?

I also tried this:

sudo mount -o recovery /dev/sdc1 /samples
mount: wrong fs type, bad option, bad superblock on /dev/sdc1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so

The same happens with /dev/sdc6.

I ran btrfsck --repair on /dev/sdc1 and /dev/sdc6. Apart from reporting
a missing device (/dev/sdb), both seem okay.

I also tried:

sudo btrfs filesystem df /samples
ERROR: couldn't get space info on '/samples' - Inappropriate ioctl for device

and, as I'm supposed to have a snapshot, this (though I suppose it's
pointless as the volume isn't mounted):

btrfs subvolume snapshot /samples/malwareSnapshot /before
ERROR: error accessing '/samples/malwareSnapshot'

Please help me out, thanks,
Axelle
* Re: Recovering from hard disk failure in a pool
From: Hugo Mills @ 2014-02-14 10:58 UTC
To: Axelle; +Cc: linux-btrfs

On Fri, Feb 14, 2014 at 11:35:56AM +0100, Axelle wrote:
> I tried this - which fails:
> sudo btrfs device add /dev/sdb /samples
> ERROR: error adding the device '/dev/sdb' - Inappropriate ioctl for device
>
> Why isn't this working?

Because it's not mounted. :)

> I also tried this:
> sudo mount -o recovery /dev/sdc1 /samples
> mount: wrong fs type, bad option, bad superblock on /dev/sdc1,
>        missing codepage or helper program, or other error

Close, but what you want here is:

mount -o degraded /dev/sdc1 /samples

not "recovery". That will tell the FS that there's a missing disk, and
it should mount without complaining. If your data is not RAID-1 or
RAID-10, then you will almost certainly have lost some data.

At that point, since you've removed the dead disk, you can do:

btrfs device delete missing /samples

which forcibly removes the record of the missing device.

Then you can add the new device:

btrfs device add /dev/sdb /samples

And finally balance to repair the RAID:

btrfs balance start /samples

It's worth noting that even if you have RAID-1 data and metadata,
losing /dev/sdc in your current configuration is likely to cause
severe data loss -- probably making the whole FS unrecoverable. This
is because the FS sees /dev/sdc1 and /dev/sdc6 as independent devices,
and will happily put both copies of a piece of RAID-1 data (or
metadata) on /dev/sdc -- one on each of sdc1 and sdc6. I therefore
wouldn't recommend running like that for very long.

Hugo.

--
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
    --- All hope abandon, Ye who press Enter here. ---
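[Editor's note: the four-step procedure above can be sketched as one script.
This is a sketch, not part of the original thread; it assumes the names used
in this thread (/dev/sdc1 as a surviving device, /dev/sdb as the replacement,
/samples as the mountpoint), and by default it only prints each command,
since the real ones need root and an actual degraded filesystem.]

```shell
#!/bin/sh
# Dry-run sketch of the recovery sequence described above.
# Device names and the /samples mountpoint are taken from this thread;
# adjust them for your own pool. Set DO_IT=1 to actually execute.

run() {
    echo "would run: $*"
    if [ "${DO_IT:-0}" = "1" ]; then
        "$@"
    fi
}

replace_dead_disk() {
    run mount -o degraded /dev/sdc1 /samples   # mount despite the missing disk
    run btrfs device delete missing /samples   # drop the record of the dead disk
    run btrfs device add /dev/sdb /samples     # add the replacement disk
    run btrfs balance start /samples           # re-replicate across devices
}

replace_dead_disk
```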
* Re: Recovering from hard disk failure in a pool
From: Axelle @ 2014-02-14 11:04 UTC
To: Hugo Mills, linux-btrfs

Hi Hugo,

Thanks for your answer. Unfortunately, I had also tried

sudo mount -o degraded /dev/sdc1 /samples
mount: wrong fs type, bad option, bad superblock on /dev/sdc1,
       missing codepage or helper program, or other error
       In some cases useful info is found in syslog - try
       dmesg | tail or so

and dmesg says:

[ 1177.695773] btrfs: open_ctree failed
[ 1247.448766] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid 2 transid 31105 /dev/sdc1
[ 1247.449700] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid 1 transid 31105 /dev/sdc6
[ 1247.458794] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid 2 transid 31105 /dev/sdc1
[ 1247.459601] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid 1 transid 31105 /dev/sdc6
[ 4013.363254] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid 2 transid 31105 /dev/sdc1
[ 4013.408280] btrfs: allowing degraded mounts
[ 4013.555764] btrfs: bdev (null) errs: wr 0, rd 14, flush 0, corrupt 0, gen 0
[ 4015.600424] Btrfs: too many missing devices, writeable mount is not allowed
[ 4015.630841] btrfs: open_ctree failed

Yes, I know, I'll probably be losing a lot of data, but that's not too
much of a concern because I had a backup (sooo happy about that :D). If
I can manage to recover a little more from the btrfs volume, it's a
bonus; if I can't, I'll use my backup.

So, how do I fix my volume? I guess there should be a solution other
than deleting everything and starting again...

Regards,
Axelle
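[Editor's note: the "too many missing devices, writeable mount is not
allowed" line is the kernel refusing a read-write degraded mount because
more devices are missing than the pool's storage profile can tolerate. The
sketch below is my own simplified illustration of that check, not the
kernel's actual logic or output.]

```shell
# Simplified illustration (not kernel code): how many missing devices
# each btrfs profile can tolerate before a writeable mount is refused.
missing_tolerated() {
    case "$1" in
        RAID1|RAID10|RAID5) echo 1 ;;   # one surviving copy / parity stripe
        RAID6)              echo 2 ;;   # two parity stripes
        *)                  echo 0 ;;   # single, RAID0, DUP: no device redundancy
    esac
}

can_mount_rw() {
    # usage: can_mount_rw PROFILE NUM_MISSING
    if [ "$2" -le "$(missing_tolerated "$1")" ]; then
        echo yes
    else
        echo no
    fi
}
```

With one device missing, `can_mount_rw RAID1 1` prints "yes" while
`can_mount_rw RAID0 1` prints "no", which matches the refusal seen in the
dmesg output above for a pool whose data is not redundant.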
* Re: Recovering from hard disk failure in a pool
From: Axelle @ 2014-02-14 11:16 UTC
To: Hugo Mills, linux-btrfs

Hi,

Some update: since

> sudo mount -o degraded /dev/sdc1 /samples
> mount: wrong fs type, bad option, bad superblock on /dev/sdc1,

still fails, I am mounting the volume read-only and backing up what I
can still access to another drive.

Then, what should I do? Fully erase the volume and create a new one? Or
is there a way I can use the snapshots I had? Or somehow fix the
read-only volume, add the new disk to it, and re-mount it read-write?

Regards,
Axelle.
* Re: Recovering from hard disk failure in a pool
From: Daniel Lee @ 2014-02-14 14:58 UTC
To: Axelle; +Cc: linux-btrfs

On 02/14/2014 03:04 AM, Axelle wrote:
> and dmesg says:
> [...]
> [ 4013.408280] btrfs: allowing degraded mounts
> [ 4015.600424] Btrfs: too many missing devices, writeable mount is not allowed
> [ 4015.630841] btrfs: open_ctree failed

Did the crashed /dev/sdb have more than one partition in your raid1
filesystem?
* Re: Recovering from hard disk failure in a pool
From: Axelle @ 2014-02-14 15:22 UTC
To: linux-btrfs

> Did the crashed /dev/sdb have more than one partition in your raid1
> filesystem?

No, only one - as far as I recall.

-- Axelle.
* Re: Recovering from hard disk failure in a pool
From: Daniel Lee @ 2014-02-14 16:19 UTC
To: Axelle; +Cc: linux-btrfs

On 02/14/2014 07:22 AM, Axelle wrote:
>> Did the crashed /dev/sdb have more than one partition in your raid1
>> filesystem?
> No, only one - as far as I recall.

What does:

btrfs filesystem df /samples

say now that you've mounted the fs read-only?
* Re: Recovering from hard disk failure in a pool 2014-02-14 16:19 ` Daniel Lee @ 2014-02-14 17:53 ` Axelle 2014-02-14 18:27 ` Daniel Lee 0 siblings, 1 reply; 9+ messages in thread From: Axelle @ 2014-02-14 17:53 UTC (permalink / raw) To: Daniel Lee; +Cc: linux-btrfs Hi Daniel, This is what it answers now: sudo btrfs filesystem df /samples [sudo] password for axelle: Data, RAID0: total=252.00GB, used=108.99GB System, RAID1: total=8.00MB, used=28.00KB System: total=4.00MB, used=0.00 Metadata, RAID1: total=5.25GB, used=3.71GB By the way, I was happy to recover most of my data :) Of course, I still can't add my new /dev/sdb to /samples because it's read-only: sudo btrfs device add /dev/sdb /samples ERROR: error adding the device '/dev/sdb' - Read-only file system Regards Axelle On Fri, Feb 14, 2014 at 5:19 PM, Daniel Lee <longinus00@gmail.com> wrote: > On 02/14/2014 07:22 AM, Axelle wrote: >>> Did the crashed /dev/sdb have more than 1 partitions in your raid1 >>> filesystem? >> No, only 1 - as far as I recall. >> >> -- Axelle. > What does: > > btrfs filesystem df /samples > > say now that you've mounted the fs readonly? >> On Fri, Feb 14, 2014 at 3:58 PM, Daniel Lee <longinus00@gmail.com> wrote: >>> On 02/14/2014 03:04 AM, Axelle wrote: >>>> Hi Hugo, >>>> >>>> Thanks for your answer. 
>>>> Unfortunately, I had also tried >>>> >>>> sudo mount -o degraded /dev/sdc1 /samples >>>> mount: wrong fs type, bad option, bad superblock on /dev/sdc1, >>>> missing codepage or helper program, or other error >>>> In some cases useful info is found in syslog - try >>>> dmesg | tail or so >>>> >>>> and dmesg says: >>>> [ 1177.695773] btrfs: open_ctree failed >>>> [ 1247.448766] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>> 2 transid 31105 /dev/sdc1 >>>> [ 1247.449700] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>> 1 transid 31105 /dev/sdc6 >>>> [ 1247.458794] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>> 2 transid 31105 /dev/sdc1 >>>> [ 1247.459601] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>> 1 transid 31105 /dev/sdc6 >>>> [ 4013.363254] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>> 2 transid 31105 /dev/sdc1 >>>> [ 4013.408280] btrfs: allowing degraded mounts >>>> [ 4013.555764] btrfs: bdev (null) errs: wr 0, rd 14, flush 0, corrupt 0, gen 0 >>>> [ 4015.600424] Btrfs: too many missing devices, writeable mount is not allowed >>>> [ 4015.630841] btrfs: open_ctree failed >>> Did the crashed /dev/sdb have more than 1 partitions in your raid1 >>> filesystem? >>>> Yes, I know, I'll probably be losing a lot of data, but it's not "too >>>> much" my concern because I had a backup (sooo happy about that :D). If >>>> I can manage to recover a little more on the btrfs volume it's bonus, >>>> but in the event I do not, I'll be using my backup. >>>> >>>> So, how do I fix my volume? I guess there would be a solution apart >>>> from scratching/deleting everything and starting again... >>>> >>>> >>>> Regards, >>>> Axelle >>>> >>>> >>>> >>>> On Fri, Feb 14, 2014 at 11:58 AM, Hugo Mills <hugo@carfax.org.uk> wrote: >>>>> On Fri, Feb 14, 2014 at 11:35:56AM +0100, Axelle wrote: >>>>>> Hi, >>>>>> I've just encountered a hard disk crash in one of my btrfs pools. 
>>>>>> >>>>>> sudo btrfs filesystem show >>>>>> failed to open /dev/sr0: No medium found >>>>>> Label: none uuid: 545e95c6-d347-4a8c-8a49-38b9f9cb9add >>>>>> Total devices 3 FS bytes used 112.70GB >>>>>> devid 1 size 100.61GB used 89.26GB path /dev/sdc6 >>>>>> devid 2 size 93.13GB used 84.00GB path /dev/sdc1 >>>>>> *** Some devices missing >>>>>> >>>>>> The device which is missing is /dev/sdb. I have replaced it with a new >>>>>> hard disk. How do I add it back to the volume and fix the device >>>>>> missing? >>>>>> The pool is expected to mount to /samples (it is not mounted yet). >>>>>> >>>>>> I tried this - which fails: >>>>>> sudo btrfs device add /dev/sdb /samples >>>>>> ERROR: error adding the device '/dev/sdb' - Inappropriate ioctl for device >>>>>> >>>>>> Why isn't this working? >>>>> Because it's not mounted. :) >>>>> >>>>>> I also tried this: >>>>>> sudo mount -o recovery /dev/sdc1 /samples >>>>>> mount: wrong fs type, bad option, bad superblock on /dev/sdc1, >>>>>> missing codepage or helper program, or other error >>>>>> In some cases useful info is found in syslog - try >>>>>> dmesg | tail or so >>>>>> same with /dev/sdc6 >>>>> Close, but what you want here is: >>>>> >>>>> mount -o degraded /dev/sdc1 /samples >>>>> >>>>> not "recovery". That will tell the FS that there's a missing disk, and >>>>> it should mount without complaining. If your data is not RAID-1 or >>>>> RAID-10, then you will almost certainly have lost some data. >>>>> >>>>> At that point, since you've removed the dead disk, you can do: >>>>> >>>>> btrfs device delete missing /samples >>>>> >>>>> which forcibly removes the record of the missing device. 
>>>>> >>>>> Then you can add the new device: >>>>> >>>>> btrfs device add /dev/sdb /samples >>>>> >>>>> And finally balance to repair the RAID: >>>>> >>>>> btrfs balance start /samples >>>>> >>>>> It's worth noting that even if you have RAID-1 data and metadata, >>>>> losing /dev/sdc in your current configuration is likely to cause >>>>> severe data loss -- probably making the whole FS unrecoverable. This >>>>> is because the FS sees /dev/sdc1 and /dev/sdc6 as independent devices, >>>>> and will happily put both copies of a piece of RAID-1 data (or >>>>> metadata) on /dev/sdc -- one on each of sdc1 and sdc6. I therefore >>>>> wouldn't recommend running like that for very long. >>>>> >>>>> Hugo. >>>>> >>>>> -- >>>>> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk === >>>>> PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk >>>>> --- All hope abandon, Ye who press Enter here. --- >>>> -- >>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>> the body of a message to majordomo@vger.kernel.org >>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >> -- >> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >> the body of a message to majordomo@vger.kernel.org >> More majordomo info at http://vger.kernel.org/majordomo-info.html > ^ permalink raw reply [flat|nested] 9+ messages in thread
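The four steps Hugo describes can be sketched as one sequence. This assumes the replacement disk is `/dev/sdb`, a surviving member is `/dev/sdc1`, and the mount point is `/samples` (the names from this thread); it only works when the kernel permits a writeable degraded mount, which it did not in this RAID0 case.

```shell
# 1. Mount without the dead disk (requires redundant data/metadata profiles).
sudo mount -o degraded /dev/sdc1 /samples

# 2. Forcibly drop the record of the missing device.
sudo btrfs device delete missing /samples

# 3. Add the replacement disk to the filesystem.
sudo btrfs device add /dev/sdb /samples

# 4. Rebalance so chunks are rewritten across the current set of devices.
sudo btrfs balance start /samples
```

Each step must succeed before the next; in particular, step 2 fails unless the degraded mount in step 1 was writeable.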
* Re: Recovering from hard disk failure in a pool 2014-02-14 17:53 ` Axelle @ 2014-02-14 18:27 ` Daniel Lee 0 siblings, 0 replies; 9+ messages in thread From: Daniel Lee @ 2014-02-14 18:27 UTC (permalink / raw) To: Axelle; +Cc: linux-btrfs On 02/14/2014 09:53 AM, Axelle wrote: > Hi Daniel, > > This is what it answers now: > > sudo btrfs filesystem df /samples > [sudo] password for axelle: > Data, RAID0: total=252.00GB, used=108.99GB > System, RAID1: total=8.00MB, used=28.00KB > System: total=4.00MB, used=0.00 > Metadata, RAID1: total=5.25GB, used=3.71GB So the issue here is that your data is raid0 which will not tolerate any loss of a device. I'd recommend trashing the current filesystem and creating a new one with some redundancy (use raid1 not raid0, don't add more than one partition from the same disk to a btrfs filesystem, etc.) so you can recover from this sort of scenario in the future. To do this, use wipefs on the remaining partitions to remove all traces of the current btrfs filesystem. > By the way, I was happy to recover most of my data :) This is the nice thing about the checksumming in btrfs, knowing that what data you did read off is correct. :) > Of course, I still can't add my new /dev/sdb to /samples because it's read-only: > sudo btrfs device add /dev/sdb /samples > ERROR: error adding the device '/dev/sdb' - Read-only file system > > Regards > Axelle > > On Fri, Feb 14, 2014 at 5:19 PM, Daniel Lee <longinus00@gmail.com> wrote: >> On 02/14/2014 07:22 AM, Axelle wrote: >>>> Did the crashed /dev/sdb have more than 1 partitions in your raid1 >>>> filesystem? >>> No, only 1 - as far as I recall. >>> >>> -- Axelle. >> What does: >> >> btrfs filesystem df /samples >> >> say now that you've mounted the fs readonly? >>> On Fri, Feb 14, 2014 at 3:58 PM, Daniel Lee <longinus00@gmail.com> wrote: >>>> On 02/14/2014 03:04 AM, Axelle wrote: >>>>> Hi Hugo, >>>>> >>>>> Thanks for your answer. 
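Daniel's recovery-from-scratch suggestion can be sketched as follows. The device names are the ones from this thread and are illustrative; note that, per his advice, the new filesystem uses one partition per physical disk and RAID1 for both data and metadata, so losing a single disk is survivable.

```shell
# Wipe the btrfs signatures from the surviving partitions so the old
# (unrecoverable) filesystem no longer registers with the kernel.
sudo wipefs -a /dev/sdc1 /dev/sdc6

# Recreate with redundancy: RAID1 data (-d) and metadata (-m), using
# exactly one device per physical disk.
sudo mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc1

# Mount via either member device and restore from backup.
sudo mount /dev/sdb /samples
```

This destroys whatever remains on the old partitions, so it should only be run after everything recoverable has been copied off the read-only degraded mount.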
>>>>> Unfortunately, I had also tried >>>>> >>>>> sudo mount -o degraded /dev/sdc1 /samples >>>>> mount: wrong fs type, bad option, bad superblock on /dev/sdc1, >>>>> missing codepage or helper program, or other error >>>>> In some cases useful info is found in syslog - try >>>>> dmesg | tail or so >>>>> >>>>> and dmesg says: >>>>> [ 1177.695773] btrfs: open_ctree failed >>>>> [ 1247.448766] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>>> 2 transid 31105 /dev/sdc1 >>>>> [ 1247.449700] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>>> 1 transid 31105 /dev/sdc6 >>>>> [ 1247.458794] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>>> 2 transid 31105 /dev/sdc1 >>>>> [ 1247.459601] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>>> 1 transid 31105 /dev/sdc6 >>>>> [ 4013.363254] device fsid 545e95c6-d347-4a8c-8a49-38b9f9cb9add devid >>>>> 2 transid 31105 /dev/sdc1 >>>>> [ 4013.408280] btrfs: allowing degraded mounts >>>>> [ 4013.555764] btrfs: bdev (null) errs: wr 0, rd 14, flush 0, corrupt 0, gen 0 >>>>> [ 4015.600424] Btrfs: too many missing devices, writeable mount is not allowed >>>>> [ 4015.630841] btrfs: open_ctree failed >>>> Did the crashed /dev/sdb have more than 1 partitions in your raid1 >>>> filesystem? >>>>> Yes, I know, I'll probably be losing a lot of data, but it's not "too >>>>> much" my concern because I had a backup (sooo happy about that :D). If >>>>> I can manage to recover a little more on the btrfs volume it's bonus, >>>>> but in the event I do not, I'll be using my backup. >>>>> >>>>> So, how do I fix my volume? I guess there would be a solution apart >>>>> from scratching/deleting everything and starting again... >>>>> >>>>> >>>>> Regards, >>>>> Axelle >>>>> >>>>> >>>>> >>>>> On Fri, Feb 14, 2014 at 11:58 AM, Hugo Mills <hugo@carfax.org.uk> wrote: >>>>>> On Fri, Feb 14, 2014 at 11:35:56AM +0100, Axelle wrote: >>>>>>> Hi, >>>>>>> I've just encountered a hard disk crash in one of my btrfs pools. 
>>>>>>> >>>>>>> sudo btrfs filesystem show >>>>>>> failed to open /dev/sr0: No medium found >>>>>>> Label: none uuid: 545e95c6-d347-4a8c-8a49-38b9f9cb9add >>>>>>> Total devices 3 FS bytes used 112.70GB >>>>>>> devid 1 size 100.61GB used 89.26GB path /dev/sdc6 >>>>>>> devid 2 size 93.13GB used 84.00GB path /dev/sdc1 >>>>>>> *** Some devices missing >>>>>>> >>>>>>> The device which is missing is /dev/sdb. I have replaced it with a new >>>>>>> hard disk. How do I add it back to the volume and fix the device >>>>>>> missing? >>>>>>> The pool is expected to mount to /samples (it is not mounted yet). >>>>>>> >>>>>>> I tried this - which fails: >>>>>>> sudo btrfs device add /dev/sdb /samples >>>>>>> ERROR: error adding the device '/dev/sdb' - Inappropriate ioctl for device >>>>>>> >>>>>>> Why isn't this working? >>>>>> Because it's not mounted. :) >>>>>> >>>>>>> I also tried this: >>>>>>> sudo mount -o recovery /dev/sdc1 /samples >>>>>>> mount: wrong fs type, bad option, bad superblock on /dev/sdc1, >>>>>>> missing codepage or helper program, or other error >>>>>>> In some cases useful info is found in syslog - try >>>>>>> dmesg | tail or so >>>>>>> same with /dev/sdc6 >>>>>> Close, but what you want here is: >>>>>> >>>>>> mount -o degraded /dev/sdc1 /samples >>>>>> >>>>>> not "recovery". That will tell the FS that there's a missing disk, and >>>>>> it should mount without complaining. If your data is not RAID-1 or >>>>>> RAID-10, then you will almost certainly have lost some data. >>>>>> >>>>>> At that point, since you've removed the dead disk, you can do: >>>>>> >>>>>> btrfs device delete missing /samples >>>>>> >>>>>> which forcibly removes the record of the missing device. 
>>>>>> >>>>>> Then you can add the new device: >>>>>> >>>>>> btrfs device add /dev/sdb /samples >>>>>> >>>>>> And finally balance to repair the RAID: >>>>>> >>>>>> btrfs balance start /samples >>>>>> >>>>>> It's worth noting that even if you have RAID-1 data and metadata, >>>>>> losing /dev/sdc in your current configuration is likely to cause >>>>>> severe data loss -- probably making the whole FS unrecoverable. This >>>>>> is because the FS sees /dev/sdc1 and /dev/sdc6 as independent devices, >>>>>> and will happily put both copies of a piece of RAID-1 data (or >>>>>> metadata) on /dev/sdc -- one on each of sdc1 and sdc6. I therefore >>>>>> wouldn't recommend running like that for very long. >>>>>> >>>>>> Hugo. >>>>>> >>>>>> -- >>>>>> === Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk === >>>>>> PGP key: 65E74AC0 from wwwkeys.eu.pgp.net or http://www.carfax.org.uk >>>>>> --- All hope abandon, Ye who press Enter here. --- >>>>> -- >>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>>> the body of a message to majordomo@vger.kernel.org >>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>> -- >>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>> the body of a message to majordomo@vger.kernel.org >>> More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2014-02-14 18:27 UTC | newest] Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2014-02-14 10:35 Recovering from hard disk failure in a pool Axelle 2014-02-14 10:58 ` Hugo Mills 2014-02-14 11:04 ` Axelle 2014-02-14 11:16 ` Axelle 2014-02-14 14:58 ` Daniel Lee 2014-02-14 15:22 ` Axelle 2014-02-14 16:19 ` Daniel Lee 2014-02-14 17:53 ` Axelle 2014-02-14 18:27 ` Daniel Lee