From: John Petrini <john.d.petrini@gmail.com>
To: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>,
	linux-btrfs@vger.kernel.org
Subject: Re: Filesystem Went Read Only During Raid-10 to Raid-6 Data Conversion
Date: Thu, 16 Jul 2020 09:37:29 -0400	[thread overview]
Message-ID: <CADvYWxdvy5n3Tsa+MG9sSB2iAu-eA+W33ApzQ3q9D6sdGR9UYA@mail.gmail.com> (raw)
In-Reply-To: <20200716042739.GB8346@hungrycats.org>

On Thu, Jul 16, 2020 at 12:27 AM Zygo Blaxell
<ce3g8jdj@umail.furryterror.org> wrote:
>
> On Tue, Jul 14, 2020 at 10:49:08PM -0400, John Petrini wrote:
> > I've done this and the filesystem mounted successfully though when
> > attempting to cancel the balance it just tells me it's not running.
>
> That's fine, as long as it stops one way or another.
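Good to know. For completeness, this is roughly how I've been checking
on it and trying to cancel it (just the standard commands against my
mount point, as far as I understand them):

  sudo btrfs balance status /mnt/storage-array/
  sudo btrfs balance cancel /mnt/storage-array/
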
>
> > > Aside:  data-raid6 metadata-raid10 isn't a sane configuration.  It
> > > has 2 redundant disks for data and 1 redundant disk for metadata, so
> > > the second parity disk in raid6 is wasted space.
> > >
> > > The sane configurations for parity raid are:
> > >
> > >         data-raid6 metadata-raid1c3 (2 parity stripes for data, 3 copies
> > >         for metadata, 2 disks can fail, requires 3 or more disks)
> > >
> > >         data-raid5 metadata-raid10 (1 parity stripe for data, 2 copies
> > >         for metadata, 1 disk can fail, requires 4 or more disks)
> > >
> > >         data-raid5 metadata-raid1 (1 parity stripe for data, 2 copies
> > >         for metadata, 1 disk can fail, requires 2 or more disks)
> > >
> >
> > This is very interesting. I had no idea that raid1c3 was an option
> > though it sounds like I may need a really recent kernel version?
>
> 5.5 or later.

Okay, I'll look into getting onto that version since that's a killer feature.
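Once I'm on 5.5 or later, I assume the metadata conversion would look
something like this (just a sketch with my mount point; please correct
me if the flags or profile name are off):

  sudo btrfs balance start -mconvert=raid1c3 /mnt/storage-array/

with the data side handled separately via -dconvert=raid6 once the
metadata is sorted out.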

>
> > btrfs fi usage /mnt/storage-array/
> > WARNING: RAID56 detected, not implemented
> > Overall:
> >     Device size:          67.31TiB
> >     Device allocated:          65.45TiB
> >     Device unallocated:           1.86TiB
> >     Device missing:             0.00B
> >     Used:              65.14TiB
> >     Free (estimated):           1.12TiB    (min: 1.09TiB)
> >     Data ratio:                  1.94
> >     Metadata ratio:              2.00
> >     Global reserve:         512.00MiB    (used: 0.00B)
> >
> > Data,RAID10: Size:32.68TiB, Used:32.53TiB
> >    /dev/sda       4.34TiB
> >    /dev/sdb       4.34TiB
> >    /dev/sdc       4.34TiB
> >    /dev/sdd       2.21TiB
> >    /dev/sde       2.21TiB
> >    /dev/sdf       4.34TiB
> >    /dev/sdi       1.82TiB
> >    /dev/sdj       1.82TiB
> >    /dev/sdk       1.82TiB
> >    /dev/sdl       1.82TiB
> >    /dev/sdm       1.82TiB
> >    /dev/sdn       1.82TiB
> >
> > Data,RAID6: Size:1.04TiB, Used:1.04TiB
> >    /dev/sda     413.92GiB
> >    /dev/sdb     413.92GiB
> >    /dev/sdc     413.92GiB
> >    /dev/sdd     119.07GiB
> >    /dev/sde     119.07GiB
> >    /dev/sdf     413.92GiB
> >
> > Metadata,RAID10: Size:40.84GiB, Used:39.80GiB
> >    /dev/sda       5.66GiB
> >    /dev/sdb       5.66GiB
> >    /dev/sdc       5.66GiB
> >    /dev/sdd       2.41GiB
> >    /dev/sde       2.41GiB
> >    /dev/sdf       5.66GiB
> >    /dev/sdi       2.23GiB
> >    /dev/sdj       2.23GiB
> >    /dev/sdk       2.23GiB
> >    /dev/sdl       2.23GiB
> >    /dev/sdm       2.23GiB
> >    /dev/sdn       2.23GiB
> >
> > System,RAID10: Size:96.00MiB, Used:3.06MiB
> >    /dev/sda       8.00MiB
> >    /dev/sdb       8.00MiB
> >    /dev/sdc       8.00MiB
> >    /dev/sdd       8.00MiB
> >    /dev/sde       8.00MiB
> >    /dev/sdf       8.00MiB
> >    /dev/sdi       8.00MiB
> >    /dev/sdj       8.00MiB
> >    /dev/sdk       8.00MiB
> >    /dev/sdl       8.00MiB
> >    /dev/sdm       8.00MiB
> >    /dev/sdn       8.00MiB
> >
> > Unallocated:
> >    /dev/sda       4.35TiB
> >    /dev/sdb       4.35TiB
> >    /dev/sdc       4.35TiB
> >    /dev/sdd       2.22TiB
> >    /dev/sde       2.22TiB
> >    /dev/sdf       4.35TiB
> >    /dev/sdi       1.82TiB
> >    /dev/sdj       1.82TiB
> >    /dev/sdk       1.82TiB
> >    /dev/sdl       1.82TiB
> >    /dev/sdm       1.82TiB
> >    /dev/sdn       1.82TiB
>
> Plenty of unallocated space.  It should be able to do the conversion.

After upgrading, the unallocated space tells a different story. Maybe
the newer kernel or btrfs-progs just reports it differently?

Unallocated:
   /dev/sdd        1.02MiB
   /dev/sde        1.02MiB
   /dev/sdl        1.02MiB
   /dev/sdn        1.02MiB
   /dev/sdm        1.02MiB
   /dev/sdk        1.02MiB
   /dev/sdj        1.02MiB
   /dev/sdi        1.02MiB
   /dev/sdb        1.00MiB
   /dev/sdc        1.00MiB
   /dev/sda        5.90GiB
   /dev/sdg        5.90GiB

This is after clearing up additional space on the filesystem. When I
started the conversion there was only ~300G available; there's now
close to 1TB according to df.

/dev/sdd                      68T   66T  932G  99% /mnt/storage-array

So I'm not sure what to make of this and whether it's safe to start
the conversion again. I don't feel like I can trust the unallocated
space before or after the upgrade.
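
In case it helps, I've been cross-checking the per-device numbers with
(assuming I'm reading its output correctly):

  sudo btrfs device usage /mnt/storage-array/

My understanding is that a filtered balance along these lines can
compact mostly-empty chunks and return them to unallocated space,
though I haven't tried it yet:

  sudo btrfs balance start -dusage=10 /mnt/storage-array/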


Here are the versions I'm on now:
sudo dpkg -l | grep btrfs-progs
ii  btrfs-progs                            5.4.1-2         amd64        Checksumming Copy on Write Filesystem utilities

uname -r
5.4.0-40-generic

>
> > > You didn't post the dmesg messages from when the filesystem went
> > > read-only, but metadata 'total' is very close to 'used', you were doing
> > > a balance, and the filesystem went read-only, so I'm guessing you hit
> > > ENOSPC for metadata due to lack of unallocated space on at least 4 drives
> > > (minimum for raid10).
> > >
> >
> > Here's a paste of everything in dmesg: http://paste.openstack.org/show/795929/
>
> Unfortunately the original errors are no longer in the buffer.  Maybe
> try /var/log/kern.log?
>

Found it. So this was a space issue. I knew the filesystem was very
full but figured ~300G would be enough.

kernel: [3755232.352221] BTRFS: error (device sdd) in __btrfs_free_extent:4860: errno=-28 No space left
kernel: [3755232.352227] BTRFS: Transaction aborted (error -28)
kernel: [3755232.354693] BTRFS info (device sdd): forced readonly
kernel: [3755232.354700] BTRFS: error (device sdd) in btrfs_run_delayed_refs:2795: errno=-28 No space left
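
(For anyone following along, I pulled those lines out with a plain
grep, something along the lines of:

  grep -i btrfs /var/log/kern.log | grep -iE 'error|readonly'
)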


> > > > uname -r
> > > > 5.3.0-40-generic
> > >
> > > Please upgrade to 5.4.13 or later.  Kernels 5.1 through 5.4.12 have a
> > > rare but nasty bug that is triggered by writing at exactly the wrong
> > > moment during balance.  5.3 has some internal defenses against that bug
> > > (the "write time tree checker"), but if they fail, the result is metadata
> > > corruption that requires btrfs check to repair.
> > >
> >
> > Thanks for the heads up. I'm getting it updated now and will attempt
> > to remount once I do. Once it's remounted how should I proceed? Can I
> > just assume the filesystem is healthy at that point? Should I perform
> > a scrub?
>
> If scrub reports no errors it's probably OK.

I did run a scrub and it came back clean.
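
For reference, this is roughly what I ran (start the scrub, then poll
its status until it finishes):

  sudo btrfs scrub start /mnt/storage-array/
  sudo btrfs scrub status /mnt/storage-array/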

>
> A scrub will tell you if any data or metadata is corrupted or any
> parent-child pointers are broken.  That will cover most of the common
> problems.  If the original issue was a spurious ENOSPC then everything
> should be OK.  If the original issue was a write time tree corruption
> then it should be OK.  If the original issue was something else, it
> will present itself again during the scrub or balance.
>
> If there are errors, scrub won't attribute them to the right disks for
> raid6.  It might be worth reading
>
>         https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@hungrycats.org/
>
> for a list of current raid5/6 issues to be aware of.

Thanks. This is good info.
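
Since scrub apparently can't attribute raid6 errors to the right disk,
I'll also keep an eye on the per-device error counters; as I understand
it, those are tracked separately:

  sudo btrfs device stats /mnt/storage-array/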
