All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc MERLIN <marc@merlins.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Su Yue <suy.fnst@cn.fujitsu.com>, linux-btrfs@vger.kernel.org
Subject: Re: So, does btrfs check lowmem take days? weeks?
Date: Mon, 2 Jul 2018 08:19:03 -0700	[thread overview]
Message-ID: <20180702151903.j3j522urrtyznylf@merlins.org> (raw)
In-Reply-To: <3728d88c-29c1-332b-b698-31a0b3d36e2b@gmx.com>

Hi Qu,

thanks for the detailled and honest answer.
A few comments inline.

On Mon, Jul 02, 2018 at 10:42:40PM +0800, Qu Wenruo wrote:
> For full, it depends. (but for most real world case, it's still flawed)
> We have small and crafted images as test cases, which btrfs check can
> repair without problem at all.
> But such images are *SMALL*, and only have *ONE* type of corruption,
> which can represent real world case at all.
 
right, they're just unittest images, I understand.

> 1) Too large fs (especially too many snapshots)
>    The use case (too many snapshots and shared extents, a lot of extents
>    get shared over 1000 times) is in fact a super large challenge for
>    lowmem mode check/repair.
>    It needs O(n^2) or even O(n^3) to check each backref, which hugely
>    slow the progress and make us hard to locate the real bug.
 
So, the non lowmem version would work better, but it's a problem if it
doesn't fit in RAM.
I've always considered it a grave bug that btrfs check repair can use so
much kernel memory that it will crash the entire system. This should not
be possible.
While it won't help me here, can btrfs check be improved not to suck all
the kernel memory, and ideally even allow using swap space if the RAM is
not enough?

Is btrfs check regular mode still being maintained? I think it's still
better than lowmem, correct?

> 2) Corruption in extent tree and our objective is to mount RW
>    Extent tree is almost useless if we just want to read data.
>    But when we do any write, we needs it and if it goes wrong even a
>    tiny bit, your fs could be damaged really badly.
> 
>    For other corruption, like some fs tree corruption, we could do
>    something to discard some corrupted files, but if it's extent tree,
>    we either mount RO and grab anything we have, or hopes the
>    almost-never-working --init-extent-tree can work (that's mostly
>    miracle).
 
I understand that it's the weak point of btrfs, thanks for explaining.

> 1) Don't keep too many snapshots.
>    Really, this is the core.
>    For send/receive backup, IIRC it only needs the parent subvolume
>    exists, there is no need to keep the whole history of all those
>    snapshots.

You are correct on history. The reason I keep history is because I may
want to recover a file from last week or 2 weeks ago after I finally
notice that it's gone. 
I have terabytes of space on the backup server, so it's easier to keep
history there than on the client which may not have enough space to keep
a month's worth of history.
As you know, back when we did tape backups, we also kept history of at
least several weeks (usually several months, but that's too much for
btrfs snapshots).

>    Keep the number of snapshots to minimal does greatly improve the
>    possibility (both manual patch or check repair) of a successful
>    repair.
>    Normally I would suggest 4 hourly snapshots, 7 daily snapshots, 12
>    monthly snapshots.

I actually have fewer snapshots than this per filesystem, but I backup
more than 10 filesystems.
If I used as many snapshots as you recommend, that would already be 230
snapshots for 10 filesystems :)

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

  parent reply	other threads:[~2018-07-02 15:19 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29  4:27 So, does btrfs check lowmem take days? weeks? Marc MERLIN
2018-06-29  5:07 ` Qu Wenruo
2018-06-29  5:28   ` Marc MERLIN
2018-06-29  5:48     ` Qu Wenruo
2018-06-29  6:06       ` Marc MERLIN
2018-06-29  6:29         ` Qu Wenruo
2018-06-29  6:59           ` Marc MERLIN
2018-06-29  7:09             ` Roman Mamedov
2018-06-29  7:22               ` Marc MERLIN
2018-06-29  7:34                 ` Roman Mamedov
2018-06-29  8:04                 ` Lionel Bouton
2018-06-29 16:24                   ` btrfs send/receive vs rsync Marc MERLIN
2018-06-30  8:18                     ` Duncan
2018-06-29  7:20             ` So, does btrfs check lowmem take days? weeks? Qu Wenruo
2018-06-29  7:28               ` Marc MERLIN
2018-06-29 17:10                 ` Marc MERLIN
2018-06-30  0:04                   ` Chris Murphy
2018-06-30  2:44                   ` Marc MERLIN
2018-06-30 14:49                     ` Qu Wenruo
2018-06-30 21:06                       ` Marc MERLIN
2018-06-29  6:02     ` Su Yue
2018-06-29  6:10       ` Marc MERLIN
2018-06-29  6:32         ` Su Yue
2018-06-29  6:43           ` Marc MERLIN
2018-07-01 23:22             ` Marc MERLIN
2018-07-02  2:02               ` Su Yue
2018-07-02  3:22                 ` Marc MERLIN
2018-07-02  6:22                   ` Su Yue
2018-07-02 14:05                     ` Marc MERLIN
2018-07-02 14:42                       ` Qu Wenruo
2018-07-02 15:18                         ` how to best segment a big block device in resizeable btrfs filesystems? Marc MERLIN
2018-07-02 16:59                           ` Austin S. Hemmelgarn
2018-07-02 17:34                             ` Marc MERLIN
2018-07-02 18:35                               ` Austin S. Hemmelgarn
2018-07-02 19:40                                 ` Marc MERLIN
2018-07-03  4:25                                 ` Andrei Borzenkov
2018-07-03  7:15                                   ` Duncan
2018-07-06  4:28                                     ` Andrei Borzenkov
2018-07-08  8:05                                       ` Duncan
2018-07-03  0:51                           ` Paul Jones
2018-07-03  4:06                             ` Marc MERLIN
2018-07-03  4:26                               ` Paul Jones
2018-07-03  5:42                                 ` Marc MERLIN
2018-07-03  1:37                           ` Qu Wenruo
2018-07-03  4:15                             ` Marc MERLIN
2018-07-03  9:55                               ` Paul Jones
2018-07-03 11:29                                 ` Qu Wenruo
2018-07-03  4:23                             ` Andrei Borzenkov
2018-07-02 15:19                         ` Marc MERLIN [this message]
2018-07-02 17:08                           ` So, does btrfs check lowmem take days? weeks? Austin S. Hemmelgarn
2018-07-02 17:33                           ` Roman Mamedov
2018-07-02 17:39                             ` Marc MERLIN
2018-07-03  0:31                         ` Chris Murphy
2018-07-03  4:22                           ` Marc MERLIN
2018-07-03  8:34                             ` Su Yue
2018-07-03 21:34                               ` Chris Murphy
2018-07-03 21:40                                 ` Marc MERLIN
2018-07-04  1:37                                   ` Su Yue
2018-07-03  8:50                             ` Qu Wenruo
2018-07-03 14:38                               ` Marc MERLIN
2018-07-03 21:46                               ` Chris Murphy
2018-07-03 22:00                                 ` Marc MERLIN
2018-07-03 22:52                                   ` Qu Wenruo
2018-06-29  5:35   ` Su Yue
2018-06-29  5:46     ` Marc MERLIN
     [not found] <94caf6c5-77e1-3da0-d026-a29edb08d410@cn.fujitsu.com>
     [not found] ` <CAKhhfD6svMo=28_UX=ZjRRmF6zNadd3H+8vVZKGX4zjqVr-giw@mail.gmail.com>
     [not found]   ` <3a83cb3c-de2b-e803-f07e-31f7de0ee25f@cn.fujitsu.com>
     [not found]     ` <b1b2d361-eb1a-f172-45d3-409abd131d2b@cn.fujitsu.com>
     [not found]       ` <20180705153023.GA30566@merlins.org>
     [not found]         ` <trinity-d028b6bd-31d9-41c0-a091-47bcb810cdc3-1530808069711@msvc-mesg-gmx023>
     [not found]           ` <20180705165049.t56dvqpz7ljjan5c@merlins.org>
     [not found]             ` <trinity-79578bdf-a849-4342-a082-f2b882f2251e-1530810500266@msvc-mesg-gmx024>
     [not found]               ` <20180706160523.kxwxjzwneseaamnt@merlins.org>
     [not found]                 ` <20180706175636.53ebp7drifiqu5b7@merlins.org>
     [not found]                   ` <20180707172114.bfc26eoahullffgg@merlins.org>
2018-07-10  1:37                     ` Su Yue
2018-07-10  1:34                       ` Qu Wenruo
2018-07-10  3:50                         ` Marc MERLIN
2018-07-10  4:55                           ` Qu Wenruo
2018-07-10 10:44                             ` Su Yue
     [not found] <f9bc21d6-fdc3-ca3a-793f-6fe574c7b8c6@cn.fujitsu.com>
     [not found] ` <20180709031054.qfg4x5yzcl4rao2k@merlins.org>
     [not found]   ` <20180709031501.iutlokfvodtkkfhe@merlins.org>
     [not found]     ` <17cc0cc1-b64d-4daa-18b5-bb2da3736ea1@cn.fujitsu.com>
     [not found]       ` <20180709034058.wjavwjdyixx6smbw@merlins.org>
     [not found]         ` <29302c14-e277-2c69-ac08-c4722c2b18aa@cn.fujitsu.com>
     [not found]           ` <20180709155306.zr3p2kolnanvkpny@merlins.org>
     [not found]             ` <trinity-4aae1c42-a85e-4c73-a30e-8b0d0be05e86-1531152875875@msvc-mesg-gmx023>
     [not found]               ` <20180709174818.wq2d4awmgasxgwad@merlins.org>
     [not found]                 ` <faba0923-8d1f-5270-ba03-ce9cc484e08a@gmx.com>
2018-07-10  4:00                   ` Marc MERLIN
     [not found]                 ` <trinity-4546309e-d603-4d29-885a-e76da594f792-1531159860064@msvc-mesg-gmx021>
     [not found]                   ` <20180709222218.GP9859@merlins.org>
     [not found]                     ` <440b7d12-3504-8b4f-5aa4-b1f39f549730@cn.fujitsu.com>
     [not found]                       ` <20180710041037.4ynitx3flubtwtvc@merlins.org>
     [not found]                         ` <58b36f04-3094-7de0-8d5e-e06e280aac00@cn.fujitsu.com>
2018-07-11  1:08                           ` Su Yue

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180702151903.j3j522urrtyznylf@merlins.org \
    --to=marc@merlins.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    --cc=suy.fnst@cn.fujitsu.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.