All of lore.kernel.org
 help / color / mirror / Atom feed
From: Marc MERLIN <marc@merlins.org>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: linux-btrfs@vger.kernel.org
Subject: Re: So, does btrfs check lowmem take days? weeks?
Date: Thu, 28 Jun 2018 23:06:57 -0700	[thread overview]
Message-ID: <20180629060657.qrtcxfcy22zkstfw@merlins.org> (raw)
In-Reply-To: <3b240898-a96d-77f6-efb9-f0af81ee0cd1@gmx.com>

On Fri, Jun 29, 2018 at 01:48:17PM +0800, Qu Wenruo wrote:
> Just normal btrfs check, and post the output.
> If normal check eats up all your memory, btrfs check --mode=lowmem.
 
Does check without --repair eat less RAM?

> --repair should be considered as the last method.

If --repair doesn't work, check is useless to me sadly. I know that for
FS analysis and bug reporting, you want to have the FS without changing
it to something maybe worse, but for my use, if it can't be mounted and
can't be fixed, then it gets deleted which is even worse than check
doing the wrong thing.

> > The last two ERROR lines took over a day to get generated, so I'm not sure if it's still working, but just slowly.
> 
> OK, that explains something.
> 
> One extent is referred hundreds times, no wonder it will take a long time.
> 
> Just one tip here, there are really too many snapshots/reflinked files.
> It's highly recommended to keep the number of snapshots to a reasonable
> number (lower two digits).
> Although btrfs snapshot is super fast, it puts a lot of pressure on its
> extent tree, so there is no free lunch here.
 
Agreed, I doubt I have over or much over 100 snapshots though (but I
can't check right now).
Sadly I'm not allowed to mount even read only while check is running:
gargamel:~# mount -o ro /dev/mapper/dshelf2 /mnt/mnt2
mount: /dev/mapper/dshelf2 already mounted or /mnt/mnt2 busy

> > I see. Is there any reasonably easy way to check on this running process?
> 
> GDB attach would be good.
> Interrupt and check the inode number if it's checking fs tree.
> Check the extent bytenr number if it's checking extent tree.
> 
> But considering how many snapshots there are, it's really hard to determine.
> 
> In this case, the super large extent tree is causing a lot of problem,
> maybe it's a good idea to allow btrfs check to skip extent tree check?

I only see --init-extent-tree in the man page, which option did you have
in mind?

> > Then again, maybe it already fixed enough that I can mount my filesystem again.
> 
> This needs the initial btrfs check report and the kernel messages how it
> fails to mount.

mount command hangs, kernel does not show anything special outside of disk access hanging.

Jun 23 17:23:26 gargamel kernel: [  341.802696] BTRFS warning (device dm-2): 'recovery' is deprecated, use 'useback
uproot' instead
Jun 23 17:23:26 gargamel kernel: [  341.828743] BTRFS info (device dm-2): trying to use backup root at mount time
Jun 23 17:23:26 gargamel kernel: [  341.850180] BTRFS info (device dm-2): disk space caching is enabled
Jun 23 17:23:26 gargamel kernel: [  341.869014] BTRFS info (device dm-2): has skinny extents
Jun 23 17:23:26 gargamel kernel: [  342.206289] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0 , flush 0, corrupt 2, gen 0
Jun 23 17:26:26 gargamel kernel: [  521.571392] BTRFS info (device dm-2): enabling ssd optimizations
Jun 23 17:55:58 gargamel kernel: [ 2293.914867] perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79750
Jun 23 17:56:22 gargamel kernel: [ 2317.718406] BTRFS info (device dm-2): disk space caching is enabled
Jun 23 17:56:22 gargamel kernel: [ 2317.737277] BTRFS info (device dm-2): has skinny extents
Jun 23 17:56:22 gargamel kernel: [ 2318.069461] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0 , flush 0, corrupt 2, gen 0
Jun 23 17:59:22 gargamel kernel: [ 2498.256167] BTRFS info (device dm-2): enabling ssd optimizations
Jun 23 18:05:23 gargamel kernel: [ 2859.107057] BTRFS info (device dm-2): disk space caching is enabled
Jun 23 18:05:23 gargamel kernel: [ 2859.125883] BTRFS info (device dm-2): has skinny extents
Jun 23 18:05:24 gargamel kernel: [ 2859.448018] BTRFS info (device dm-2): bdev /dev/mapper/dshelf2 errs: wr 0, rd 0 , flush 0, corrupt 2, gen 0
Jun 23 18:08:23 gargamel kernel: [ 3039.023305] BTRFS info (device dm-2): enabling ssd optimizations
Jun 23 18:13:41 gargamel kernel: [ 3356.626037] perf: interrupt took too long (3143 > 3133), lowering kernel.perf_event_max_sample_rate to 63500
Jun 23 18:17:23 gargamel kernel: [ 3578.937225] Process accounting resumed
Jun 23 18:33:47 gargamel kernel: [ 4563.356252] JFS: nTxBlock = 8192, nTxLock = 65536
Jun 23 18:33:48 gargamel kernel: [ 4563.446715] ntfs: driver 2.1.32 [Flags: R/W MODULE].
Jun 23 18:42:20 gargamel kernel: [ 5075.995254] INFO: task sync:20253 blocked for more than 120 seconds.
Jun 23 18:42:20 gargamel kernel: [ 5076.015729]       Not tainted 4.17.2-amd64-preempt-sysrq-20180817 #1
Jun 23 18:42:20 gargamel kernel: [ 5076.036141] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 18:42:20 gargamel kernel: [ 5076.060637] sync            D    0 20253  15327 0x20020080
Jun 23 18:42:20 gargamel kernel: [ 5076.078032] Call Trace:
Jun 23 18:42:20 gargamel kernel: [ 5076.086366]  ? __schedule+0x53e/0x59b
Jun 23 18:42:20 gargamel kernel: [ 5076.098311]  schedule+0x7f/0x98
Jun 23 18:42:20 gargamel kernel: [ 5076.108665]  __rwsem_down_read_failed_common+0x127/0x1a8
Jun 23 18:42:20 gargamel kernel: [ 5076.125565]  ? sync_fs_one_sb+0x20/0x20
Jun 23 18:42:20 gargamel kernel: [ 5076.137982]  ? call_rwsem_down_read_failed+0x14/0x30
Jun 23 18:42:20 gargamel kernel: [ 5076.154081]  call_rwsem_down_read_failed+0x14/0x30
Jun 23 18:42:20 gargamel kernel: [ 5076.169429]  down_read+0x13/0x25
Jun 23 18:42:20 gargamel kernel: [ 5076.180444]  iterate_supers+0x57/0xbe
Jun 23 18:42:20 gargamel kernel: [ 5076.192619]  ksys_sync+0x40/0xa4
Jun 23 18:42:20 gargamel kernel: [ 5076.203192]  __ia32_sys_sync+0xa/0xd
Jun 23 18:42:20 gargamel kernel: [ 5076.214774]  do_fast_syscall_32+0xaf/0xf3
Jun 23 18:42:20 gargamel kernel: [ 5076.227740]  entry_SYSENTER_compat+0x7f/0x91
Jun 23 18:44:21 gargamel kernel: [ 5196.828764] INFO: task sync:20253 blocked for more than 120 seconds.
Jun 23 18:44:21 gargamel kernel: [ 5196.848724]       Not tainted 4.17.2-amd64-preempt-sysrq-20180817 #1
Jun 23 18:44:21 gargamel kernel: [ 5196.868789] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Jun 23 18:44:21 gargamel kernel: [ 5196.893615] sync            D    0 20253  15327 0x20020080

> > But back to the main point, it's sad that after so many years, the
> > repair situation is still so suboptimal, especially when it's apparently
> > pretty easy for btrfs to get damaged (through its own fault or not, hard
> > to say).
> 
> Unfortunately, yes.
> Especially the extent tree is pretty fragile and hard to repair.

So, I don't know the code, but if I may make a suggestion (which maybe
is totally wrong, if so forgive me):
I would love for a repair mode that gives me a back a fixed
filesystem. I don't really care how much data is lost (although ideally
it would give me a list of files lost), but I want a working filesystem
at the end. I can then decide if there is enough data left on it to
restore what's missing or if I'm better off starting from scratch.

Is that possible at all?

Thanks,
Marc
-- 
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
                                      .... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/                       | PGP 7F55D5F27AAF9D08

  reply	other threads:[~2018-06-29  6:07 UTC|newest]

Thread overview: 72+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-06-29  4:27 So, does btrfs check lowmem take days? weeks? Marc MERLIN
2018-06-29  5:07 ` Qu Wenruo
2018-06-29  5:28   ` Marc MERLIN
2018-06-29  5:48     ` Qu Wenruo
2018-06-29  6:06       ` Marc MERLIN [this message]
2018-06-29  6:29         ` Qu Wenruo
2018-06-29  6:59           ` Marc MERLIN
2018-06-29  7:09             ` Roman Mamedov
2018-06-29  7:22               ` Marc MERLIN
2018-06-29  7:34                 ` Roman Mamedov
2018-06-29  8:04                 ` Lionel Bouton
2018-06-29 16:24                   ` btrfs send/receive vs rsync Marc MERLIN
2018-06-30  8:18                     ` Duncan
2018-06-29  7:20             ` So, does btrfs check lowmem take days? weeks? Qu Wenruo
2018-06-29  7:28               ` Marc MERLIN
2018-06-29 17:10                 ` Marc MERLIN
2018-06-30  0:04                   ` Chris Murphy
2018-06-30  2:44                   ` Marc MERLIN
2018-06-30 14:49                     ` Qu Wenruo
2018-06-30 21:06                       ` Marc MERLIN
2018-06-29  6:02     ` Su Yue
2018-06-29  6:10       ` Marc MERLIN
2018-06-29  6:32         ` Su Yue
2018-06-29  6:43           ` Marc MERLIN
2018-07-01 23:22             ` Marc MERLIN
2018-07-02  2:02               ` Su Yue
2018-07-02  3:22                 ` Marc MERLIN
2018-07-02  6:22                   ` Su Yue
2018-07-02 14:05                     ` Marc MERLIN
2018-07-02 14:42                       ` Qu Wenruo
2018-07-02 15:18                         ` how to best segment a big block device in resizeable btrfs filesystems? Marc MERLIN
2018-07-02 16:59                           ` Austin S. Hemmelgarn
2018-07-02 17:34                             ` Marc MERLIN
2018-07-02 18:35                               ` Austin S. Hemmelgarn
2018-07-02 19:40                                 ` Marc MERLIN
2018-07-03  4:25                                 ` Andrei Borzenkov
2018-07-03  7:15                                   ` Duncan
2018-07-06  4:28                                     ` Andrei Borzenkov
2018-07-08  8:05                                       ` Duncan
2018-07-03  0:51                           ` Paul Jones
2018-07-03  4:06                             ` Marc MERLIN
2018-07-03  4:26                               ` Paul Jones
2018-07-03  5:42                                 ` Marc MERLIN
2018-07-03  1:37                           ` Qu Wenruo
2018-07-03  4:15                             ` Marc MERLIN
2018-07-03  9:55                               ` Paul Jones
2018-07-03 11:29                                 ` Qu Wenruo
2018-07-03  4:23                             ` Andrei Borzenkov
2018-07-02 15:19                         ` So, does btrfs check lowmem take days? weeks? Marc MERLIN
2018-07-02 17:08                           ` Austin S. Hemmelgarn
2018-07-02 17:33                           ` Roman Mamedov
2018-07-02 17:39                             ` Marc MERLIN
2018-07-03  0:31                         ` Chris Murphy
2018-07-03  4:22                           ` Marc MERLIN
2018-07-03  8:34                             ` Su Yue
2018-07-03 21:34                               ` Chris Murphy
2018-07-03 21:40                                 ` Marc MERLIN
2018-07-04  1:37                                   ` Su Yue
2018-07-03  8:50                             ` Qu Wenruo
2018-07-03 14:38                               ` Marc MERLIN
2018-07-03 21:46                               ` Chris Murphy
2018-07-03 22:00                                 ` Marc MERLIN
2018-07-03 22:52                                   ` Qu Wenruo
2018-06-29  5:35   ` Su Yue
2018-06-29  5:46     ` Marc MERLIN
     [not found] <94caf6c5-77e1-3da0-d026-a29edb08d410@cn.fujitsu.com>
     [not found] ` <CAKhhfD6svMo=28_UX=ZjRRmF6zNadd3H+8vVZKGX4zjqVr-giw@mail.gmail.com>
     [not found]   ` <3a83cb3c-de2b-e803-f07e-31f7de0ee25f@cn.fujitsu.com>
     [not found]     ` <b1b2d361-eb1a-f172-45d3-409abd131d2b@cn.fujitsu.com>
     [not found]       ` <20180705153023.GA30566@merlins.org>
     [not found]         ` <trinity-d028b6bd-31d9-41c0-a091-47bcb810cdc3-1530808069711@msvc-mesg-gmx023>
     [not found]           ` <20180705165049.t56dvqpz7ljjan5c@merlins.org>
     [not found]             ` <trinity-79578bdf-a849-4342-a082-f2b882f2251e-1530810500266@msvc-mesg-gmx024>
     [not found]               ` <20180706160523.kxwxjzwneseaamnt@merlins.org>
     [not found]                 ` <20180706175636.53ebp7drifiqu5b7@merlins.org>
     [not found]                   ` <20180707172114.bfc26eoahullffgg@merlins.org>
2018-07-10  1:37                     ` Su Yue
2018-07-10  1:34                       ` Qu Wenruo
2018-07-10  3:50                         ` Marc MERLIN
2018-07-10  4:55                           ` Qu Wenruo
2018-07-10 10:44                             ` Su Yue
     [not found] <f9bc21d6-fdc3-ca3a-793f-6fe574c7b8c6@cn.fujitsu.com>
     [not found] ` <20180709031054.qfg4x5yzcl4rao2k@merlins.org>
     [not found]   ` <20180709031501.iutlokfvodtkkfhe@merlins.org>
     [not found]     ` <17cc0cc1-b64d-4daa-18b5-bb2da3736ea1@cn.fujitsu.com>
     [not found]       ` <20180709034058.wjavwjdyixx6smbw@merlins.org>
     [not found]         ` <29302c14-e277-2c69-ac08-c4722c2b18aa@cn.fujitsu.com>
     [not found]           ` <20180709155306.zr3p2kolnanvkpny@merlins.org>
     [not found]             ` <trinity-4aae1c42-a85e-4c73-a30e-8b0d0be05e86-1531152875875@msvc-mesg-gmx023>
     [not found]               ` <20180709174818.wq2d4awmgasxgwad@merlins.org>
     [not found]                 ` <faba0923-8d1f-5270-ba03-ce9cc484e08a@gmx.com>
2018-07-10  4:00                   ` Marc MERLIN
     [not found]                 ` <trinity-4546309e-d603-4d29-885a-e76da594f792-1531159860064@msvc-mesg-gmx021>
     [not found]                   ` <20180709222218.GP9859@merlins.org>
     [not found]                     ` <440b7d12-3504-8b4f-5aa4-b1f39f549730@cn.fujitsu.com>
     [not found]                       ` <20180710041037.4ynitx3flubtwtvc@merlins.org>
     [not found]                         ` <58b36f04-3094-7de0-8d5e-e06e280aac00@cn.fujitsu.com>
2018-07-11  1:08                           ` Su Yue

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20180629060657.qrtcxfcy22zkstfw@merlins.org \
    --to=marc@merlins.org \
    --cc=linux-btrfs@vger.kernel.org \
    --cc=quwenruo.btrfs@gmx.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.