From: John Ettedgui <john.ettedgui@gmail.com>
To: Qu Wenruo <quwenruo.btrfs@gmx.com>
Cc: Austin S Hemmelgarn <ahferroin7@gmail.com>,
	btrfs <linux-btrfs@vger.kernel.org>
Subject: Re: mount btrfs takes 30 minutes, btrfs check runs out of memory
Date: Tue, 13 Feb 2018 04:06:43 -0800
Message-ID: <CAJ3TwYTAhOc1eRUC_FoUFjpC=Ys_DRbHBp9NW6EdVSXArqiMww@mail.gmail.com>
In-Reply-To: <a27b619e-6e39-20e6-413e-c9d5c7a90b6f@gmx.com>

On Tue, Feb 13, 2018 at 3:40 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>
>
> On 2018-02-13 19:25, John Ettedgui wrote:
>> On Tue, Feb 13, 2018 at 3:04 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>>
>>>
>>>
>>> The problem is not about how much space it takes, but how many extents
>>> are here in the filesystem.
>>>
>>> For new fs filled with normal data, I'm pretty sure data extents will be
>>> as large as its maximum size (256M), causing very little or even no
>>> pressure to block group search.
>>>
>> What do you mean by "new fs",
>
> I mean the 4TB partition on that 5400rpm HDD.
>
>> was there any change that would improve
>> the behavior if I were to recreate the FS?
>
> If you back up your fs, recreate a new, empty btrfs on your
> original SSD, and then copy all the data back, I believe it would be
> much faster to mount.
>
Alright, I'll have to wait on getting some more drives for that but I
look forward to trying that.

>> Last time we talked I believe max extent was 128M for non-compressed
>> files, so maybe there's been some good change.
>
> My fault, 128M is correct.
>
>>> And since I went to SUSE, some mail/info is lost during the procedure.
>> I still have all mails, if you want it. No dump left though.
>>>
>>> Despite that, I have several more assumptions about this problem:
>>>
>>> 1) Metadata usage bumped by inline files
>> What are inline files? Should I view this as inline in C, in that the
>> small files are stored in the tree directly?
>
> Exactly.
>
>>>    If there are a lot of small files (<2K as default),
>> Of the slow to mount partitions:
>> 2 partitions have less than a dozen files smaller than 2K.
>> 1 has about 5 thousand and the last one 15 thousand.
>> Are the latter considered a lot?
>
> If using the default 16K nodesize, 8 small files take one leaf.
> And 15K small files means about 2K tree extents.
>
> Not that much in my opinion, can't even fill half of a metadata chunk.
>
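The estimate above can be checked with a few lines of arithmetic. This is only a sketch: the 16K nodesize and 8 files per leaf come from the message, while the 256 MiB metadata chunk size and the function name are assumptions made for illustration.

```python
import math

NODESIZE = 16 * 1024                 # default nodesize quoted in the message
FILES_PER_LEAF = 8                   # inline small files per leaf, per the message
METADATA_CHUNK = 256 * 1024 * 1024   # ASSUMPTION: 256 MiB metadata chunk


def small_file_metadata(small_files: int) -> dict:
    """Estimate leaves and metadata bytes used by inlined small files."""
    leaves = math.ceil(small_files / FILES_PER_LEAF)
    used = leaves * NODESIZE
    return {
        "leaves": leaves,
        "bytes": used,
        "chunk_fraction": used / METADATA_CHUNK,
    }


if __name__ == "__main__":
    est = small_file_metadata(15_000)
    print(est["leaves"], "leaves,", est["bytes"], "bytes,",
          f"{est['chunk_fraction']:.1%} of one assumed metadata chunk")
```

For 15,000 small files this gives 1875 leaves (the "about 2K tree extents" above) and roughly 29 MiB, well under half of the assumed chunk size.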
>>> and your metadata
>>>    usage is quite high (generally speaking, the meta:data ratio should be
>>>    way below 1:8), that may be the cause.
>>>
>> The ratio is about 1:900 on average so that should be ok I guess.
>
> Yep, that should be fine.
> So metadata is not to blame.
>
> Then it is purely fragmented data extents.
>
>>>    If so, try mounting the fs with the "max_inline=0" mount option and then
>>>    try to rewrite such small files.
>>>
>> Should I try that?
>
> No need, it won't cause much difference.

Alright!

>>> 2) SSD write amplification along with dynamic remapping
>>>    To be honest, I'm not really buying this idea, since mount doesn't
>>>    involve any writes.
>>>    But running fstrim won't harm anyway.
>>>
>> Oh I am not complaining about slow SSDs mounting. I was just amazed
>> that a partition on a slow HDD mounted faster.
>> Without any specific work, my SSD partitions tend to mount in around 1 sec or so.
>> Of course I'd be happy to worry about them once all the partitions on
>> HDDs mount in a handful of ms :)
>>
>>> 3) Rewrite the existing files (extreme defrag)
>>>    In fact, defrag doesn't work well if there are subvolumes/snapshots
>> I have no subvolume or snapshot so that's not a problem.
>>>    /reflink involved.
>>>    The most stupid and mindless way is to write a small script that finds
>>>    all regular files, reads them out and rewrites them back.
>>>
>> That's fairly straightforward to do, though it should be quite slow so
>> I'd hope not to have to do that too often.
>
> Then it could be tried on the most frequently updated files.

That's an interesting idea.
More than 3/4 of the data is just storage, so that should be very ok.
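
A minimal sketch of the "read out and rewrite" script discussed above might look like the following. It is an illustration, not a tested tool from this thread: the function names are made up, it assumes no snapshots or reflinks (as stated earlier), it breaks hard links, and it does not preserve file ownership. It also needs enough free space for a full copy of the largest file.

```python
import os
import shutil
import tempfile
from pathlib import Path


def rewrite_file(path: Path) -> None:
    """Rewrite one file by copying its data to a new file and renaming it over the original."""
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=".rewrite-")
    try:
        with os.fdopen(fd, "wb") as dst, open(path, "rb") as src:
            # copyfileobj is a plain read/write loop, so the data is
            # really written out again rather than shared with the source.
            shutil.copyfileobj(src, dst, length=16 * 1024 * 1024)
            dst.flush()
            os.fsync(dst.fileno())
        shutil.copystat(path, tmp_name)   # mode and timestamps, not ownership
        os.replace(tmp_name, path)        # atomic rename within the same directory
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def rewrite_tree(root) -> int:
    """Rewrite every regular file under root; returns the number of files rewritten."""
    count = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            p = Path(dirpath) / name
            if p.is_symlink() or not p.is_file():
                continue
            rewrite_file(p)
            count += 1
    return count
```

To limit this to the most frequently updated files, `rewrite_tree` could be pointed at the specific directories that hold them instead of the whole filesystem.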

>
> And since you don't use snapshots, locate such files and then "chattr +C"
> would make them nodatacow, reducing later fragments.

I don't understand, why would that reduce later fragments?

>
>>>    This should act much better than traditional defrag, although it's
>>>    time-consuming and makes snapshot completely meaningless.
>>>    (and since you're already hitting ENOSPC, I don't think the idea is
>>>     really working for you)
>>>
>>> And since you're already hitting ENOSPC, either it's caused by
>>> unbalanced meta/data usage, or it's really going to hit the limit, I would
>>> recommend to enlarge the fs or delete some files to see if it helps.
>>>
>> Yup, I usually either slowly ramp up the {d,m}usage to pass it, or
>> when that does not work I free some space, then balance will finish.
>> Or did you mean to free some space to see about mount speed?
>
> Kind of, just do such freeing in advance, and try to make btrfs always
> have unallocated space in case.
>
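
The "slowly ramp up the {d,m}usage" routine mentioned above can be sketched as a small helper that builds one `btrfs balance start` command per usage threshold. The step values, mount point and function names here are assumptions for illustration only; running the commands requires root and a mounted btrfs filesystem.

```python
import subprocess


def balance_commands(mountpoint: str, steps=(0, 5, 10, 25, 50)) -> list:
    """Build one balance command per usage threshold, lowest first."""
    return [
        ["btrfs", "balance", "start",
         f"-dusage={pct}", f"-musage={pct}", mountpoint]
        for pct in steps
    ]


def run_balance_ramp(mountpoint: str, steps=(0, 5, 10, 25, 50)) -> None:
    """Run the commands in order, stopping at the first failure (e.g. ENOSPC)."""
    for cmd in balance_commands(mountpoint, steps):
        print("running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Starting at a low threshold relocates only the emptiest block groups first, which is the cheapest way to get back some unallocated space before the larger steps run.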

I actually have very little free space on those partitions, usually
under 90GB, maybe that's part of my problem.

> And finally, use latest kernel if possible.
> IIRC old kernels don't have automatic removal of empty block groups, so
> users need to balance manually to free some space.
>
> Thanks,
> Qu
>

I am on 4.15 so no problem there.

So manual defrag and new FS to try.

Thank you!
