linux-xfs.vger.kernel.org archive mirror
From: Chris Dunlop <chris@onthe.net.au>
To: Dave Chinner <david@fromorbit.com>
Cc: linux-xfs@vger.kernel.org
Subject: Re: Highly reflinked and fragmented considered harmful?
Date: Tue, 10 May 2022 12:55:41 +1000
Message-ID: <20220510025541.GA192172@onthe.net.au>
In-Reply-To: <20220509230918.GP1098723@dread.disaster.area>

Hi Dave,

On Tue, May 10, 2022 at 09:09:18AM +1000, Dave Chinner wrote:
> On Mon, May 09, 2022 at 12:46:59PM +1000, Chris Dunlop wrote:
>> Is it to be expected that removing 29TB of highly reflinked and fragmented
>> data could take days, the entire time blocking other tasks like "rm" and
>> "df" on the same filesystem?
...
> At some point, you have to pay the price of creating billions of
> random fine-grained cross references in tens of TBs of data spread
> across weeks and months of production. You don't notice the scale of
> the cross-reference because it's taken weeks and months of normal
> operations to get there. It's only when you finally have to perform
> an operation that needs to iterate all those references that the
> scale suddenly becomes apparent. XFS scales to really large numbers
> without significant degradation, so people don't notice things like
> object counts or cross references until something like this
> happens.
>
> I don't think there's much we can do at the filesystem level to help
> you at this point - the inode output in the transaction dump above
> indicates that you haven't been using extent size hints to limit
> fragmentation or extent share/COW sizes, so the damage is already
> present and we can't really do anything to fix that up.

Thanks for taking the time to provide a detailed and informative
exposition. It certainly helps me understand what I'm asking of the fs,
the areas that deserve more attention, and how to approach analyzing the
situation.

At this point I'm about 3 days from finishing copying the data (from a
snapshot of the troubled fs mounted with 'norecovery') over to a brand new
fs. Unfortunately the new fs also has rmapbt=1, so I'll go through all the
copying again (under more controlled circumstances) to get onto a rmapbt=0
fs (losing the ability to do online repairs whenever that arrives -
hopefully that won't come back to haunt me).

Out of interest:

>> - with a reboot/remount, does the log replay continue from where it left
>> off, or start again?

Sorry, if you provided an answer to this, I didn't understand it.

Basically the question is: if a recovery on mount were going to take 10
hours, but the box rebooted and the fs was mounted again at the 8 hour
mark, would the recovery this time take 2 hours or once again 10 hours?
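
To make the two possibilities concrete, here is a minimal sketch of the
arithmetic. The function name and the "resumes" flag are hypothetical and
only model the two outcomes being asked about; the sketch says nothing
about which one XFS log recovery actually follows.

```python
def remaining_recovery_hours(total_hours, elapsed_hours, resumes):
    """Hours a second mount-time recovery would take after an interruption.

    resumes=True  models recovery continuing from where it left off.
    resumes=False models recovery starting again from the beginning.
    Both are hypothetical models, not a statement of what XFS does.
    """
    if not 0 <= elapsed_hours <= total_hours:
        raise ValueError("elapsed_hours must be between 0 and total_hours")
    return total_hours - elapsed_hours if resumes else total_hours


# The scenario from the question: a 10 hour recovery interrupted at 8 hours.
print(remaining_recovery_hours(10, 8, resumes=True))   # 2
print(remaining_recovery_hours(10, 8, resumes=False))  # 10
```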

Cheers,

Chris

