From: Dave Chinner <david@fromorbit.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: linux-btrfs@vger.kernel.org, xfs <linux-xfs@vger.kernel.org>
Subject: Re: Unexpected reflink/subvol snapshot behaviour
Date: Tue, 2 Feb 2021 17:02:13 +1100
Message-ID: <20210202060213.GU4662@dread.disaster.area>
In-Reply-To: <20210202021421.GA7181@magnolia>

On Mon, Feb 01, 2021 at 06:14:21PM -0800, Darrick J. Wong wrote:
> On Fri, Jan 22, 2021 at 09:20:51AM +1100, Dave Chinner wrote:
> > Hi btrfs-gurus,
> > 
> > I'm running a simple reflink/snapshot/COW scalability test at the
> > moment. It is just a loop that does "fio overwrite of 10,000 4kB
> > random direct IOs in a 4GB file; snapshot" and I want to check a
> > couple of things I'm seeing with btrfs. fio config file is appended
> > to the email.
> > 
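The fio config itself was appended to the original mail and is not
reproduced in this reply. A minimal sketch of the loop as described,
where the mount point, job parameters, and snapshot naming are
assumptions rather than the actual test setup:

  # overwrite.fio -- reconstructed sketch; engine, iodepth, and
  # directory are assumptions, not the config from the original mail
  [global]
  directory=/mnt/scratch
  size=4g
  bs=4k
  direct=1
  ioengine=libaio
  iodepth=16

  [overwrite]
  rw=randwrite
  number_ios=10000        # 10,000 random 4kB direct IO overwrites

  # driver loop: overwrite, then snapshot (or clone)
  for i in $(seq 1 1000); do
      fio overwrite.fio
      # subvol snapshot variant (assumes /mnt/scratch is a subvolume):
      btrfs subvolume snapshot /mnt/scratch /mnt/scratch/snap.$i
      # ... or the file clone (reflink) variant:
      # cp --reflink=always /mnt/scratch/overwrite.0.0 /mnt/scratch/clone.$i
  done
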
> > Firstly, what is the expected "space amplification" of such a
> > workload over 1000 iterations on btrfs? This will write 40GB of user
> > data, and I'm seeing btrfs consume ~220GB of space for the workload
> > regardless of whether I use subvol snapshot or file clones
> > (reflink).  That's a space amplification of ~5.5x (a lot!) so I'm
> > wondering if this is expected or whether there's something else
> > going on. XFS amplification for 1000 iterations using reflink is
> > only 1.4x, so 5.5x seems somewhat excessive to me.
> > 
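For context, a sketch of the kind of accounting those amplification
numbers come from; the mount point is assumed:

  # user data written after N cycles: N * 10000 * 4kB (40GB at N=1000)
  # space consumed, btrfs (filesystem-level accounting):
  btrfs filesystem usage /mnt/scratch
  # space consumed, XFS:
  df -h /mnt/scratch
  # amplification = space consumed / user data written,
  # e.g. ~220GB / 40GB ~= 5.5x
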
> > On a similar note, the IO bandwidth consumed by btrfs is way out of
> > proportion with the amount of user data being written. I'm seeing
> > multiple GBs being written by btrfs on every iteration - easily
> > exceeding 5GB of writes per cycle in the later iterations of the
> > test. Given that only 40MB of user data is being written per cycle,
> > here's a write amplification factor of well over 100x occurring
> > here. In comparison, XFS is writing roughly consistently at 80MB/s
> > to disk over the course of the entire workload, largely because of
> > journal traffic for the transactions run during COW and clone
> > operations.  Is such a huge amount of IO expected for btrfs in
> > this situation?
> 
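One way to observe the per-cycle write amplification directly is to
read the device's sectors-written counter around each cycle; a sketch,
with the block device name assumed:

  dev=sdb                       # assumed block device under test
  sync                          # flush pending writeback first
  before=$(awk -v d="$dev" '$3 == d { print $10 }' /proc/diskstats)
  fio overwrite.fio
  sync
  after=$(awk -v d="$dev" '$3 == d { print $10 }' /proc/diskstats)
  # field 10 of /proc/diskstats counts 512-byte sectors written
  echo "MiB written this cycle: $(( (after - before) / 2048 ))"
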
> <just gonna snip this part>
> 
> > FYI, I've compared btrfs reflink to XFS reflink, too, and XFS fio
> > performance stays largely consistent across all 1000 iterations at
> > around 13-14k +/-2k IOPS. The reflink time also scales linearly with
> > the number of extents in the source file and levels off at about
> > 10-11s per cycle as the extent count in the source file levels off
> > at ~850,000 extents. XFS completes the 1000 iterations of
> > write/clone in about 4 hours, while btrfs completes the same part of
> > workload in about 9 hours.
> 
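A sketch of how the extent count and per-cycle clone time can be
observed; the paths are assumptions, using fio's default
<job>.<jobnum>.<filenum> data file name:

  f=/mnt/scratch/overwrite.0.0
  # extent count in the source file:
  filefrag "$f"
  # or, on XFS (line count includes a header, so it's approximate):
  xfs_bmap "$f" | wc -l
  # time the clone step for one cycle:
  time cp --reflink=always "$f" /mnt/scratch/clone.$i
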
> Just out of curiosity, do any of the patches in [1] improve those
> numbers for xfs?  As you noted a long time ago, the transaction
> reservations are kind of huge, so I fixed those and shook out a few
> other warts while I was at it.

I'll give it a spin, but my initial reaction is "I don't think so".
The workload does not have the concurrency necessary to be
sensitive to log reservation space running out...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

Thread overview: 16+ messages
2021-01-21 22:20 Unexpected reflink/subvol snapshot behaviour Dave Chinner
2021-01-23  8:42 ` Qu Wenruo
2021-01-23  8:51   ` Qu Wenruo
2021-01-23 10:39   ` Roman Mamedov
2021-01-23 10:58     ` Qu Wenruo
2021-01-24 13:08   ` Filipe Manana
2021-01-24 22:36   ` Dave Chinner
2021-01-25  1:09     ` Qu Wenruo
2021-01-29 23:25     ` Zygo Blaxell
2021-02-02  0:13       ` Dave Chinner
2021-02-12  3:04         ` Zygo Blaxell
2021-01-24  0:19 ` Zygo Blaxell
2021-01-24 21:43   ` Dave Chinner
2021-01-30  1:03     ` Zygo Blaxell
2021-02-02  2:14 ` Darrick J. Wong
2021-02-02  6:02   ` Dave Chinner [this message]
