linux-btrfs.vger.kernel.org archive mirror
From: Wang Yugui <wangyugui@e16-tech.com>
To: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 00/10] btrfs: make lseek and fiemap much more efficient
Date: Fri, 02 Sep 2022 19:41:07 +0800
Message-ID: <20220902194107.8878.409509F4@e16-tech.com>
In-Reply-To: <CAL3q7H79BWAJVk2ecWqa4mbW0+WFJrEX-=a+Zg9FOc_UcAKjLg@mail.gmail.com>

Hi,

> On Fri, Sep 2, 2022 at 2:09 AM Wang Yugui <wangyugui@e16-tech.com> wrote:
> >
> > Hi,
> >
> > > From: Filipe Manana <fdmanana@suse.com>
> > >
> > > We often get reports of fiemap and hole/data seeking (lseek) being too slow
> > > on btrfs, or even unusable in some cases due to being extremely slow.
> > >
> > > Some recent reports for fiemap:
> > >
> > >     https://lore.kernel.org/linux-btrfs/21dd32c6-f1f9-f44a-466a-e18fdc6788a7@virtuozzo.com/
> > >     https://lore.kernel.org/linux-btrfs/Ysace25wh5BbLd5f@atmark-techno.com/
> > >
> > > For lseek (LSF/MM from 2017):
> > >
> > >    https://lwn.net/Articles/718805/
> > >
> > > Basically both are slow due to very high algorithmic complexity which
> > > scales badly with the number of extents in a file and the height of the
> > > subvolume and extent b+trees.
> > >
> > > Using Pavel's test case (first Link tag for fiemap), which uses files with
> > > many 4K extents and holes before and after each extent (kind of a worst
> > > case scenario), the speedup is of several orders of magnitude (for the 1G
> > > file, from ~225 seconds down to ~0.1 seconds).
> > >
> > > Finally the new algorithm for fiemap also ends up solving a bug with the
> > > current algorithm. This happens because we are currently relying on extent
> > > maps to report extents, which can be merged, and this may cause us to
> > > report 2 different extents as a single one that is not shared but one of
> > > them is shared (or the other way around). More details on this in patches
> > > 9/10 and 10/10.
> > >
> > > Patches 1/10 and 2/10 are for lseek, introducing some code that will later
> > > be used by fiemap too (patch 10/10). More details in the changelogs.
> > >
> > > There are a few more things that can be done to speed up fiemap and lseek,
> > > but I'll leave those other optimizations I have in mind for some other time.
> > >
> > > Filipe Manana (10):
> > >   btrfs: allow hole and data seeking to be interruptible
> > >   btrfs: make hole and data seeking a lot more efficient
> > >   btrfs: remove check for impossible block start for an extent map at fiemap
> > >   btrfs: remove zero length check when entering fiemap
> > >   btrfs: properly flush delalloc when entering fiemap
> > >   btrfs: allow fiemap to be interruptible
> > >   btrfs: rename btrfs_check_shared() to a more descriptive name
> > >   btrfs: speedup checking for extent sharedness during fiemap
> > >   btrfs: skip unnecessary extent buffer sharedness checks during fiemap
> > >   btrfs: make fiemap more efficient and accurate reporting extent sharedness
> > >
> > >  fs/btrfs/backref.c     | 153 ++++++++-
> > >  fs/btrfs/backref.h     |  20 +-
> > >  fs/btrfs/ctree.h       |  22 +-
> > >  fs/btrfs/extent-tree.c |  10 +-
> > >  fs/btrfs/extent_io.c   | 703 ++++++++++++++++++++++++++++-------------
> > >  fs/btrfs/file.c        | 439 +++++++++++++++++++++++--
> > >  fs/btrfs/inode.c       | 146 ++-------
> > >  7 files changed, 1111 insertions(+), 382 deletions(-)
> >
> >
> > An infinite loop happens when the 10 patches are applied to 6.0-rc3.
> 
> Nope, it's not an infinite loop, and it happens as well before the patchset.
> The reason is that the files created by the test are very sparse and
> with small extents.
> It's full of 4K extents surrounded by 8K holes.
> 
> So anyone doing hole seeking advances 8K on every lseek call.
> If you strace the cp process, with
> 
> strace -p <cp pid>
> 
> You'll see something like this filling your terminal:
> 
> (...)
> lseek(3, 18808832, SEEK_SET)            = 18808832
> write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
> read(3, "a\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096) = 4096
...
> lseek(3, 18857984, SEEK_SET)            = 18857984
> (...)
> 
> It takes a long time, but it finishes. If you look closely, the difference
> between each return value is exactly 8K.
> 
> That happens both before and after the patchset.
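
For anyone who wants to see the access pattern without strace, here is a
minimal sketch in Python. It builds a file with the layout described above
(4K of data followed by an 8K hole, repeated) and walks it with
SEEK_DATA/SEEK_HOLE, counting the lseek calls. The function names
make_sparse_file and data_segments are made up for this example, and the
loop is only an approximation of what a sparse-aware copy does, not the
actual cp implementation. It assumes Linux and a filesystem that reports
holes.

```python
import errno
import os

EXTENT = 4096  # size of each data extent, as described in the thread
HOLE = 8192    # size of the hole following each extent, as described in the thread


def make_sparse_file(path, extents):
    """Write `extents` 4K data blocks, each one followed by an 8K hole."""
    with open(path, "wb") as f:
        for i in range(extents):
            f.seek(i * (EXTENT + HOLE))
            f.write(b"a" + b"\0" * (EXTENT - 1))
        # Extend the file so the last extent is followed by a hole too.
        f.truncate(extents * (EXTENT + HOLE))


def data_segments(path):
    """Return (segments, calls): the [start, end) data ranges and the lseek count."""
    segments = []
    calls = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            calls += 1
            try:
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:  # only a hole remains up to EOF
                    break
                raise
            calls += 1
            end = os.lseek(fd, start, os.SEEK_HOLE)
            segments.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return segments, calls
```

Calling make_sparse_file("/mnt/test/sketch", 1000) and then
data_segments("/mnt/test/sketch") (the path is only an example) should show
that the number of lseek calls grows linearly with the number of extents,
which is why a copy of such a file takes so long even though it finishes.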

Yes. It takes a long time, but it finishes.
Thanks for the 'strace -p <cp pid>' tip.

More tests show that the performance depends on whether the data is
cached.

When the data is not cached (echo 3 >/proc/sys/vm/drop_caches),
'/bin/cp /mnt/test/file1 /dev/null' takes 97.37s.

When the data is cached (running /bin/cp again),
'/bin/cp /mnt/test/file1 /dev/null' takes 2056.53s.

/mnt/test/file1 is a 512M file created by that reproducer.

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2022/09/02


