From: ethanwu <ethanwu@synology.com>
To: linux-btrfs@vger.kernel.org
Cc: ethanwu <ethanwu@synology.com>
Subject: [PATCH 0/4] btrfs: improve normal backref walking
Date: Fri, 7 Feb 2020 17:38:14 +0800
Message-ID: <20200207093818.23710-1-ethanwu@synology.com>
Btrfs has two types of data backref: shared data backrefs
(BTRFS_SHARED_DATA_REF_KEY), which record the parent block directly, and
normal data backrefs (BTRFS_EXTENT_DATA_REF_KEY), which record only the
root, objectid and offset.

For a BTRFS_EXTENT_DATA_REF_KEY backref we don't have the exact block
number, so we need to call resolve_indirect_refs. It uses
btrfs_search_slot to locate the leaf block, and then walks through the
leaves to search for the EXTENT_DATA items whose disk bytenr matches the
extent item (add_all_parents).

When resolving indirect refs, we may pick up entries that don't belong
to the backref entry we are currently searching for. For that reason,
when searching for a backref entry we always use the total refs of the
EXTENT_ITEM rather than the individual count of that entry.
For example:

  item 11 key (40831553536 EXTENT_ITEM 4194304) itemoff 15460 itemsize
    extent refs 24 gen 7302 flags DATA
    shared data backref parent 394985472 count 10 #1
    extent data backref root 257 objectid 260 offset 1048576 count 3 #2
    extent data backref root 256 objectid 260 offset 65536 count 6 #3
    extent data backref root 257 objectid 260 offset 65536 count 5 #4

When searching for backref entry #4, we use total_refs = 24 as the loop
ending condition instead of that entry's own count of 5. This is a very
loose bound and it is not accurate: in some cases we never find that
many refs under the specific root, so the loop keeps going until we
reach the end of that inode.
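
The cost of the loose bound can be illustrated with a small model. This
is only a sketch, not the kernel code: `walk_items`, the flat list of
items and the 1000 unrelated items are hypothetical, chosen to mimic a
walk that stops once `limit` matching items have been counted.

```python
def walk_items(items, wanted_bytenr, limit):
    """Walk file extent items in order; stop once `limit` matches are counted.

    items: disk bytenr values, one per EXTENT_DATA item of the inode
           (a hypothetical flattening of the leaves).
    Returns (matches_found, items_visited).
    """
    found = 0
    visited = 0
    for bytenr in items:
        if found >= limit:
            break
        visited += 1
        if bytenr == wanted_bytenr:
            found += 1
    return found, visited


# Hypothetical inode: 5 items reference the extent, then 1000 unrelated items.
EXTENT = 40831553536
items = [EXTENT] * 5 + [0] * 1000

# Bound of 5 (the backref's own count): stop right after the 5th match.
tight = walk_items(items, EXTENT, limit=5)
# Bound of 24 (total refs of the EXTENT_ITEM): never reached, so the walk
# only ends when the items run out.
loose = walk_items(items, EXTENT, limit=24)
print(tight, loose)  # (5, 5) (5, 1005)
```

With the tight bound the walk touches 5 items; with the loose bound it
touches every item of the inode.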

The first 3 patches handle 3 different types of refs we might encounter
during the walk. These refs do not belong to the normal backref we are
searching for, and hence need to be skipped.

The last patch changes total_refs to the correct count, so that we can
end the loop as soon as we find all the refs we want.
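
Taken together, the four patches can be pictured as three filters plus a
tighter stop condition. The sketch below is a model under assumptions,
not the code in fs/btrfs/backref.c: `Leaf`, `FileExtentItem`,
`count_refs` and the `shared`/`owner_root` fields are hypothetical
stand-ins, and how the kernel actually detects each case is not shown in
this cover letter. It also assumes that a file extent item maps to a
backref offset of (file offset - extent offset).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FileExtentItem:
    file_offset: int    # key offset of the EXTENT_DATA item
    extent_offset: int  # offset into the on-disk extent
    disk_bytenr: int    # start of the on-disk extent


@dataclass
class Leaf:
    owner_root: int     # root that owns this leaf (hypothetical field)
    shared: bool        # covered by a shared data backref (hypothetical field)
    items: List[FileExtentItem] = field(default_factory=list)


def count_refs(leaves, root, backref_offset, disk_bytenr, count):
    """Count the items that belong to one normal backref, stopping at `count`."""
    found = 0
    for leaf in leaves:
        if leaf.shared:                 # patch 2: skip shared blocks
            continue
        if leaf.owner_root != root:     # patch 3: skip leaves of other roots
            continue
        for item in leaf.items:
            if item.disk_bytenr != disk_bytenr:
                continue
            # patch 1: the item must map to this backref's offset
            if item.file_offset - item.extent_offset != backref_offset:
                continue
            found += 1
            if found == count:          # patch 4: stop at this backref's count
                return found
    return found


EXTENT = 40831553536


def refs(offset, n):
    """Build n hypothetical items that all map to backref offset `offset`."""
    return [FileExtentItem(offset + k * 4096, k * 4096, EXTENT) for k in range(n)]


# Made-up leaves loosely mirroring backrefs #1 to #4 of the example item.
leaves = [
    Leaf(owner_root=257, shared=True, items=refs(65536, 10)),    # like #1
    Leaf(owner_root=256, shared=False, items=refs(65536, 6)),    # like #3
    Leaf(owner_root=257, shared=False,
         items=refs(1048576, 3) + refs(65536, 5)),               # like #2, #4
]

print(count_refs(leaves, 257, 65536, EXTENT, 5))    # backref #4
print(count_refs(leaves, 257, 1048576, EXTENT, 3))  # backref #2
```

Because only items of the wanted backref are counted, the per-backref
count becomes a valid bound and the walk can stop as soon as it is
reached.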

btrfs send uses backrefs to find possible clone sources. The following
is a simple test to compare the results with and without this patch set:

  btrfs subvolume create /volume1/sub1
  for i in `seq 1 163840`; do
      dd if=/dev/zero of=/volume1/sub1/file bs=64K count=1 \
         seek=$((i-1)) conv=notrunc oflag=direct 2>/dev/null
  done
  btrfs subvolume snapshot /volume1/sub1 /volume1/sub2
  for i in `seq 1 163840`; do
      dd if=/dev/zero of=/volume1/sub1/file bs=4K count=1 \
         seek=$(((i-1)*16+10)) conv=notrunc oflag=direct 2>/dev/null
  done
  btrfs subvolume snapshot -r /volume1/sub1 /volume1/snap1
  time btrfs send /volume1/snap1 | btrfs receive /volume2
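
The layout this test produces follows from the dd arguments. The sketch
below only redoes that arithmetic; the helper names are hypothetical,
and the idea that each direct 64K write ends up as its own 64K extent is
an assumption, not something stated above.

```python
KIB = 1024
N = 163840  # number of iterations in both loops


def first_pass_offset(i):
    """Byte offset of the i-th 64K write (bs=64K, seek=i-1)."""
    return (i - 1) * 64 * KIB


def second_pass_offset(i):
    """Byte offset of the i-th 4K overwrite (bs=4K, seek=(i-1)*16+10)."""
    return ((i - 1) * 16 + 10) * 4 * KIB


file_size = N * 64 * KIB  # bytes written by the first loop
offset_in_block = second_pass_offset(1) - first_pass_offset(1)

print(file_size // 1024 ** 3, "GiB")                    # 10 GiB
print(offset_in_block // KIB, "KiB into each 64K block")  # 40 KiB
```

So the first loop writes a 10 GiB file in 64K pieces, and after the
snapshot the second loop overwrites 4K at offset 40K inside every one of
those pieces in sub1.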

Without this patch set:

  real    69m48.124s
  user    0m50.199s
  sys     70m15.600s

With this patch set:

  real    1m59.683s
  user    0m35.421s
  sys     2m42.684s
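
For reference, the ratios between the two runs can be computed directly
from the timings above. `to_seconds` is a hypothetical helper written
for this illustration.

```python
import re


def to_seconds(t):
    """Convert a `time`-style string such as '69m48.124s' to seconds."""
    m = re.fullmatch(r"(\d+)m([\d.]+)s", t)
    if m is None:
        raise ValueError(f"unexpected time format: {t!r}")
    return int(m.group(1)) * 60 + float(m.group(2))


real_speedup = to_seconds("69m48.124s") / to_seconds("1m59.683s")
sys_speedup = to_seconds("70m15.600s") / to_seconds("2m42.684s")
print(f"real: {real_speedup:.1f}x, sys: {sys_speedup:.1f}x")
# real: 35.0x, sys: 25.9x
```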

ethanwu (4):
  btrfs: backref, only collect file extent items matching backref offset
  btrfs: backref, not adding refs from shared block when resolving
    normal backref
  btrfs: backref, only search backref entries from leaves of the same
    root
  btrfs: backref, use correct count to resolve normal data refs

 fs/btrfs/backref.c | 156 +++++++++++++++++++++++++++++----------------
 1 file changed, 100 insertions(+), 56 deletions(-)

--
2.17.1