* [PATCH 0/2] Drop some mis-uses of READA @ 2020-03-13 21:09 Josef Bacik 2020-03-13 21:09 ` [PATCH 1/2] btrfs: do not use READA for running delayed refs Josef Bacik ` (3 more replies) 0 siblings, 4 replies; 8+ messages in thread From: Josef Bacik @ 2020-03-13 21:09 UTC (permalink / raw) To: linux-btrfs, kernel-team In debugging Zygo's huge commit delays I noticed we were burning a bunch of time doing READA in cases where we don't need to. The way READA works in btrfs is we'll load up adjacent nodes and leaves as we walk down. This is useful for operations where we're going to be reading sequentially across the tree. But for delayed refs we're looking up one bytenr, and then another one which could be elsewhere in the tree. With large enough extent trees this results in a lot of unneeded latency. The same applies to build_backref_tree, but that's even worse because we're looking up backrefs, which are essentially randomly spread out across the extent root. Thanks, Josef ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 1/2] btrfs: do not use READA for running delayed refs 2020-03-13 21:09 [PATCH 0/2] Drop some mis-uses of READA Josef Bacik @ 2020-03-13 21:09 ` Josef Bacik 2020-03-14 0:18 ` Qu Wenruo 2020-03-13 21:09 ` [PATCH 2/2] btrfs: do not READA in build_backref_tree Josef Bacik ` (2 subsequent siblings) 3 siblings, 1 reply; 8+ messages in thread From: Josef Bacik @ 2020-03-13 21:09 UTC (permalink / raw) To: linux-btrfs, kernel-team READA will generate a lot of extra reads for adjacent nodes, but when running delayed refs we have no idea if the next ref is going to be adjacent or not, so this potentially just generates a lot of extra IO. To make matters worse each ref is truly just looking for one item, it doesn't generally search forward, so we simply don't need it here. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- fs/btrfs/extent-tree.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index a24ef1cef9fa..8e5b49baad98 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -1469,7 +1469,6 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans, if (!path) return -ENOMEM; - path->reada = READA_FORWARD; path->leave_spinning = 1; /* this will setup the path even if it fails to insert the back ref */ ret = insert_inline_extent_backref(trans, path, bytenr, num_bytes, @@ -1494,7 +1493,6 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans, btrfs_mark_buffer_dirty(leaf); btrfs_release_path(path); - path->reada = READA_FORWARD; path->leave_spinning = 1; /* now insert the actual backref */ ret = insert_extent_backref(trans, path, bytenr, parent, root_objectid, @@ -1604,7 +1602,6 @@ static int run_delayed_extent_op(struct btrfs_trans_handle *trans, } again: - path->reada = READA_FORWARD; path->leave_spinning = 1; ret = btrfs_search_slot(trans, fs_info->extent_root, &key, path, 0, 1); if (ret < 0) { @@ -2999,7 +2996,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans, if (!path) return -ENOMEM; - path->reada = READA_FORWARD; path->leave_spinning = 1; is_data = owner_objectid >= BTRFS_FIRST_FREE_OBJECTID; -- 2.24.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 1/2] btrfs: do not use READA for running delayed refs 2020-03-13 21:09 ` [PATCH 1/2] btrfs: do not use READA for running delayed refs Josef Bacik @ 2020-03-14 0:18 ` Qu Wenruo 0 siblings, 0 replies; 8+ messages in thread From: Qu Wenruo @ 2020-03-14 0:18 UTC (permalink / raw) To: Josef Bacik, linux-btrfs, kernel-team [-- Attachment #1.1: Type: text/plain, Size: 2019 bytes --] On 2020/3/14 上午5:09, Josef Bacik wrote: > READA will generate a lot of extra reads for adjacent nodes, but when > running delayed refs we have no idea if the next ref is going to be > adjacent or not, so this potentially just generates a lot of extra IO. > To make matters worse each ref is truly just looking for one item, it > doesn't generally search forward, so we simply don't need it here. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks, Qu > --- > fs/btrfs/extent-tree.c | 4 ---- > 1 file changed, 4 deletions(-) > > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c > index a24ef1cef9fa..8e5b49baad98 100644 > --- a/fs/btrfs/extent-tree.c > +++ b/fs/btrfs/extent-tree.c > @@ -1469,7 +1469,6 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans, > if (!path) > return -ENOMEM; > > - path->reada = READA_FORWARD; > path->leave_spinning = 1; > /* this will setup the path even if it fails to insert the back ref */ > ret = insert_inline_extent_backref(trans, path, bytenr, num_bytes, > @@ -1494,7 +1493,6 @@ static int __btrfs_inc_extent_ref(struct btrfs_trans_handle *trans, > btrfs_mark_buffer_dirty(leaf); > btrfs_release_path(path); > > - path->reada = READA_FORWARD; > path->leave_spinning = 1; > /* now insert the actual backref */ > ret = insert_extent_backref(trans, path, bytenr, parent, root_objectid, > @@ -1604,7 +1602,6 @@ static int run_delayed_extent_op(struct btrfs_trans_handle *trans, > } > > again: > - path->reada = READA_FORWARD; > path->leave_spinning = 1; > ret = btrfs_search_slot(trans, fs_info->extent_root, &key, path, 0, 1); > if (ret < 0) { > @@ -2999,7 +2996,6 @@ static int __btrfs_free_extent(struct btrfs_trans_handle *trans, > if (!path) > return -ENOMEM; > > - path->reada = READA_FORWARD; > path->leave_spinning = 1; > > is_data = owner_objectid >= BTRFS_FIRST_FREE_OBJECTID; > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH 2/2] btrfs: do not READA in build_backref_tree 2020-03-13 21:09 [PATCH 0/2] Drop some mis-uses of READA Josef Bacik 2020-03-13 21:09 ` [PATCH 1/2] btrfs: do not use READA for running delayed refs Josef Bacik @ 2020-03-13 21:09 ` Josef Bacik 2020-03-14 0:19 ` Qu Wenruo 2020-03-14 2:56 ` [PATCH 0/2] Drop some mis-uses of READA Qu Wenruo 2020-03-18 14:40 ` David Sterba 3 siblings, 1 reply; 8+ messages in thread From: Josef Bacik @ 2020-03-13 21:09 UTC (permalink / raw) To: linux-btrfs, kernel-team Here we are just searching down to the bytenr we're building the backref tree for, and all of it's paths to the roots. These bytenrs are not guaranteed to be anywhere near each other, so the READA just generates extra latency. Signed-off-by: Josef Bacik <josef@toxicpanda.com> --- fs/btrfs/relocation.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 4fb7e3cc2aca..3ccc126d0df3 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -759,8 +759,6 @@ struct backref_node *build_backref_tree(struct reloc_control *rc, err = -ENOMEM; goto out; } - path1->reada = READA_FORWARD; - path2->reada = READA_FORWARD; node = alloc_backref_node(cache); if (!node) { -- 2.24.1 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH 2/2] btrfs: do not READA in build_backref_tree 2020-03-13 21:09 ` [PATCH 2/2] btrfs: do not READA in build_backref_tree Josef Bacik @ 2020-03-14 0:19 ` Qu Wenruo 0 siblings, 0 replies; 8+ messages in thread From: Qu Wenruo @ 2020-03-14 0:19 UTC (permalink / raw) To: Josef Bacik, linux-btrfs, kernel-team [-- Attachment #1.1: Type: text/plain, Size: 919 bytes --] On 2020/3/14 上午5:09, Josef Bacik wrote: > Here we are just searching down to the bytenr we're building the backref > tree for, and all of it's paths to the roots. These bytenrs are not > guaranteed to be anywhere near each other, so the READA just generates > extra latency. > > Signed-off-by: Josef Bacik <josef@toxicpanda.com> Reviewed-by: Qu Wenruo <wqu@suse.com> Thanks, Qu > --- > fs/btrfs/relocation.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c > index 4fb7e3cc2aca..3ccc126d0df3 100644 > --- a/fs/btrfs/relocation.c > +++ b/fs/btrfs/relocation.c > @@ -759,8 +759,6 @@ struct backref_node *build_backref_tree(struct reloc_control *rc, > err = -ENOMEM; > goto out; > } > - path1->reada = READA_FORWARD; > - path2->reada = READA_FORWARD; > > node = alloc_backref_node(cache); > if (!node) { > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/2] Drop some mis-uses of READA 2020-03-13 21:09 [PATCH 0/2] Drop some mis-uses of READA Josef Bacik 2020-03-13 21:09 ` [PATCH 1/2] btrfs: do not use READA for running delayed refs Josef Bacik 2020-03-13 21:09 ` [PATCH 2/2] btrfs: do not READA in build_backref_tree Josef Bacik @ 2020-03-14 2:56 ` Qu Wenruo 2020-03-18 14:44 ` David Sterba 2020-03-18 14:40 ` David Sterba 3 siblings, 1 reply; 8+ messages in thread From: Qu Wenruo @ 2020-03-14 2:56 UTC (permalink / raw) To: Josef Bacik, linux-btrfs, kernel-team [-- Attachment #1.1: Type: text/plain, Size: 1091 bytes --] On 2020/3/14 上午5:09, Josef Bacik wrote: > In debugging Zygo's huge commit delays I noticed we were burning a bunch of time > doing READA in cases where we don't need to. The way READA works in btrfs is > we'll load up adjacent nodes and leaves as we walk down. This is useful for > operations where we're going to be reading sequentially across the tree. > > But for delayed refs we're looking up one bytenr, and then another one which > could be elsewhere in the tree. With large enough extent trees this results in > a lot of unneeded latency. > > The same applies to build_backref_tree, but that's even worse because we're > looking up backrefs, which are essentially randomly spread out across the extent > root. Thanks, There are quite some other locations abusing READA. E.g. btrfs_read_block_groups(), where we're just searching for block group items. There is no guarantee that next block group item is in next a few leaves. I guess it's a good time to review all READA abuse. Or would you mind me to do that? Thanks, Qu > > Josef > [-- Attachment #2: OpenPGP digital signature --] [-- Type: application/pgp-signature, Size: 488 bytes --] ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/2] Drop some mis-uses of READA 2020-03-14 2:56 ` [PATCH 0/2] Drop some mis-uses of READA Qu Wenruo @ 2020-03-18 14:44 ` David Sterba 0 siblings, 0 replies; 8+ messages in thread From: David Sterba @ 2020-03-18 14:44 UTC (permalink / raw) To: Qu Wenruo; +Cc: Josef Bacik, linux-btrfs, kernel-team On Sat, Mar 14, 2020 at 10:56:06AM +0800, Qu Wenruo wrote: > > > On 2020/3/14 上午5:09, Josef Bacik wrote: > > In debugging Zygo's huge commit delays I noticed we were burning a bunch of time > > doing READA in cases where we don't need to. The way READA works in btrfs is > > we'll load up adjacent nodes and leaves as we walk down. This is useful for > > operations where we're going to be reading sequentially across the tree. > > > > But for delayed refs we're looking up one bytenr, and then another one which > > could be elsewhere in the tree. With large enough extent trees this results in > > a lot of unneeded latency. > > > > The same applies to build_backref_tree, but that's even worse because we're > > looking up backrefs, which are essentially randomly spread out across the extent > > root. Thanks, > > There are quite some other locations abusing READA. > > E.g. btrfs_read_block_groups(), where we're just searching for block > group items. There is no guarantee that next block group item is in next > a few leaves. > > I guess it's a good time to review all READA abuse. Or would you mind me > to do that? If you find some clear example where the items are scattered over the tree then yes. For the rest it would be good to put a comment that the readahead really helps. Thanks. ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH 0/2] Drop some mis-uses of READA 2020-03-13 21:09 [PATCH 0/2] Drop some mis-uses of READA Josef Bacik ` (2 preceding siblings ...) 2020-03-14 2:56 ` [PATCH 0/2] Drop some mis-uses of READA Qu Wenruo @ 2020-03-18 14:40 ` David Sterba 3 siblings, 0 replies; 8+ messages in thread From: David Sterba @ 2020-03-18 14:40 UTC (permalink / raw) To: Josef Bacik; +Cc: linux-btrfs, kernel-team On Fri, Mar 13, 2020 at 05:09:52PM -0400, Josef Bacik wrote: > In debugging Zygo's huge commit delays I noticed we were burning a bunch of time > doing READA in cases where we don't need to. The way READA works in btrfs is > we'll load up adjacent nodes and leaves as we walk down. This is useful for > operations where we're going to be reading sequentially across the tree. > > But for delayed refs we're looking up one bytenr, and then another one which > could be elsewhere in the tree. With large enough extent trees this results in > a lot of unneeded latency. > > The same applies to build_backref_tree, but that's even worse because we're > looking up backrefs, which are essentially randomly spread out across the extent > root. Thanks, Makes sense, I'll add it to misc-next. Thanks. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2020-03-18 14:44 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-03-13 21:09 [PATCH 0/2] Drop some mis-uses of READA Josef Bacik 2020-03-13 21:09 ` [PATCH 1/2] btrfs: do not use READA for running delayed refs Josef Bacik 2020-03-14 0:18 ` Qu Wenruo 2020-03-13 21:09 ` [PATCH 2/2] btrfs: do not READA in build_backref_tree Josef Bacik 2020-03-14 0:19 ` Qu Wenruo 2020-03-14 2:56 ` [PATCH 0/2] Drop some mis-uses of READA Qu Wenruo 2020-03-18 14:44 ` David Sterba 2020-03-18 14:40 ` David Sterba
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).