From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, chriscool@tuxfamily.org
Subject: [PATCH 4/4] pack-bitmap: pass object filter to fill-in traversal
Date: Mon, 4 May 2020 17:12:38 -0600
Message-ID: <f1eb194b9fdf5e385610fa3c0ddee22b89cf4f0e.1588633810.git.me@ttaylorr.com>
In-Reply-To: <cover.1588633810.git.me@ttaylorr.com>

From: Jeff King <peff@peff.net>

Sometimes a bitmap traversal still has to walk some commits manually,
because those commits aren't included in the bitmap packfile (e.g., due
to a push or commit since the last full repack). If we're given an
object filter, we don't pass it down to this traversal. It's not
necessary for correctness because the bitmap code has its own filters to
post-process the bitmap result (which it must, to filter out the objects
that _are_ mentioned in the bitmapped packfile).

And with blob filters, there was no performance reason to pass along
those filters, either. The fill-in traversal could omit them from the
result, but it wouldn't save us any time to do so, since we'd still have
to walk each tree entry to see if it's a blob or not.

But now that we support tree filters, there's opportunity for savings. A
tree:0 filter means we can avoid accessing trees entirely, since
we know we won't need them (or any of the subtrees or blobs they point to).
The new test in p5310 shows this off (the "partial bitmap" state is one
where HEAD~100 and its ancestors are all in a bitmapped pack, but
HEAD~100..HEAD are not). Here are the results (run against linux.git):

  Test                                                  HEAD^               HEAD
  -------------------------------------------------------------------------------------------------
  [...]
  5310.16: rev-list with tree filter (partial bitmap)   0.19(0.17+0.02)     0.03(0.02+0.01) -84.2%

The absolute savings aren't _huge_, but keep in mind that we
only omitted 100 first-parent links (in the version of linux.git here,
that's 894 actual commits). In a more pathological case, we might have a
much larger proportion of non-bitmapped commits. I didn't bother
creating such a case in the perf script because the setup is expensive,
and this is plenty to show the savings as a percentage.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
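
As a rough illustration of the reasoning in the commit message, here is a
toy model of the fill-in walk. It is only a sketch: every name in it
(toy_object, toy_walk_commits, and so on) is hypothetical and does not
exist in git, and unlike git it keeps no "seen" flags, so a tree shared by
several commits is counted once per commit. It only shows why a blob
filter cannot skip reading trees while a depth-0 tree filter can.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy object model; none of these names exist in git's source. */
enum toy_type { TOY_BLOB, TOY_TREE };

struct toy_object {
	enum toy_type type;
	const struct toy_object *const *entries; /* children (trees only) */
	size_t nr;
};

enum toy_filter { TOY_FILTER_NONE, TOY_FILTER_BLOB_NONE, TOY_FILTER_TREE_0 };

struct toy_stats {
	unsigned trees_opened;  /* trees whose entries had to be read */
	unsigned objects_shown; /* objects reported to the caller */
};

/* Report a tree and recurse into it; blobs are shown unless filtered. */
void toy_walk_tree(const struct toy_object *tree, enum toy_filter filter,
		   struct toy_stats *st)
{
	size_t i;

	assert(tree->type == TOY_TREE);
	st->trees_opened++;
	st->objects_shown++;
	for (i = 0; i < tree->nr; i++) {
		const struct toy_object *entry = tree->entries[i];

		/* Even a blob filter must inspect every entry to learn its type. */
		if (entry->type == TOY_TREE)
			toy_walk_tree(entry, filter, st);
		else if (filter != TOY_FILTER_BLOB_NONE)
			st->objects_shown++;
	}
}

/*
 * Walk nr_commits commits that all point at root_tree. Unlike git, the toy
 * keeps no "seen" flags, so shared trees are counted once per commit.
 */
struct toy_stats toy_walk_commits(const struct toy_object *root_tree,
				  size_t nr_commits, enum toy_filter filter)
{
	struct toy_stats st = { 0, 0 };
	size_t i;

	for (i = 0; i < nr_commits; i++) {
		st.objects_shown++; /* the commit itself */
		if (filter == TOY_FILTER_TREE_0)
			continue; /* depth 0: never open the root tree */
		toy_walk_tree(root_tree, filter, &st);
	}
	return st;
}

/* Fixed sample: root tree = { blob, blob, subtree = { blob } }. */
struct toy_stats toy_sample(size_t nr_commits, enum toy_filter filter)
{
	static const struct toy_object blob = { TOY_BLOB, NULL, 0 };
	static const struct toy_object *const sub_entries[] = { &blob };
	static const struct toy_object sub = { TOY_TREE, sub_entries, 1 };
	static const struct toy_object *const root_entries[] = {
		&blob, &blob, &sub
	};
	static const struct toy_object root = { TOY_TREE, root_entries, 3 };

	return toy_walk_commits(&root, nr_commits, filter);
}

#ifdef TOY_DEMO_MAIN /* build with -DTOY_DEMO_MAIN to get a small demo */
int main(void)
{
	struct toy_stats st = toy_sample(3, TOY_FILTER_TREE_0);

	printf("trees opened: %u, objects shown: %u\n",
	       st.trees_opened, st.objects_shown);
	return 0;
}
#endif
```

With three commits in this sample, the unfiltered walk and the blob-filtered
walk both open six trees (the blob filter only shrinks the output from 18
objects to 9), while the depth-0 tree filter opens none and shows just the
three commits.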
 pack-bitmap.c                | 14 +++++++++-----
 t/perf/p5310-pack-bitmaps.sh |  5 +++++
 2 files changed, 14 insertions(+), 5 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 195ee8cad0..4077e731e8 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -506,7 +506,8 @@ static int should_include(struct commit *commit, void *_data)
 static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
 				   struct rev_info *revs,
 				   struct object_list *roots,
-				   struct bitmap *seen)
+				   struct bitmap *seen,
+				   struct list_objects_filter_options *filter)
 {
 	struct bitmap *base = NULL;
 	int needs_walk = 0;
@@ -599,8 +600,9 @@ static struct bitmap *find_objects(struct bitmap_index *bitmap_git,
 		show_data.bitmap_git = bitmap_git;
 		show_data.base = base;
 
-		traverse_commit_list(revs, show_commit, show_object,
-				     &show_data);
+		traverse_commit_list_filtered(filter, revs,
+					      show_commit, show_object,
+					      &show_data, NULL);
 	}
 
 	return base;
@@ -999,7 +1001,8 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
 
 	if (haves) {
 		revs->ignore_missing_links = 1;
-		haves_bitmap = find_objects(bitmap_git, revs, haves, NULL);
+		haves_bitmap = find_objects(bitmap_git, revs, haves, NULL,
+					    filter);
 		reset_revision_walk();
 		revs->ignore_missing_links = 0;
 
@@ -1007,7 +1010,8 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
 			BUG("failed to perform bitmap walk");
 	}
 
-	wants_bitmap = find_objects(bitmap_git, revs, wants, haves_bitmap);
+	wants_bitmap = find_objects(bitmap_git, revs, wants, haves_bitmap,
+				    filter);
 
 	if (!wants_bitmap)
 		BUG("failed to perform bitmap walk");
diff --git a/t/perf/p5310-pack-bitmaps.sh b/t/perf/p5310-pack-bitmaps.sh
index 75ccf9f4e3..b3e725f031 100755
--- a/t/perf/p5310-pack-bitmaps.sh
+++ b/t/perf/p5310-pack-bitmaps.sh
@@ -91,4 +91,9 @@ test_perf 'pack to file (partial bitmap)' '
 	git pack-objects --use-bitmap-index --all pack2b </dev/null >/dev/null
 '
 
+test_perf 'rev-list with tree filter (partial bitmap)' '
+	git rev-list --use-bitmap-index --count --objects --all \
+		--filter=tree:0 >/dev/null
+'
+
 test_done
-- 
2.26.0.113.ge9739cdccc
