All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: Jeff King <peff@peff.net>,
	Derrick Stolee <derrickstolee@github.com>,
	Junio C Hamano <gitster@pobox.com>
Subject: [PATCH 0/3] pack-bitmap: boundary-based bitmap traversal
Date: Mon, 24 Apr 2023 20:00:17 -0400	[thread overview]
Message-ID: <cover.1682380788.git.me@ttaylorr.com> (raw)

Here is a short (but dense) series that I worked on towards the end of
2021 with Peff.

Its goal is to improve the bitmap traversal we implement in
prepare_bitmap_walk(). Specifically, it avoids building up a complete
bitmap of the "haves" side, and instead uses a combination of (a)
UNINTERESTING commits between the tips and boundary, and (b) the
boundary itself.

The gory details are laid out in 3/3, but the high-level overview of the
new algorithm to compute the set of objects between "haves" and "wants"
using bitmaps is:

  1. Build a (partial) bitmap of the haves side by first OR-ing any
     bitmap(s) that already exist for UNINTERESTING commits between the
     haves and the boundary.

  2. For each commit along the boundary, add it as a fill-in traversal
     tip (where the traversal terminates once an existing bitmap is
     found), and perform fill-in traversal.

  3. Build up a complete bitmap of the wants side as usual, stopping any
     time we intersect the (partial) haves side.

This improves many cases where using bitmaps was significantly *slower*
than regular, non-bitmap traversal. In some instances, using bitmaps is
still slower than the non-bitmap case, but it is a substantial
improvement over the classic bitmap walk.

    $ ours="$(git branch --show-current)"
      argv="--count --objects $ours --not --exclude=$ours --branches"
      hyperfine \
        -n 'no bitmaps' "git.compile rev-list $argv" \
        -n 'existing bitmap traversal' "git rev-list --use-bitmap-index $argv" \
        -n 'this commit' "git.compile rev-list --use-bitmap-index $argv"
    Benchmark 1: no bitmaps
      Time (mean ± σ):      15.1 ms ±   4.1 ms    [User: 8.1 ms, System: 6.9 ms]
      Range (min … max):     7.4 ms …  21.8 ms    131 runs

    Benchmark 2: existing bitmap traversal
      Time (mean ± σ):      82.6 ms ±   9.2 ms    [User: 63.6 ms, System: 19.0 ms]
      Range (min … max):    73.8 ms … 105.4 ms    28 runs

    Benchmark 3: this commit
      Time (mean ± σ):      19.8 ms ±   3.1 ms    [User: 13.0 ms, System: 6.8 ms]
      Range (min … max):    17.7 ms …  38.6 ms    158 runs

    Summary
      'no bitmaps' ran
        1.31 ± 0.41 times faster than 'this commit'
        5.49 ± 1.62 times faster than 'existing bitmap traversal'

In a large repository on GitHub.com, the timings to compute the objects
unique to "master", as in:

    $ git rev-list --count --objects master --not --exclude=master --branches

improve as follows:

  - 1.012 sec (without bitmaps)
  - 29.571 sec (with the existing bitmap walk)
  - 2.279 sec (with these patches)

The first couple of patches are preparatory, and the third patch is the
substantive one.

Please have a look and poke any holes you can find in the new approach
:-). Thanks in advance for your review.

Taylor Blau (3):
  revision: support tracking uninteresting commits
  pack-bitmap.c: extract `fill_in_bitmap()`
  pack-bitmap.c: use commit boundary during bitmap traversal

 pack-bitmap.c | 252 ++++++++++++++++++++++++++++++++++++++------------
 revision.c    |  10 +-
 revision.h    |   5 +
 3 files changed, 205 insertions(+), 62 deletions(-)

-- 
2.40.0.380.gda896aa358.dirty

             reply	other threads:[~2023-04-25  0:00 UTC|newest]

Thread overview: 51+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-04-25  0:00 Taylor Blau [this message]
2023-04-25  0:00 ` [PATCH 1/3] revision: support tracking uninteresting commits Taylor Blau
2023-04-25 18:15   ` Derrick Stolee
2023-05-03 21:48     ` Taylor Blau
2023-05-04 13:46       ` Derrick Stolee
2023-05-03 22:08     ` Taylor Blau
2023-05-04 13:59       ` Derrick Stolee
2023-05-05 17:30         ` Taylor Blau
2023-04-25 18:22   ` Junio C Hamano
2023-04-25 18:48     ` Taylor Blau
2023-04-25  0:00 ` [PATCH 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-04-25 18:32   ` Junio C Hamano
2023-04-25 18:51     ` [PATCH 2/3] pack-bitmap.c: extract `fill_in_bitmap()`t Taylor Blau
2023-04-25  0:00 ` [PATCH 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-04-25 18:52   ` Junio C Hamano
2023-05-02 21:31     ` Felipe Contreras
2023-05-03 21:42     ` Taylor Blau
2023-05-03 21:58       ` Junio C Hamano
2023-04-25 18:53   ` Derrick Stolee
2023-04-25 18:02 ` [PATCH 0/3] pack-bitmap: boundary-based " Junio C Hamano
2023-04-25 18:30   ` Derrick Stolee
2023-04-25 18:57     ` Taylor Blau
2023-04-25 19:52       ` Derrick Stolee
2023-05-03 21:43         ` Taylor Blau
2023-04-25 18:06 ` Derrick Stolee
2023-04-25 19:01   ` Taylor Blau
2023-04-25 20:27     ` Derrick Stolee
2023-05-01 22:08 ` Junio C Hamano
2023-05-02 23:52   ` Taylor Blau
2023-05-05 17:27 ` [PATCH v2 0/2] " Taylor Blau
2023-05-05 17:27   ` [PATCH v2 1/2] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-05-05 17:27   ` [PATCH v2 2/2] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-05-05 18:13     ` Derrick Stolee
2023-05-05 18:43       ` Taylor Blau
2023-05-05 17:33   ` [PATCH v2 0/2] pack-bitmap: boundary-based " Junio C Hamano
2023-05-05 17:59   ` Derrick Stolee
2023-05-05 18:46     ` [PATCH v2 0/2] pack-bitmap: boundary-based bitmap traversalt Taylor Blau
2023-05-05 20:45       ` Derrick Stolee
2023-05-08 17:38 ` [PATCH v3 0/3] pack-bitmap: boundary-based bitmap traversal Taylor Blau
2023-05-08 17:38   ` [PATCH v3 1/3] object: add object_array initializer helper function Taylor Blau
2023-05-08 17:38   ` [PATCH v3 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-05-08 17:38   ` [PATCH v3 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau
2023-05-08 20:53     ` Derrick Stolee
2023-05-08 22:12       ` Taylor Blau
2023-05-10 22:55   ` [PATCH v3 0/3] pack-bitmap: boundary-based " Junio C Hamano
2023-05-10 23:10     ` Taylor Blau
2023-05-11 15:23       ` Derrick Stolee
2023-06-08 16:25 ` [PATCH v4 " Taylor Blau
2023-06-08 16:25   ` [PATCH v4 1/3] object: add object_array initializer helper function Taylor Blau
2023-06-08 16:25   ` [PATCH v4 2/3] pack-bitmap.c: extract `fill_in_bitmap()` Taylor Blau
2023-06-08 16:25   ` [PATCH v4 3/3] pack-bitmap.c: use commit boundary during bitmap traversal Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1682380788.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=derrickstolee@github.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.