From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, dstolee@microsoft.com, gitster@pobox.com
Subject: [PATCH v4 0/8] repack: support repacking into a geometric sequence
Date: Mon, 22 Feb 2021 21:24:59 -0500
Message-ID: <cover.1614047097.git.me@ttaylorr.com>
In-Reply-To: <cover.1611098616.git.me@ttaylorr.com>

Here's a very lightly modified version of v3 of Peff's and my series
to add a new 'git repack --geometric' mode. Almost nothing has changed
since last time, with the exception of:

  - Packs listed over standard input to 'git pack-objects --stdin-packs'
    are sorted in descending mtime order (and objects are strung
    together in pack order as before) so that objects are laid out
    roughly newest-to-oldest in the resulting pack.

  - Swapped the order of two paragraphs in patch 5 to make the perf
    results clearer.

  - Mention '--unpacked' specifically in the documentation for 'git
    repack --geometric'.

  - Typo fixes.
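
To illustrate the first item above, here is a minimal, self-contained
sketch of sorting packs in descending mtime order with qsort(). The
`struct pack_info` type and the `sort_packs_newest_first()` helper are
hypothetical stand-ins for illustration only; the series itself compares
`struct packed_git` entries stored in a string list, as shown in the
range-diff for patch 3.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for pack metadata; only mtime matters here. */
struct pack_info {
	const char *name;
	time_t mtime;
};

/* qsort() comparator: packs with a larger (newer) mtime sort first. */
static int pack_mtime_cmp_desc(const void *_a, const void *_b)
{
	const struct pack_info *a = _a;
	const struct pack_info *b = _b;

	if (a->mtime < b->mtime)
		return 1;
	else if (b->mtime < a->mtime)
		return -1;
	else
		return 0;
}

/* Sort `nr` packs in place so that the newest pack comes first. */
void sort_packs_newest_first(struct pack_info *packs, size_t nr)
{
	assert(packs || !nr);
	qsort(packs, nr, sizeof(*packs), pack_mtime_cmp_desc);
}
```

Comparing with `<` instead of subtracting the two mtimes avoids any
overflow or truncation when the difference does not fit in an int.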

Range-diff is below. It would be good to start merging this down since
we have a release candidate coming up soon, and I'd rather focus future
reviewer efforts on the multi-pack reverse index and bitmaps series
instead of this one.

Jeff King (4):
  p5303: add missing &&-chains
  p5303: measure time to repack with keep
  builtin/pack-objects.c: rewrite honor-pack-keep logic
  packfile: add kept-pack cache for find_kept_pack_entry()

Taylor Blau (4):
  packfile: introduce 'find_kept_pack_entry()'
  revision: learn '--no-kept-objects'
  builtin/pack-objects.c: add '--stdin-packs' option
  builtin/repack.c: add '--geometric' option

 Documentation/git-pack-objects.txt |  10 +
 Documentation/git-repack.txt       |  23 ++
 builtin/pack-objects.c             | 333 ++++++++++++++++++++++++-----
 builtin/repack.c                   | 187 +++++++++++++++-
 object-store.h                     |   5 +
 packfile.c                         |  67 ++++++
 packfile.h                         |   5 +
 revision.c                         |  15 ++
 revision.h                         |   4 +
 t/perf/p5303-many-packs.sh         |  36 +++-
 t/t5300-pack-object.sh             |  97 +++++++++
 t/t6114-keep-packs.sh              |  69 ++++++
 t/t7703-repack-geometric.sh        | 137 ++++++++++++
 13 files changed, 926 insertions(+), 62 deletions(-)
 create mode 100755 t/t6114-keep-packs.sh
 create mode 100755 t/t7703-repack-geometric.sh

Range-diff against v3:
1:  aa94edf39b = 1:  bb674e5119 packfile: introduce 'find_kept_pack_entry()'
2:  82f6b45463 = 2:  c85a915597 revision: learn '--no-kept-objects'
3:  033e4e3f67 ! 3:  649cf9020b builtin/pack-objects.c: add '--stdin-packs' option
    @@ builtin/pack-objects.c: static int git_pack_config(const char *k, const char *v,
     +	struct packed_git *a = ((const struct string_list_item*)_a)->util;
     +	struct packed_git *b = ((const struct string_list_item*)_b)->util;
     +
    ++	/*
    ++	 * order packs by descending mtime so that objects are laid out
    ++	 * roughly as newest-to-oldest
    ++	 */
     +	if (a->mtime < b->mtime)
    -+		return -1;
    -+	else if (b->mtime < a->mtime)
     +		return 1;
    ++	else if (b->mtime < a->mtime)
    ++		return -1;
     +	else
     +		return 0;
     +}
4:  f9a5faf773 = 4:  6de9f0c52b p5303: add missing &&-chains
5:  181c104a03 ! 5:  94e4f3ee3a p5303: measure time to repack with keep
    @@ Metadata
      ## Commit message ##
         p5303: measure time to repack with keep
     
    -    Add two new tests to measure repack performance. Both test split the
    +    Add two new tests to measure repack performance. Both tests split the
         repository into synthetic "pushes", and then leave the remaining objects
         in a big base pack.
     
    @@ Commit message
           5303.17: repack (1000)                      216.87(490.79+14.57)
           5303.18: repack with kept (1000)            665.63(938.87+15.76)
     
    -    Likewise, the scaling is pretty extreme on --stdin-packs:
    -
    -      5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)
    -      5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)
    -      5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)
    -
         That's because the code paths around handling .keep files are known to
         scale badly; they look in every single pack file to find each object.
         Our solution to that was to notice that most repos don't have keep
    @@ Commit message
         single .keep, that part of pack-objects slows down again (even if we
         have fewer objects total to look at).
     
    +    Likewise, the scaling is pretty extreme on --stdin-packs (but each
    +    subsequent test is also being asked to do more work):
    +
    +      5303.7: repack with --stdin-packs (1)       0.01(0.01+0.00)
    +      5303.13: repack with --stdin-packs (50)     3.53(12.07+0.24)
    +      5303.19: repack with --stdin-packs (1000)   195.83(371.82+8.10)
    +
         Signed-off-by: Jeff King <peff@peff.net>
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
6:  67af143fd1 = 6:  a116587fb2 builtin/pack-objects.c: rewrite honor-pack-keep logic
7:  e9e04b95e7 = 7:  db9f07ec1a packfile: add kept-pack cache for find_kept_pack_entry()
8:  bd492ec142 ! 8:  51f57d5da2 builtin/repack.c: add '--geometric' option
    @@ Documentation/git-repack.txt: depth is 4095.
     +packs determined to need to be combined in order to restore a geometric
     +progression.
     ++
    -+Loose objects are implicitly included in this "roll-up", without respect
    -+to their reachability. This is subject to change in the future. This
    -+option (implying a drastically different repack mode) is not guarenteed
    -+to work with all other combinations of option to `git repack`).
    ++When `--unpacked` is specified, loose objects are implicitly included in
    ++this "roll-up", without respect to their reachability. This is subject
    ++to change in the future. This option (implying a drastically different
    ++repack mode) is not guaranteed to work with all other combinations of
    ++option to `git repack`).
     +
      Configuration
      -------------
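
For readers unfamiliar with the "geometric progression" mentioned in the
documentation hunk above, the following is a simplified sketch, not the
code from patch 8. It assumes packs are described only by their object
counts, sorted in ascending order, and that each pack should hold at
least `factor` times as many objects as the next-smaller one. The
function names `is_geometric()` and `packs_to_roll_up()` are
hypothetical, and multiplication overflow is ignored for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return 1 if `counts` (objects per pack, ascending) already forms a
 * geometric progression: each pack has at least `factor` times the
 * objects of the pack before it.
 */
int is_geometric(const uint64_t *counts, size_t nr, uint64_t factor)
{
	for (size_t i = 1; i < nr; i++)
		if (counts[i] < factor * counts[i - 1])
			return 0;
	return 1;
}

/*
 * Return how many of the smallest packs to combine into one so that
 * the combined pack plus the untouched larger packs form a geometric
 * progression. Returns 0 when nothing needs to be combined.
 */
size_t packs_to_roll_up(const uint64_t *counts, size_t nr, uint64_t factor)
{
	size_t i, split;
	uint64_t total = 0;

	assert(factor >= 1);
	if (nr < 2)
		return 0;

	/* Walk from the largest pack down to the first violation. */
	for (i = nr - 1; i > 0; i--)
		if (counts[i] < factor * counts[i - 1])
			break;
	if (!i)
		return 0; /* already geometric */

	/* Packs [0, split) must be combined, including the violator. */
	split = i + 1;
	for (i = 0; i < split; i++)
		total += counts[i];

	/* The combined pack may now be too big for its next neighbor. */
	while (split < nr && counts[split] < factor * total) {
		total += counts[split];
		split++;
	}
	return split;
}
```

With a factor of 2, packs holding 3, 4, 16, and 64 objects would have
the two smallest combined (giving 7, 16, 64), while 3, 4, 10, and 64
would have the three smallest combined (giving 17, 64).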
-- 
2.30.0.667.g81c0cbc6fd

