From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, avarab@gmail.com, gitster@pobox.com,
jonathantanmy@google.com, steadmon@google.com
Subject: [PATCH v3 0/9] repack: introduce `--write-midx`
Date: Tue, 28 Sep 2021 21:54:37 -0400 [thread overview]
Message-ID: <cover.1632880469.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1631730270.git.me@ttaylorr.com>
Here is another small reroll of my series to implement the final component of
multi-pack reachability bitamps, which is to be able to write them from `git
repack`.
This version incorporates feedback from Jonathan Tan and others at Google who
graciously provided review. A range-diff is included below, but the major
changes are:
- A comment to explain the relationship between 'ctx->m' and 'ctx->to_include'
in midx.c:add_pack_to_midx().
- A comment to explain the meaning of write_midx_file_only().
- Clean-up of stray hunks, a missing strbuf_release().
- Replacing the name `existing_packs` with `existing_nonkept_packs` to
emphasize the relationship between it and the similarly-named
`existing_kept_packs`.
Instead of depending on tb/multi-pack-bitmaps, this series now depends on the
tip of master.
Taylor Blau (9):
midx: expose `write_midx_file_only()` publicly
builtin/multi-pack-index.c: support `--stdin-packs` mode
midx: preliminary support for `--refs-snapshot`
builtin/repack.c: keep track of existing packs unconditionally
builtin/repack.c: rename variables that deal with non-kept packs
builtin/repack.c: extract showing progress to a variable
builtin/repack.c: support writing a MIDX while repacking
builtin/repack.c: make largest pack preferred
builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
Documentation/git-multi-pack-index.txt | 19 ++
Documentation/git-repack.txt | 18 +-
builtin/multi-pack-index.c | 36 +++-
builtin/repack.c | 279 ++++++++++++++++++++++---
midx.c | 110 ++++++++--
midx.h | 15 +-
pack-bitmap.c | 2 +-
pack-bitmap.h | 1 +
t/helper/test-read-midx.c | 25 ++-
t/lib-midx.sh | 8 +
t/t5319-multi-pack-index.sh | 15 ++
t/t5326-multi-pack-bitmaps.sh | 82 ++++++++
t/t7700-repack.sh | 96 +++++++++
t/t7703-repack-geometric.sh | 24 ++-
14 files changed, 674 insertions(+), 56 deletions(-)
create mode 100644 t/lib-midx.sh
Range-diff against v2:
1: 03c1b2c4d3 ! 1: b7c72d0bf5 midx: expose `write_midx_file_only()` publicly
@@ midx.c: struct write_midx_context {
static void add_pack_to_midx(const char *full_path, size_t full_path_len,
@@ midx.c: static void add_pack_to_midx(const char *full_path, size_t full_path_len,
+
+ if (ends_with(file_name, ".idx")) {
display_progress(ctx->progress, ++ctx->pack_paths_checked);
++ /*
++ * Note that at most one of ctx->m and ctx->to_include are set,
++ * so we are testing midx_contains_pack() and
++ * string_list_has_string() independently (guarded by the
++ * appropriate NULL checks).
++ *
++ * We could support passing to_include while reusing an existing
++ * MIDX, but don't currently since the reuse process drags
++ * forward all packs from an existing MIDX (without checking
++ * whether or not they appear in the to_include list).
++ *
++ * If we added support for that, these next two conditional
++ * should be performed independently (likely checking
++ * to_include before the existing MIDX).
++ */
if (ctx->m && midx_contains_pack(ctx->m, file_name))
return;
+ else if (ctx->to_include &&
@@ midx.c: static int write_midx_internal(const char *object_dir,
struct bitmap_index *bitmap_git;
int bitmap_exists;
int want_bitmap = flags & MIDX_WRITE_BITMAP;
-@@ midx.c: static int write_midx_internal(const char *object_dir,
-
- QSORT(ctx.info, ctx.nr, pack_info_compare);
-
-- if (packs_to_drop && packs_to_drop->nr) {
-+ if (ctx.m && packs_to_drop && packs_to_drop->nr) {
- int drop_index = 0;
- int missing_drops = 0;
-
@@ midx.c: int write_midx_file(const char *object_dir,
const char *preferred_pack_name,
unsigned flags)
@@ midx.h: int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pa
int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
++/*
++ * Variant of write_midx_file which writes a MIDX containing only the packs
++ * specified in packs_to_include.
++ */
+int write_midx_file_only(const char *object_dir,
+ struct string_list *packs_to_include,
+ const char *preferred_pack_name,
2: 59556e5545 = 2: 986ef14f2a builtin/multi-pack-index.c: support `--stdin-packs` mode
3: 42f1ae9ede ! 3: 4e3769a4f3 midx: preliminary support for `--refs-snapshot`
@@ Commit message
commit in the MIDX reaches something that isn't.
This can happen when a multi-pack index contains some pack which refers
- to loose objects (which by definition aren't included in the multi-pack
- index).
+ to loose objects (e.g., if a pack was pushed after starting the repack
+ but before generating the MIDX which depends on an object which is
+ stored as loose in the repository, and by definition isn't included in
+ the multi-pack index).
By taking a snapshot of the references before we start repacking, we can
close that race window. In the above scenario (where we have a packed
@@ midx.c: static void bitmap_show_commit(struct commit *commit, void *_data)
+ }
+
+ fclose(f);
++ strbuf_release(&buf);
+ return 0;
+}
+
@@ midx.h: int fill_midx_entry(struct repository *r, const struct object_id *oid, s
int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
-int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
+ /*
+ * Variant of write_midx_file which writes a MIDX containing only the packs
+ * specified in packs_to_include.
+ */
+int write_midx_file(const char *object_dir,
+ const char *preferred_pack_name,
+ const char *refs_snapshot,
4: c0d045a9de ! 4: 1b3dd331ca builtin/repack.c: keep track of existing packs unconditionally
@@ Commit message
## builtin/repack.c ##
@@ builtin/repack.c: static void remove_pack_on_signal(int signo)
- * have a corresponding .keep file. These packs are not to
- * be kept if we are going to pack everything into one file.
+ }
+
+ /*
+- * Adds all packs hex strings to the fname list, which do not
+- * have a corresponding .keep file. These packs are not to
+- * be kept if we are going to pack everything into one file.
++ * Adds all packs hex strings to either fname_list or fname_kept_list
++ * based on whether each pack has a corresponding .keep file or not.
++ * Packs without a .keep file are not to be kept if we are going to
++ * pack everything into one file.
*/
-static void get_non_kept_pack_filenames(struct string_list *fname_list,
- const struct string_list *extra_keep)
-: ---------- > 5: 15831a201a builtin/repack.c: rename variables that deal with non-kept packs
5: 09de03de47 = 6: 1a40161baf builtin/repack.c: extract showing progress to a variable
6: 83dfdb8b12 ! 7: 6854f0751d builtin/repack.c: support writing a MIDX while repacking
@@ Commit message
check we can make consistently when (1) telling the MIDX which packs we
want to exclude, and (2) actually unlinking the redundant packs.
+ There is also a tiny change to short-circuit the body of
+ write_midx_included_packs() when no packs remain in the case of an empty
+ repository. The MIDX code does not handle this, so avoid trying to
+ generate a MIDX covering zero packs in the first place.
+
Signed-off-by: Taylor Blau <me@ttaylorr.com>
## Documentation/git-repack.txt ##
@@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
}
+static void midx_included_packs(struct string_list *include,
-+ struct string_list *existing_packs,
++ struct string_list *existing_nonkept_packs,
+ struct string_list *existing_kept_packs,
+ struct string_list *names,
+ struct pack_geometry *geometry)
@@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
+ string_list_insert(include, strbuf_detach(&buf, NULL));
+ }
+ } else {
-+ for_each_string_list_item(item, existing_packs) {
++ for_each_string_list_item(item, existing_nonkept_packs) {
+ if (item->util)
+ continue;
+ string_list_insert(include, xstrfmt("%s.idx", item->string));
@@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
+ FILE *in;
+ int ret;
+
++ if (!include->nr)
++ return 0;
++
+ cmd.in = -1;
+ cmd.git_cmd = 1;
+
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
+ if (delete_redundant && pack_everything & ALL_INTO_ONE) {
+ const int hexsz = the_hash_algo->hexsz;
+ string_list_sort(&names);
-+ for_each_string_list_item(item, &existing_packs) {
++ for_each_string_list_item(item, &existing_nonkept_packs) {
+ char *sha1;
+ size_t len = strlen(item->string);
+ if (len < hexsz)
+ continue;
+ sha1 = item->string + len - hexsz;
++ /*
++ * Mark this pack for deletion, which ensures that this
++ * pack won't be included in a MIDX (if `--write-midx`
++ * was given) and that we will actually delete this pack
++ * (if `-d` was given).
++ */
+ item->util = (void*)(intptr_t)!string_list_has_string(&names, sha1);
+ }
+ }
+
+ if (write_midx) {
+ struct string_list include = STRING_LIST_INIT_NODUP;
-+ midx_included_packs(&include, &existing_packs,
++ midx_included_packs(&include, &existing_nonkept_packs,
+ &existing_kept_packs, &names, geometry);
+
+ ret = write_midx_included_packs(&include,
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
- if (pack_everything & ALL_INTO_ONE) {
- const int hexsz = the_hash_algo->hexsz;
- string_list_sort(&names);
-- for_each_string_list_item(item, &existing_packs) {
+- for_each_string_list_item(item, &existing_nonkept_packs) {
- char *sha1;
- size_t len = strlen(item->string);
- if (len < hexsz)
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
- if (!string_list_has_string(&names, sha1))
- remove_redundant_pack(packdir, item->string);
- }
-+ for_each_string_list_item(item, &existing_packs) {
++ for_each_string_list_item(item, &existing_nonkept_packs) {
+ if (!item->util)
+ continue;
+ remove_redundant_pack(packdir, item->string);
@@ t/t7700-repack.sh: test_expect_success 'auto-bitmaps do not complain if unavaila
+'
+
test_done
+
+ ## t/t7703-repack-geometric.sh ##
+@@ t/t7703-repack-geometric.sh: test_expect_success '--geometric with no packs' '
+ (
+ cd geometric &&
+
+- git repack --geometric 2 >out &&
++ git repack --write-midx --geometric 2 >out &&
+ test_i18ngrep "Nothing new to pack" out
+ )
+ '
7: 68bc49d8ae ! 8: 3596c76daf builtin/repack.c: make largest pack preferred
@@ builtin/repack.c: static int write_midx_included_packs(struct string_list *inclu
if (ret)
return ret;
@@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix)
- midx_included_packs(&include, &existing_packs,
+ midx_included_packs(&include, &existing_nonkept_packs,
&existing_kept_packs, &names, geometry);
- ret = write_midx_included_packs(&include,
8: eb24b308ec ! 9: d99f075321 builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
@@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
+}
+
static void midx_included_packs(struct string_list *include,
- struct string_list *existing_packs,
+ struct string_list *existing_nonkept_packs,
struct string_list *existing_kept_packs,
@@ builtin/repack.c: static void midx_included_packs(struct string_list *include,
--
2.33.0.96.g73915697e6
next prev parent reply other threads:[~2021-09-29 1:55 UTC|newest]
Thread overview: 76+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-09-11 3:32 [PATCH 0/8] repack: introduce `--write-midx` Taylor Blau
2021-09-11 3:32 ` [PATCH 1/8] midx: expose 'write_midx_file_only()' publicly Taylor Blau
2021-09-11 5:00 ` Junio C Hamano
2021-09-11 16:17 ` Taylor Blau
2021-09-11 10:07 ` Ævar Arnfjörð Bjarmason
2021-09-11 16:21 ` Taylor Blau
2021-09-11 3:32 ` [PATCH 2/8] builtin/multi-pack-index.c: support --stdin-packs mode Taylor Blau
2021-09-11 10:05 ` Ævar Arnfjörð Bjarmason
2021-09-11 16:25 ` Taylor Blau
2021-09-11 16:28 ` Taylor Blau
2021-09-12 2:08 ` Eric Sunshine
2021-09-12 2:21 ` Taylor Blau
2021-09-12 15:15 ` Ævar Arnfjörð Bjarmason
2021-09-12 22:30 ` Junio C Hamano
2021-09-12 22:32 ` Ævar Arnfjörð Bjarmason
2021-09-14 19:02 ` Jeff King
2021-09-14 23:48 ` Taylor Blau
2021-09-15 1:55 ` Eric Sunshine
2021-09-11 3:32 ` [PATCH 3/8] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-11 3:32 ` [PATCH 4/8] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-11 3:32 ` [PATCH 5/8] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-11 3:32 ` [PATCH 6/8] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-11 3:32 ` [PATCH 7/8] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-11 10:17 ` Ævar Arnfjörð Bjarmason
2021-09-11 16:35 ` Taylor Blau
2021-09-11 3:32 ` [PATCH 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-11 10:27 ` Ævar Arnfjörð Bjarmason
2021-09-11 11:19 ` Ævar Arnfjörð Bjarmason
2021-09-11 16:51 ` Taylor Blau
2021-09-14 18:55 ` Jeff King
2021-09-14 23:34 ` Taylor Blau
2021-09-14 23:56 ` Ævar Arnfjörð Bjarmason
2021-09-15 4:31 ` Taylor Blau
2021-09-11 16:49 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 0/8] repack: introduce `--write-midx` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 1/8] midx: expose `write_midx_file_only()` publicly Taylor Blau
2021-09-22 23:14 ` Jonathan Tan
2021-09-23 3:09 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 2/8] builtin/multi-pack-index.c: support `--stdin-packs` mode Taylor Blau
2021-09-22 22:29 ` Josh Steadmon
2021-09-23 2:03 ` Taylor Blau
2021-09-22 23:11 ` Jonathan Tan
2021-09-23 2:06 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 3/8] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-22 22:34 ` Josh Steadmon
2021-09-23 2:08 ` Taylor Blau
2021-09-22 23:00 ` Jonathan Tan
2021-09-23 2:18 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 4/8] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-22 22:56 ` Jonathan Tan
2021-09-23 2:59 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 5/8] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-15 18:24 ` [PATCH v2 6/8] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-22 22:39 ` Jonathan Tan
2021-09-23 2:40 ` Taylor Blau
2021-09-15 18:24 ` [PATCH v2 7/8] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-15 18:24 ` [PATCH v2 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-24 18:22 ` Jonathan Tan
2021-10-01 22:38 ` Taylor Blau
2021-09-15 19:22 ` [PATCH v2 0/8] repack: introduce `--write-midx` Junio C Hamano
2021-09-15 19:29 ` Junio C Hamano
2021-09-15 21:19 ` Taylor Blau
2021-09-16 22:16 ` Junio C Hamano
2021-09-29 1:54 ` Taylor Blau [this message]
2021-09-29 1:55 ` [PATCH v3 1/9] midx: expose `write_midx_file_only()` publicly Taylor Blau
2021-09-29 1:55 ` [PATCH v3 2/9] builtin/multi-pack-index.c: support `--stdin-packs` mode Taylor Blau
2021-09-29 1:55 ` [PATCH v3 3/9] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-29 1:55 ` [PATCH v3 4/9] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-29 1:55 ` [PATCH v3 5/9] builtin/repack.c: rename variables that deal with non-kept packs Taylor Blau
2021-09-29 1:55 ` [PATCH v3 6/9] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-29 1:55 ` [PATCH v3 7/9] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-29 1:55 ` [PATCH v3 8/9] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-29 1:55 ` [PATCH v3 9/9] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-29 4:24 ` [PATCH v3 0/9] repack: introduce `--write-midx` Junio C Hamano
2021-10-01 20:01 ` Jonathan Tan
2021-10-01 22:40 ` Taylor Blau
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=cover.1632880469.git.me@ttaylorr.com \
--to=me@ttaylorr.com \
--cc=avarab@gmail.com \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=jonathantanmy@google.com \
--cc=peff@peff.net \
--cc=steadmon@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).