git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com, avarab@gmail.com
Subject: [PATCH v2 0/8] repack: introduce `--write-midx`
Date: Wed, 15 Sep 2021 14:24:33 -0400	[thread overview]
Message-ID: <cover.1631730270.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1631331139.git.me@ttaylorr.com>

Here is a small reroll of my series to implement the final component of
multi-pack reachability bitamps, which is to be able to write them from `git
repack`.

Nothing substantial has changed since last time. Mostly re-wrapping lines and
removing braces in macro'd for_each_xyz() loops. A larger (but still small)
change is how we read input to `--refs-snapshot`. This version contains a more
idiomatic implementation, but see the sub-thread beginning at [1] for a more
complete discussion.

Otherwise, this series is unchagned. It depends on tb/multi-pack-bitmaps, which
has graduated to master. Range-diff below:

[1]: https://lore.kernel.org/git/2a16f11790b79ab452233b6f28acac607c0abd28.1631331139.git.me@ttaylorr.com/

Taylor Blau (8):
  midx: expose `write_midx_file_only()` publicly
  builtin/multi-pack-index.c: support `--stdin-packs` mode
  midx: preliminary support for `--refs-snapshot`
  builtin/repack.c: keep track of existing packs unconditionally
  builtin/repack.c: extract showing progress to a variable
  builtin/repack.c: support writing a MIDX while repacking
  builtin/repack.c: make largest pack preferred
  builtin/repack.c: pass `--refs-snapshot` when writing bitmaps

 Documentation/git-multi-pack-index.txt |  19 ++
 Documentation/git-repack.txt           |  18 +-
 builtin/multi-pack-index.c             |  36 +++-
 builtin/repack.c                       | 255 ++++++++++++++++++++++---
 midx.c                                 |  96 ++++++++--
 midx.h                                 |  11 +-
 pack-bitmap.c                          |   2 +-
 pack-bitmap.h                          |   1 +
 t/helper/test-read-midx.c              |  25 ++-
 t/lib-midx.sh                          |   8 +
 t/t5319-multi-pack-index.sh            |  15 ++
 t/t5326-multi-pack-bitmaps.sh          |  82 ++++++++
 t/t7700-repack.sh                      |  96 ++++++++++
 t/t7703-repack-geometric.sh            |  22 +++
 14 files changed, 637 insertions(+), 49 deletions(-)
 create mode 100644 t/lib-midx.sh

Range-diff against v1:
1:  6d50f46080 ! 1:  03c1b2c4d3 midx: expose 'write_midx_file_only()' publicly
    @@ Metadata
     Author: Taylor Blau <me@ttaylorr.com>
     
      ## Commit message ##
    -    midx: expose 'write_midx_file_only()' publicly
    +    midx: expose `write_midx_file_only()` publicly
     
         Expose a variant of the write_midx_file() function which ignores packs
         that aren't included in an explicit "allow" list.
    @@ midx.c: struct write_midx_context {
      
      static void add_pack_to_midx(const char *full_path, size_t full_path_len,
     @@ midx.c: static void add_pack_to_midx(const char *full_path, size_t full_path_len,
    - 
    - 	if (ends_with(file_name, ".idx")) {
      		display_progress(ctx->progress, ++ctx->pack_paths_checked);
    --		if (ctx->m && midx_contains_pack(ctx->m, file_name))
    --			return;
    -+		if (ctx->m) {
    -+			if (midx_contains_pack(ctx->m, file_name))
    -+				return;
    -+		} else if (ctx->to_include) {
    -+			if (!string_list_has_string(ctx->to_include, file_name))
    -+				return;
    -+		}
    + 		if (ctx->m && midx_contains_pack(ctx->m, file_name))
    + 			return;
    ++		else if (ctx->to_include &&
    ++			 !string_list_has_string(ctx->to_include, file_name))
    ++			return;
      
      		ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc);
      
    @@ midx.h: int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pa
      int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
      
      int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
    -+int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, unsigned flags);
    ++int write_midx_file_only(const char *object_dir,
    ++			 struct string_list *packs_to_include,
    ++			 const char *preferred_pack_name,
    ++			 unsigned flags);
      void clear_midx_file(struct repository *r);
      int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
      int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
2:  66aa7f4b48 ! 2:  59556e5545 builtin/multi-pack-index.c: support --stdin-packs mode
    @@ Metadata
     Author: Taylor Blau <me@ttaylorr.com>
     
      ## Commit message ##
    -    builtin/multi-pack-index.c: support --stdin-packs mode
    +    builtin/multi-pack-index.c: support `--stdin-packs` mode
     
         To power a new `--write-midx` mode, `git repack` will want to write a
         multi-pack index containing a certain set of packs in the repository.
    @@ builtin/multi-pack-index.c: static struct option *add_common_options(struct opti
     +static void read_packs_from_stdin(struct string_list *to)
     +{
     +	struct strbuf buf = STRBUF_INIT;
    -+	while (strbuf_getline(&buf, stdin) != EOF) {
    -+		string_list_append(to, strbuf_detach(&buf, NULL));
    -+	}
    ++	while (strbuf_getline(&buf, stdin) != EOF)
    ++		string_list_append(to, buf.buf);
     +	string_list_sort(to);
    ++
    ++	strbuf_release(&buf);
     +}
     +
      static int cmd_multi_pack_index_write(int argc, const char **argv)
    @@ builtin/multi-pack-index.c: static int cmd_multi_pack_index_write(int argc, cons
      	FREE_AND_NULL(options);
      
     +	if (opts.stdin_packs) {
    -+		struct string_list packs = STRING_LIST_INIT_NODUP;
    ++		struct string_list packs = STRING_LIST_INIT_DUP;
     +		int ret;
     +
     +		read_packs_from_stdin(&packs);
3:  f74a967be3 ! 3:  42f1ae9ede midx: preliminary support for `--refs-snapshot`
    @@ midx.h: int fill_midx_entry(struct repository *r, const struct object_id *oid, s
      int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
      
     -int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
    --int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, unsigned flags);
    -+int write_midx_file(const char *object_dir, const char *preferred_pack_name, const char *refs_snapshot, unsigned flags);
    -+int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, const char *refs_snapshot, unsigned flags);
    ++int write_midx_file(const char *object_dir,
    ++		    const char *preferred_pack_name,
    ++		    const char *refs_snapshot,
    ++		    unsigned flags);
    + int write_midx_file_only(const char *object_dir,
    + 			 struct string_list *packs_to_include,
    + 			 const char *preferred_pack_name,
    ++			 const char *refs_snapshot,
    + 			 unsigned flags);
      void clear_midx_file(struct repository *r);
      int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
    - int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
     
      ## t/t5326-multi-pack-bitmaps.sh ##
     @@ t/t5326-multi-pack-bitmaps.sh: test_expect_success 'pack.preferBitmapTips' '
4:  5067c2cc65 = 4:  c0d045a9de builtin/repack.c: keep track of existing packs unconditionally
5:  d5990e3b69 = 5:  09de03de47 builtin/repack.c: extract showing progress to a variable
6:  d32d800954 ! 6:  83dfdb8b12 builtin/repack.c: support writing a MIDX while repacking
    @@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
     +{
     +	struct string_list_item *item;
     +
    -+	for_each_string_list_item(item, existing_kept_packs) {
    ++	for_each_string_list_item(item, existing_kept_packs)
     +		string_list_insert(include, xstrfmt("%s.idx", item->string));
    -+	}
    -+	for_each_string_list_item(item, names) {
    ++	for_each_string_list_item(item, names)
     +		string_list_insert(include, xstrfmt("pack-%s.idx", item->string));
    -+	}
     +	if (geometry) {
     +		struct strbuf buf = STRBUF_INIT;
     +		uint32_t i;
    @@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
     +		return ret;
     +
     +	in = xfdopen(cmd.in, "w");
    -+	for_each_string_list_item(item, include) {
    ++	for_each_string_list_item(item, include)
     +		fprintf(in, "%s\n", item->string);
    -+	}
     +	fclose(in);
     +
     +	return finish_command(&cmd);
7:  515e1d95a6 ! 7:  68bc49d8ae builtin/repack.c: make largest pack preferred
    @@ builtin/repack.c: static void split_pack_geometry(struct pack_geometry *geometry
      
     +static struct packed_git *get_largest_active_pack(struct pack_geometry *geometry)
     +{
    ++	if (!geometry) {
    ++		/*
    ++		 * No geometry means either an all-into-one repack (in which
    ++		 * case there is only one pack left and it is the largest) or an
    ++		 * incremental one.
    ++		 *
    ++		 * If repacking incrementally, then we could check the size of
    ++		 * all packs to determine which should be preferred, but leave
    ++		 * this for later.
    ++		 */
    ++		return NULL;
    ++	}
     +	if (geometry->split == geometry->pack_nr)
     +		return NULL;
     +	return geometry->pack[geometry->pack_nr - 1];
    @@ builtin/repack.c: static void midx_included_packs(struct string_list *include,
      				     int show_progress, int write_bitmaps)
      {
      	struct child_process cmd = CHILD_PROCESS_INIT;
    + 	struct string_list_item *item;
    ++	struct packed_git *largest = get_largest_active_pack(geometry);
    + 	FILE *in;
    + 	int ret;
    + 
     @@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include,
      	if (write_bitmaps)
      		strvec_push(&cmd.args, "--bitmap");
      
    -+	if (geometry) {
    -+		struct packed_git *largest = get_largest_active_pack(geometry);
    -+		if (largest)
    -+			strvec_pushf(&cmd.args, "--preferred-pack=%s",
    -+				     pack_basename(largest));
    -+		else
    -+			/*
    -+			 * The largest pack was repacked, meaning only one pack
    -+			 * exists (and tautologically, it is the largest).
    -+			 */
    -+			;
    -+	}
    ++	if (largest)
    ++		strvec_pushf(&cmd.args, "--preferred-pack=%s",
    ++			     pack_basename(largest));
     +
      	ret = start_command(&cmd);
      	if (ret)
    @@ t/t7703-repack-geometric.sh: test_expect_success '--geometric ignores kept packs
     +	test_when_finished "rm -fr geometric" &&
     +	(
     +		cd geometric &&
    -+		git config core.multiPackIndex true &&
     +
     +		# These packs already form a geometric progression.
     +		test_commit_bulk --start=1 1 && # 3 objects
8:  74a9da0ef0 ! 8:  eb24b308ec builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
    @@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
     +	return 0;
     +}
     +
    -+static int midx_snapshot_refs(struct tempfile *f)
    ++static void midx_snapshot_refs(struct tempfile *f)
     +{
     +	struct midx_snapshot_ref_data data;
     +	const struct string_list *preferred = bitmap_preferred_tips(the_repository);
    @@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
     +		struct string_list_item *item;
     +
     +		data.preferred = 1;
    -+		for_each_string_list_item(item, preferred) {
    ++		for_each_string_list_item(item, preferred)
     +			for_each_ref_in(item->string, midx_snapshot_ref_one, &data);
    -+		}
     +		data.preferred = 0;
     +	}
     +
     +	for_each_ref(midx_snapshot_ref_one, &data);
     +
    -+	return close_tempfile_gently(f);
    ++	if (close_tempfile_gently(f)) {
    ++		int save_errno = errno;
    ++		delete_tempfile(&f);
    ++		errno = save_errno;
    ++		die_errno(_("could not close refs snapshot tempfile"));
    ++	}
     +}
     +
      static void midx_included_packs(struct string_list *include,
    @@ builtin/repack.c: static void midx_included_packs(struct string_list *include,
      {
      	struct child_process cmd = CHILD_PROCESS_INIT;
     @@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include,
    - 			;
    - 	}
    + 		strvec_pushf(&cmd.args, "--preferred-pack=%s",
    + 			     pack_basename(largest));
      
     +	if (refs_snapshot)
     +		strvec_pushf(&cmd.args, "--refs-snapshot=%s", refs_snapshot);
    @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
     +			    "bitmap-ref-tips");
     +
     +		refs_snapshot = xmks_tempfile(path.buf);
    -+		if (midx_snapshot_refs(refs_snapshot) < 0)
    -+			die(_("could not take a snapshot of references"));
    ++		midx_snapshot_refs(refs_snapshot);
     +
     +		strbuf_release(&path);
     +	}
-- 
2.33.0.96.g73915697e6

  parent reply	other threads:[~2021-09-15 18:24 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-11  3:32 [PATCH 0/8] repack: introduce `--write-midx` Taylor Blau
2021-09-11  3:32 ` [PATCH 1/8] midx: expose 'write_midx_file_only()' publicly Taylor Blau
2021-09-11  5:00   ` Junio C Hamano
2021-09-11 16:17     ` Taylor Blau
2021-09-11 10:07   ` Ævar Arnfjörð Bjarmason
2021-09-11 16:21     ` Taylor Blau
2021-09-11  3:32 ` [PATCH 2/8] builtin/multi-pack-index.c: support --stdin-packs mode Taylor Blau
2021-09-11 10:05   ` Ævar Arnfjörð Bjarmason
2021-09-11 16:25     ` Taylor Blau
2021-09-11 16:28       ` Taylor Blau
2021-09-12  2:08       ` Eric Sunshine
2021-09-12  2:21         ` Taylor Blau
2021-09-12 15:15           ` Ævar Arnfjörð Bjarmason
2021-09-12 22:30             ` Junio C Hamano
2021-09-12 22:32               ` Ævar Arnfjörð Bjarmason
2021-09-14 19:02       ` Jeff King
2021-09-14 23:48         ` Taylor Blau
2021-09-15  1:55           ` Eric Sunshine
2021-09-11  3:32 ` [PATCH 3/8] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-11  3:32 ` [PATCH 4/8] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-11  3:32 ` [PATCH 5/8] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-11  3:32 ` [PATCH 6/8] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-11  3:32 ` [PATCH 7/8] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-11 10:17   ` Ævar Arnfjörð Bjarmason
2021-09-11 16:35     ` Taylor Blau
2021-09-11  3:32 ` [PATCH 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-11 10:27   ` Ævar Arnfjörð Bjarmason
2021-09-11 11:19     ` Ævar Arnfjörð Bjarmason
2021-09-11 16:51       ` Taylor Blau
2021-09-14 18:55         ` Jeff King
2021-09-14 23:34           ` Taylor Blau
2021-09-14 23:56             ` Ævar Arnfjörð Bjarmason
2021-09-15  4:31               ` Taylor Blau
2021-09-11 16:49     ` Taylor Blau
2021-09-15 18:24 ` Taylor Blau [this message]
2021-09-15 18:24   ` [PATCH v2 1/8] midx: expose `write_midx_file_only()` publicly Taylor Blau
2021-09-22 23:14     ` Jonathan Tan
2021-09-23  3:09       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 2/8] builtin/multi-pack-index.c: support `--stdin-packs` mode Taylor Blau
2021-09-22 22:29     ` Josh Steadmon
2021-09-23  2:03       ` Taylor Blau
2021-09-22 23:11     ` Jonathan Tan
2021-09-23  2:06       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 3/8] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-22 22:34     ` Josh Steadmon
2021-09-23  2:08       ` Taylor Blau
2021-09-22 23:00     ` Jonathan Tan
2021-09-23  2:18       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 4/8] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-22 22:56     ` Jonathan Tan
2021-09-23  2:59       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 5/8] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-15 18:24   ` [PATCH v2 6/8] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-22 22:39     ` Jonathan Tan
2021-09-23  2:40       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 7/8] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-15 18:24   ` [PATCH v2 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-24 18:22     ` Jonathan Tan
2021-10-01 22:38       ` Taylor Blau
2021-09-15 19:22   ` [PATCH v2 0/8] repack: introduce `--write-midx` Junio C Hamano
2021-09-15 19:29     ` Junio C Hamano
2021-09-15 21:19       ` Taylor Blau
2021-09-16 22:16         ` Junio C Hamano
2021-09-29  1:54   ` [PATCH v3 0/9] " Taylor Blau
2021-09-29  1:55     ` [PATCH v3 1/9] midx: expose `write_midx_file_only()` publicly Taylor Blau
2021-09-29  1:55     ` [PATCH v3 2/9] builtin/multi-pack-index.c: support `--stdin-packs` mode Taylor Blau
2021-09-29  1:55     ` [PATCH v3 3/9] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-29  1:55     ` [PATCH v3 4/9] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-29  1:55     ` [PATCH v3 5/9] builtin/repack.c: rename variables that deal with non-kept packs Taylor Blau
2021-09-29  1:55     ` [PATCH v3 6/9] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-29  1:55     ` [PATCH v3 7/9] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-29  1:55     ` [PATCH v3 8/9] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-29  1:55     ` [PATCH v3 9/9] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-29  4:24     ` [PATCH v3 0/9] repack: introduce `--write-midx` Junio C Hamano
2021-10-01 20:01     ` Jonathan Tan
2021-10-01 22:40       ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1631730270.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).