All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: git@vger.kernel.org
Cc: peff@peff.net, gitster@pobox.com, avarab@gmail.com
Subject: [PATCH v2 0/8] repack: introduce `--write-midx`
Date: Wed, 15 Sep 2021 14:24:33 -0400	[thread overview]
Message-ID: <cover.1631730270.git.me@ttaylorr.com> (raw)
In-Reply-To: <cover.1631331139.git.me@ttaylorr.com>

Here is a small reroll of my series to implement the final component of
multi-pack reachability bitamps, which is to be able to write them from `git
repack`.

Nothing substantial has changed since last time. Mostly re-wrapping lines and
removing braces in macro'd for_each_xyz() loops. A larger (but still small)
change is how we read input to `--refs-snapshot`. This version contains a more
idiomatic implementation, but see the sub-thread beginning at [1] for a more
complete discussion.

Otherwise, this series is unchagned. It depends on tb/multi-pack-bitmaps, which
has graduated to master. Range-diff below:

[1]: https://lore.kernel.org/git/2a16f11790b79ab452233b6f28acac607c0abd28.1631331139.git.me@ttaylorr.com/

Taylor Blau (8):
  midx: expose `write_midx_file_only()` publicly
  builtin/multi-pack-index.c: support `--stdin-packs` mode
  midx: preliminary support for `--refs-snapshot`
  builtin/repack.c: keep track of existing packs unconditionally
  builtin/repack.c: extract showing progress to a variable
  builtin/repack.c: support writing a MIDX while repacking
  builtin/repack.c: make largest pack preferred
  builtin/repack.c: pass `--refs-snapshot` when writing bitmaps

 Documentation/git-multi-pack-index.txt |  19 ++
 Documentation/git-repack.txt           |  18 +-
 builtin/multi-pack-index.c             |  36 +++-
 builtin/repack.c                       | 255 ++++++++++++++++++++++---
 midx.c                                 |  96 ++++++++--
 midx.h                                 |  11 +-
 pack-bitmap.c                          |   2 +-
 pack-bitmap.h                          |   1 +
 t/helper/test-read-midx.c              |  25 ++-
 t/lib-midx.sh                          |   8 +
 t/t5319-multi-pack-index.sh            |  15 ++
 t/t5326-multi-pack-bitmaps.sh          |  82 ++++++++
 t/t7700-repack.sh                      |  96 ++++++++++
 t/t7703-repack-geometric.sh            |  22 +++
 14 files changed, 637 insertions(+), 49 deletions(-)
 create mode 100644 t/lib-midx.sh

Range-diff against v1:
1:  6d50f46080 ! 1:  03c1b2c4d3 midx: expose 'write_midx_file_only()' publicly
    @@ Metadata
     Author: Taylor Blau <me@ttaylorr.com>
     
      ## Commit message ##
    -    midx: expose 'write_midx_file_only()' publicly
    +    midx: expose `write_midx_file_only()` publicly
     
         Expose a variant of the write_midx_file() function which ignores packs
         that aren't included in an explicit "allow" list.
    @@ midx.c: struct write_midx_context {
      
      static void add_pack_to_midx(const char *full_path, size_t full_path_len,
     @@ midx.c: static void add_pack_to_midx(const char *full_path, size_t full_path_len,
    - 
    - 	if (ends_with(file_name, ".idx")) {
      		display_progress(ctx->progress, ++ctx->pack_paths_checked);
    --		if (ctx->m && midx_contains_pack(ctx->m, file_name))
    --			return;
    -+		if (ctx->m) {
    -+			if (midx_contains_pack(ctx->m, file_name))
    -+				return;
    -+		} else if (ctx->to_include) {
    -+			if (!string_list_has_string(ctx->to_include, file_name))
    -+				return;
    -+		}
    + 		if (ctx->m && midx_contains_pack(ctx->m, file_name))
    + 			return;
    ++		else if (ctx->to_include &&
    ++			 !string_list_has_string(ctx->to_include, file_name))
    ++			return;
      
      		ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc);
      
    @@ midx.h: int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pa
      int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
      
      int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
    -+int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, unsigned flags);
    ++int write_midx_file_only(const char *object_dir,
    ++			 struct string_list *packs_to_include,
    ++			 const char *preferred_pack_name,
    ++			 unsigned flags);
      void clear_midx_file(struct repository *r);
      int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
      int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
2:  66aa7f4b48 ! 2:  59556e5545 builtin/multi-pack-index.c: support --stdin-packs mode
    @@ Metadata
     Author: Taylor Blau <me@ttaylorr.com>
     
      ## Commit message ##
    -    builtin/multi-pack-index.c: support --stdin-packs mode
    +    builtin/multi-pack-index.c: support `--stdin-packs` mode
     
         To power a new `--write-midx` mode, `git repack` will want to write a
         multi-pack index containing a certain set of packs in the repository.
    @@ builtin/multi-pack-index.c: static struct option *add_common_options(struct opti
     +static void read_packs_from_stdin(struct string_list *to)
     +{
     +	struct strbuf buf = STRBUF_INIT;
    -+	while (strbuf_getline(&buf, stdin) != EOF) {
    -+		string_list_append(to, strbuf_detach(&buf, NULL));
    -+	}
    ++	while (strbuf_getline(&buf, stdin) != EOF)
    ++		string_list_append(to, buf.buf);
     +	string_list_sort(to);
    ++
    ++	strbuf_release(&buf);
     +}
     +
      static int cmd_multi_pack_index_write(int argc, const char **argv)
    @@ builtin/multi-pack-index.c: static int cmd_multi_pack_index_write(int argc, cons
      	FREE_AND_NULL(options);
      
     +	if (opts.stdin_packs) {
    -+		struct string_list packs = STRING_LIST_INIT_NODUP;
    ++		struct string_list packs = STRING_LIST_INIT_DUP;
     +		int ret;
     +
     +		read_packs_from_stdin(&packs);
3:  f74a967be3 ! 3:  42f1ae9ede midx: preliminary support for `--refs-snapshot`
    @@ midx.h: int fill_midx_entry(struct repository *r, const struct object_id *oid, s
      int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
      
     -int write_midx_file(const char *object_dir, const char *preferred_pack_name, unsigned flags);
    --int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, unsigned flags);
    -+int write_midx_file(const char *object_dir, const char *preferred_pack_name, const char *refs_snapshot, unsigned flags);
    -+int write_midx_file_only(const char *object_dir, struct string_list *packs_to_include, const char *preferred_pack_name, const char *refs_snapshot, unsigned flags);
    ++int write_midx_file(const char *object_dir,
    ++		    const char *preferred_pack_name,
    ++		    const char *refs_snapshot,
    ++		    unsigned flags);
    + int write_midx_file_only(const char *object_dir,
    + 			 struct string_list *packs_to_include,
    + 			 const char *preferred_pack_name,
    ++			 const char *refs_snapshot,
    + 			 unsigned flags);
      void clear_midx_file(struct repository *r);
      int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
    - int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
     
      ## t/t5326-multi-pack-bitmaps.sh ##
     @@ t/t5326-multi-pack-bitmaps.sh: test_expect_success 'pack.preferBitmapTips' '
4:  5067c2cc65 = 4:  c0d045a9de builtin/repack.c: keep track of existing packs unconditionally
5:  d5990e3b69 = 5:  09de03de47 builtin/repack.c: extract showing progress to a variable
6:  d32d800954 ! 6:  83dfdb8b12 builtin/repack.c: support writing a MIDX while repacking
    @@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
     +{
     +	struct string_list_item *item;
     +
    -+	for_each_string_list_item(item, existing_kept_packs) {
    ++	for_each_string_list_item(item, existing_kept_packs)
     +		string_list_insert(include, xstrfmt("%s.idx", item->string));
    -+	}
    -+	for_each_string_list_item(item, names) {
    ++	for_each_string_list_item(item, names)
     +		string_list_insert(include, xstrfmt("pack-%s.idx", item->string));
    -+	}
     +	if (geometry) {
     +		struct strbuf buf = STRBUF_INIT;
     +		uint32_t i;
    @@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
     +		return ret;
     +
     +	in = xfdopen(cmd.in, "w");
    -+	for_each_string_list_item(item, include) {
    ++	for_each_string_list_item(item, include)
     +		fprintf(in, "%s\n", item->string);
    -+	}
     +	fclose(in);
     +
     +	return finish_command(&cmd);
7:  515e1d95a6 ! 7:  68bc49d8ae builtin/repack.c: make largest pack preferred
    @@ builtin/repack.c: static void split_pack_geometry(struct pack_geometry *geometry
      
     +static struct packed_git *get_largest_active_pack(struct pack_geometry *geometry)
     +{
    ++	if (!geometry) {
    ++		/*
    ++		 * No geometry means either an all-into-one repack (in which
    ++		 * case there is only one pack left and it is the largest) or an
    ++		 * incremental one.
    ++		 *
    ++		 * If repacking incrementally, then we could check the size of
    ++		 * all packs to determine which should be preferred, but leave
    ++		 * this for later.
    ++		 */
    ++		return NULL;
    ++	}
     +	if (geometry->split == geometry->pack_nr)
     +		return NULL;
     +	return geometry->pack[geometry->pack_nr - 1];
    @@ builtin/repack.c: static void midx_included_packs(struct string_list *include,
      				     int show_progress, int write_bitmaps)
      {
      	struct child_process cmd = CHILD_PROCESS_INIT;
    + 	struct string_list_item *item;
    ++	struct packed_git *largest = get_largest_active_pack(geometry);
    + 	FILE *in;
    + 	int ret;
    + 
     @@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include,
      	if (write_bitmaps)
      		strvec_push(&cmd.args, "--bitmap");
      
    -+	if (geometry) {
    -+		struct packed_git *largest = get_largest_active_pack(geometry);
    -+		if (largest)
    -+			strvec_pushf(&cmd.args, "--preferred-pack=%s",
    -+				     pack_basename(largest));
    -+		else
    -+			/*
    -+			 * The largest pack was repacked, meaning only one pack
    -+			 * exists (and tautologically, it is the largest).
    -+			 */
    -+			;
    -+	}
    ++	if (largest)
    ++		strvec_pushf(&cmd.args, "--preferred-pack=%s",
    ++			     pack_basename(largest));
     +
      	ret = start_command(&cmd);
      	if (ret)
    @@ t/t7703-repack-geometric.sh: test_expect_success '--geometric ignores kept packs
     +	test_when_finished "rm -fr geometric" &&
     +	(
     +		cd geometric &&
    -+		git config core.multiPackIndex true &&
     +
     +		# These packs already form a geometric progression.
     +		test_commit_bulk --start=1 1 && # 3 objects
8:  74a9da0ef0 ! 8:  eb24b308ec builtin/repack.c: pass `--refs-snapshot` when writing bitmaps
    @@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
     +	return 0;
     +}
     +
    -+static int midx_snapshot_refs(struct tempfile *f)
    ++static void midx_snapshot_refs(struct tempfile *f)
     +{
     +	struct midx_snapshot_ref_data data;
     +	const struct string_list *preferred = bitmap_preferred_tips(the_repository);
    @@ builtin/repack.c: static void clear_pack_geometry(struct pack_geometry *geometry
     +		struct string_list_item *item;
     +
     +		data.preferred = 1;
    -+		for_each_string_list_item(item, preferred) {
    ++		for_each_string_list_item(item, preferred)
     +			for_each_ref_in(item->string, midx_snapshot_ref_one, &data);
    -+		}
     +		data.preferred = 0;
     +	}
     +
     +	for_each_ref(midx_snapshot_ref_one, &data);
     +
    -+	return close_tempfile_gently(f);
    ++	if (close_tempfile_gently(f)) {
    ++		int save_errno = errno;
    ++		delete_tempfile(&f);
    ++		errno = save_errno;
    ++		die_errno(_("could not close refs snapshot tempfile"));
    ++	}
     +}
     +
      static void midx_included_packs(struct string_list *include,
    @@ builtin/repack.c: static void midx_included_packs(struct string_list *include,
      {
      	struct child_process cmd = CHILD_PROCESS_INIT;
     @@ builtin/repack.c: static int write_midx_included_packs(struct string_list *include,
    - 			;
    - 	}
    + 		strvec_pushf(&cmd.args, "--preferred-pack=%s",
    + 			     pack_basename(largest));
      
     +	if (refs_snapshot)
     +		strvec_pushf(&cmd.args, "--refs-snapshot=%s", refs_snapshot);
    @@ builtin/repack.c: int cmd_repack(int argc, const char **argv, const char *prefix
     +			    "bitmap-ref-tips");
     +
     +		refs_snapshot = xmks_tempfile(path.buf);
    -+		if (midx_snapshot_refs(refs_snapshot) < 0)
    -+			die(_("could not take a snapshot of references"));
    ++		midx_snapshot_refs(refs_snapshot);
     +
     +		strbuf_release(&path);
     +	}
-- 
2.33.0.96.g73915697e6

  parent reply	other threads:[~2021-09-15 18:24 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-09-11  3:32 [PATCH 0/8] repack: introduce `--write-midx` Taylor Blau
2021-09-11  3:32 ` [PATCH 1/8] midx: expose 'write_midx_file_only()' publicly Taylor Blau
2021-09-11  5:00   ` Junio C Hamano
2021-09-11 16:17     ` Taylor Blau
2021-09-11 10:07   ` Ævar Arnfjörð Bjarmason
2021-09-11 16:21     ` Taylor Blau
2021-09-11  3:32 ` [PATCH 2/8] builtin/multi-pack-index.c: support --stdin-packs mode Taylor Blau
2021-09-11 10:05   ` Ævar Arnfjörð Bjarmason
2021-09-11 16:25     ` Taylor Blau
2021-09-11 16:28       ` Taylor Blau
2021-09-12  2:08       ` Eric Sunshine
2021-09-12  2:21         ` Taylor Blau
2021-09-12 15:15           ` Ævar Arnfjörð Bjarmason
2021-09-12 22:30             ` Junio C Hamano
2021-09-12 22:32               ` Ævar Arnfjörð Bjarmason
2021-09-14 19:02       ` Jeff King
2021-09-14 23:48         ` Taylor Blau
2021-09-15  1:55           ` Eric Sunshine
2021-09-11  3:32 ` [PATCH 3/8] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-11  3:32 ` [PATCH 4/8] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-11  3:32 ` [PATCH 5/8] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-11  3:32 ` [PATCH 6/8] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-11  3:32 ` [PATCH 7/8] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-11 10:17   ` Ævar Arnfjörð Bjarmason
2021-09-11 16:35     ` Taylor Blau
2021-09-11  3:32 ` [PATCH 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-11 10:27   ` Ævar Arnfjörð Bjarmason
2021-09-11 11:19     ` Ævar Arnfjörð Bjarmason
2021-09-11 16:51       ` Taylor Blau
2021-09-14 18:55         ` Jeff King
2021-09-14 23:34           ` Taylor Blau
2021-09-14 23:56             ` Ævar Arnfjörð Bjarmason
2021-09-15  4:31               ` Taylor Blau
2021-09-11 16:49     ` Taylor Blau
2021-09-15 18:24 ` Taylor Blau [this message]
2021-09-15 18:24   ` [PATCH v2 1/8] midx: expose `write_midx_file_only()` publicly Taylor Blau
2021-09-22 23:14     ` Jonathan Tan
2021-09-23  3:09       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 2/8] builtin/multi-pack-index.c: support `--stdin-packs` mode Taylor Blau
2021-09-22 22:29     ` Josh Steadmon
2021-09-23  2:03       ` Taylor Blau
2021-09-22 23:11     ` Jonathan Tan
2021-09-23  2:06       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 3/8] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-22 22:34     ` Josh Steadmon
2021-09-23  2:08       ` Taylor Blau
2021-09-22 23:00     ` Jonathan Tan
2021-09-23  2:18       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 4/8] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-22 22:56     ` Jonathan Tan
2021-09-23  2:59       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 5/8] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-15 18:24   ` [PATCH v2 6/8] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-22 22:39     ` Jonathan Tan
2021-09-23  2:40       ` Taylor Blau
2021-09-15 18:24   ` [PATCH v2 7/8] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-15 18:24   ` [PATCH v2 8/8] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-24 18:22     ` Jonathan Tan
2021-10-01 22:38       ` Taylor Blau
2021-09-15 19:22   ` [PATCH v2 0/8] repack: introduce `--write-midx` Junio C Hamano
2021-09-15 19:29     ` Junio C Hamano
2021-09-15 21:19       ` Taylor Blau
2021-09-16 22:16         ` Junio C Hamano
2021-09-29  1:54   ` [PATCH v3 0/9] " Taylor Blau
2021-09-29  1:55     ` [PATCH v3 1/9] midx: expose `write_midx_file_only()` publicly Taylor Blau
2021-09-29  1:55     ` [PATCH v3 2/9] builtin/multi-pack-index.c: support `--stdin-packs` mode Taylor Blau
2021-09-29  1:55     ` [PATCH v3 3/9] midx: preliminary support for `--refs-snapshot` Taylor Blau
2021-09-29  1:55     ` [PATCH v3 4/9] builtin/repack.c: keep track of existing packs unconditionally Taylor Blau
2021-09-29  1:55     ` [PATCH v3 5/9] builtin/repack.c: rename variables that deal with non-kept packs Taylor Blau
2021-09-29  1:55     ` [PATCH v3 6/9] builtin/repack.c: extract showing progress to a variable Taylor Blau
2021-09-29  1:55     ` [PATCH v3 7/9] builtin/repack.c: support writing a MIDX while repacking Taylor Blau
2021-09-29  1:55     ` [PATCH v3 8/9] builtin/repack.c: make largest pack preferred Taylor Blau
2021-09-29  1:55     ` [PATCH v3 9/9] builtin/repack.c: pass `--refs-snapshot` when writing bitmaps Taylor Blau
2021-09-29  4:24     ` [PATCH v3 0/9] repack: introduce `--write-midx` Junio C Hamano
2021-10-01 20:01     ` Jonathan Tan
2021-10-01 22:40       ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cover.1631730270.git.me@ttaylorr.com \
    --to=me@ttaylorr.com \
    --cc=avarab@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.