All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/10] gc: enable cruft packs by default
@ 2023-04-17 20:54 Taylor Blau
  2023-04-17 20:54 ` [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
                   ` (11 more replies)
  0 siblings, 12 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

This series graduates `gc.cruftPacks` out of `feature.experimental` and
enables it by default, meaning that a bare `git gc` will generate a
cruft pack.

The main benefits are described in detail in the second to last patch,
but a brief summary is that:

  - cruft packs avoid an issue where repositories with many unreachable
    objects that were written too recently to be pruned causes an
    explosion of loose object files

  - cruft packs allow pairs of unreachable objects with similar content
    to delta with each other since they can be represented in a single
    pack instead of as individual loose objects.

Cruft packs have been in the tree for the vast majority of the past
calendar year, and have been in use at GitHub for even longer without
issue.

The series is structured as follows:

  - The first two patches are cleanup.
  - The third patches fixes an oversight where the code for `git gc`'s
    `--keep-largest-pack` option would incorrectly consider cruft packs.
  - The next five patches are test preparation.
  - Then the substantive patch, to actually graduate `gc.cruftPacks` and
    enable it by default.
  - The final patch is some cleanup that can only take place towards the
    end of the series.

Cruft packs have been a success for us at GitHub, and I am really
excited to get it in the hands of more users by default.

Thanks in advance for your review.

-Taylor

Taylor Blau (10):
  pack-write.c: plug a leak in stage_tmp_packfiles()
  builtin/repack.c: fix incorrect reference to '-C'
  builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  t/t5304-prune.sh: prepare for `gc --cruft` by default
  t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  t/t6500-gc.sh: refactor cruft pack tests
  t/t6500-gc.sh: add additional test cases
  t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  builtin/gc.c: make `gc.cruftPacks` enabled by default
  repository.h: drop unused `gc_cruft_packs`

 Documentation/config/feature.txt |   3 -
 Documentation/config/gc.txt      |  12 +--
 Documentation/git-gc.txt         |  12 +--
 Documentation/gitformat-pack.txt |   4 +-
 builtin/gc.c                     |   8 +-
 builtin/repack.c                 |   2 +-
 pack-write.c                     |  14 ++--
 repo-settings.c                  |   4 +-
 repository.h                     |   1 -
 t/t5304-prune.sh                 |  28 +++----
 t/t6500-gc.sh                    | 136 ++++++++++++++++---------------
 t/t6501-freshen-objects.sh       |  10 +--
 t/t9300-fast-import.sh           |  13 +--
 13 files changed, 121 insertions(+), 126 deletions(-)

-- 
2.38.1

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles()
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-18 10:30   ` Jeff King
  2023-04-17 20:54 ` [PATCH 02/10] builtin/repack.c: fix incorrect reference to '-C' Taylor Blau
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

The function `stage_tmp_packfiles()` generates a filename to use for
staging the contents of what will become the pack's ".mtimes" file.

The name is generated in `write_mtimes_file()` and the result is
returned back to `stage_tmp_packfiles()` which uses it to rename the
temporary file into place via `rename_tmp_packfiles()`.

`write_mtimes_file()` returns a `const char *`, indicating that callers
are not expected to free its result (similar to, e.g., `oid_to_hex()`).
But callers are expected to free its result, so this return type is
incorrect.

Change the function's signature to return a non-const `char *`, and free
it at the end of `stage_tmp_packfiles()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-write.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/pack-write.c b/pack-write.c
index f171405495..4da0ccc5f5 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -312,13 +312,13 @@ static void write_mtimes_trailer(struct hashfile *f, const unsigned char *hash)
 	hashwrite(f, hash, the_hash_algo->rawsz);
 }
 
-static const char *write_mtimes_file(struct packing_data *to_pack,
-				     struct pack_idx_entry **objects,
-				     uint32_t nr_objects,
-				     const unsigned char *hash)
+static char *write_mtimes_file(struct packing_data *to_pack,
+			       struct pack_idx_entry **objects,
+			       uint32_t nr_objects,
+			       const unsigned char *hash)
 {
 	struct strbuf tmp_file = STRBUF_INIT;
-	const char *mtimes_name;
+	char *mtimes_name;
 	struct hashfile *f;
 	int fd;
 
@@ -544,7 +544,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 char **idx_tmp_name)
 {
 	const char *rev_tmp_name = NULL;
-	const char *mtimes_tmp_name = NULL;
+	char *mtimes_tmp_name = NULL;
 
 	if (adjust_shared_perm(pack_tmp_name))
 		die_errno("unable to make temporary pack file readable");
@@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
 	if (mtimes_tmp_name)
 		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
+
+	free(mtimes_tmp_name);
 }
 
 void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 02/10] builtin/repack.c: fix incorrect reference to '-C'
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
  2023-04-17 20:54 ` [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-17 20:54 ` [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
                   ` (9 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

When cruft packs were originally being developed, `-C` was designated as
the short-form for `--cruft` (as in `git repack -C`).

This was dropped due to confusion with Git's top-level `-C` option
before submitting to the list. But the reference to it in
`--cruft-expiration`'s help text was never updated. Fix that dangling
reference in this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/repack.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index df4d8e0f0b..d9eee15c2f 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -774,7 +774,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("same as -a, pack unreachable cruft objects separately"),
 				   PACK_CRUFT),
 		OPT_STRING(0, "cruft-expiration", &cruft_expiration, N_("approxidate"),
-				N_("with -C, expire objects older than this")),
+				N_("with --cruft, expire objects older than this")),
 		OPT_BOOL('d', NULL, &delete_redundant,
 				N_("remove redundant packs, and run git-prune-packed")),
 		OPT_BOOL('f', NULL, &po_args.no_reuse_delta,
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
  2023-04-17 20:54 ` [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
  2023-04-17 20:54 ` [PATCH 02/10] builtin/repack.c: fix incorrect reference to '-C' Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-17 22:54   ` Junio C Hamano
  2023-04-17 20:54 ` [PATCH 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default Taylor Blau
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

When cruft packs were implemented, we never adjusted the code for `git
gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft
packs. This option and configuration option share a common
implementation, but including cruft packs is wrong in both cases:

  - Running `git gc --keep-largest-pack` in a repository where the
    largest pack is the cruft pack itself will make it impossible for
    `git gc` to prune objects, since the cruft pack itself is kept.

  - The same is true for `gc.bigPackThreshold`, if the size of the cruft
    pack exceeds the limit set by the caller.

Ignore cruft packs in the common implementation for both of these
options, and add a pair of tests to prevent any future regressions here.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config/gc.txt | 10 ++++-----
 Documentation/git-gc.txt    |  7 +++---
 builtin/gc.c                |  2 +-
 t/t6500-gc.sh               | 43 +++++++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 38fea076a2..8d5353e9e0 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -43,11 +43,11 @@ gc.autoDetach::
 	if the system supports it. Default is true.
 
 gc.bigPackThreshold::
-	If non-zero, all packs larger than this limit are kept when
-	`git gc` is run. This is very similar to `--keep-largest-pack`
-	except that all packs that meet the threshold are kept, not
-	just the largest pack. Defaults to zero. Common unit suffixes of
-	'k', 'm', or 'g' are supported.
+	If non-zero, all non-cruft packs larger than this limit are kept
+	when `git gc` is run. This is very similar to
+	`--keep-largest-pack` except that all non-cruft packs that meet
+	the threshold are kept, not just the largest pack. Defaults to
+	zero. Common unit suffixes of 'k', 'm', or 'g' are supported.
 +
 Note that if the number of kept packs is more than gc.autoPackLimit,
 this configuration variable is ignored, all packs except the base pack
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index a65c9aa62d..2427478314 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -77,9 +77,10 @@ be performed as well.
 	instance running on this repository.
 
 --keep-largest-pack::
-	All packs except the largest pack and those marked with a
-	`.keep` files are consolidated into a single pack. When this
-	option is used, `gc.bigPackThreshold` is ignored.
+	All packs except the largest pack, any packs marked with a
+	`.keep` file, and any cruft pack(s) are consolidated into a
+	single pack. When this option is used, `gc.bigPackThreshold` is
+	ignored.
 
 AGGRESSIVE
 ----------
diff --git a/builtin/gc.c b/builtin/gc.c
index edd98d35a5..53ef137e1d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -219,7 +219,7 @@ static struct packed_git *find_base_packs(struct string_list *packs,
 	struct packed_git *p, *base = NULL;
 
 	for (p = get_all_packs(the_repository); p; p = p->next) {
-		if (!p->pack_local)
+		if (!p->pack_local || p->is_cruft)
 			continue;
 		if (limit) {
 			if (p->pack_size >= limit)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index d9acb63951..df6f2e6e52 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -298,6 +298,49 @@ test_expect_success 'feature.experimental=false avoids cruft packs by default' '
 	)
 '
 
+test_expect_success '--keep-largest-pack ignores cruft packs' '
+	test_when_finished "rm -fr repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Generate a pack for reachable objects (of which there
+		# are 3), and one for unreachable objects (of which
+		# there are 6).
+		prepare_cruft_history &&
+		git gc --cruft &&
+
+		mtimes="$(find .git/objects/pack -type f -name "pack-*.mtimes")" &&
+		sz="$(test_file_size "${mtimes%.mtimes}.pack")" &&
+
+		# Ensure that the cruft pack gets removed (due to
+		# `--prune=now`) despite it being the largest pack.
+		git -c gc.bigPackThreshold=$sz gc --cruft --prune=now &&
+
+		assert_no_cruft_packs
+	)
+'
+
+test_expect_success 'gc.bigPackThreshold ignores cruft packs' '
+	test_when_finished "rm -fr repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Generate a pack for reachable objects (of which there
+		# are 3), and one for unreachable objects (of which
+		# there are 6).
+		prepare_cruft_history &&
+		git gc --cruft &&
+
+		# Ensure that the cruft pack gets removed (due to
+		# `--prune=now`) despite it being the largest pack.
+		git gc --cruft --prune=now --keep-largest-pack &&
+
+		assert_no_cruft_packs
+	)
+'
+
 run_and_wait_for_auto_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (2 preceding siblings ...)
  2023-04-17 20:54 ` [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-17 20:54 ` [PATCH 05/10] t/t9300-fast-import.sh: " Taylor Blau
                   ` (7 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

Many of the tests in t5304 run `git gc`, and rely on its behavior that
unreachable-but-recent objects are written out loose. This is sensible,
since t5304 deals specifically with this kind of pruning.

If left unattended, however, this test would break when the default
behavior of a bare "git gc" is adjusted to generate a cruft pack by
default.

Ensure that these tests continue to work as-is (and continue to provide
coverage of loose object pruning) by passing `--no-cruft` explicitly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5304-prune.sh | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 5500dd0842..662ae9b152 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -62,11 +62,11 @@ test_expect_success 'prune --expire' '
 test_expect_success 'gc: implicit prune --expire' '
 	add_blob &&
 	test-tool chmtime =-$((2*$week-30)) $BLOB_FILE &&
-	git gc &&
+	git gc --no-cruft &&
 	verbose test $((1 + $before)) = $(git count-objects | sed "s/ .*//") &&
 	test_path_is_file $BLOB_FILE &&
 	test-tool chmtime =-$((2*$week+1)) $BLOB_FILE &&
-	git gc &&
+	git gc --no-cruft &&
 	verbose test $before = $(git count-objects | sed "s/ .*//") &&
 	test_path_is_missing $BLOB_FILE
 '
@@ -86,7 +86,7 @@ test_expect_success 'gc: refuse to start with invalid gc.pruneExpire' '
 
 test_expect_success 'gc: start with ok gc.pruneExpire' '
 	git config gc.pruneExpire 2.days.ago &&
-	git gc
+	git gc --no-cruft
 '
 
 test_expect_success 'prune: prune nonsense parameters' '
@@ -137,44 +137,44 @@ test_expect_success 'gc --no-prune' '
 	add_blob &&
 	test-tool chmtime =-$((5001*$day)) $BLOB_FILE &&
 	git config gc.pruneExpire 2.days.ago &&
-	git gc --no-prune &&
+	git gc --no-prune --no-cruft &&
 	verbose test 1 = $(git count-objects | sed "s/ .*//") &&
 	test_path_is_file $BLOB_FILE
 '
 
 test_expect_success 'gc respects gc.pruneExpire' '
 	git config gc.pruneExpire 5002.days.ago &&
-	git gc &&
+	git gc --no-cruft &&
 	test_path_is_file $BLOB_FILE &&
 	git config gc.pruneExpire 5000.days.ago &&
-	git gc &&
+	git gc --no-cruft &&
 	test_path_is_missing $BLOB_FILE
 '
 
 test_expect_success 'gc --prune=<date>' '
 	add_blob &&
 	test-tool chmtime =-$((5001*$day)) $BLOB_FILE &&
-	git gc --prune=5002.days.ago &&
+	git gc --prune=5002.days.ago --no-cruft &&
 	test_path_is_file $BLOB_FILE &&
-	git gc --prune=5000.days.ago &&
+	git gc --prune=5000.days.ago --no-cruft &&
 	test_path_is_missing $BLOB_FILE
 '
 
 test_expect_success 'gc --prune=never' '
 	add_blob &&
-	git gc --prune=never &&
+	git gc --prune=never --no-cruft &&
 	test_path_is_file $BLOB_FILE &&
-	git gc --prune=now &&
+	git gc --prune=now --no-cruft &&
 	test_path_is_missing $BLOB_FILE
 '
 
 test_expect_success 'gc respects gc.pruneExpire=never' '
 	git config gc.pruneExpire never &&
 	add_blob &&
-	git gc &&
+	git gc --no-cruft &&
 	test_path_is_file $BLOB_FILE &&
 	git config gc.pruneExpire now &&
-	git gc &&
+	git gc --no-cruft &&
 	test_path_is_missing $BLOB_FILE
 '
 
@@ -194,7 +194,7 @@ test_expect_success 'gc: prune old objects after local clone' '
 		cd aclone &&
 		verbose test 1 = $(git count-objects | sed "s/ .*//") &&
 		test_path_is_file $BLOB_FILE &&
-		git gc --prune &&
+		git gc --prune --no-cruft &&
 		verbose test 0 = $(git count-objects | sed "s/ .*//") &&
 		test_path_is_missing $BLOB_FILE
 	)
@@ -237,7 +237,7 @@ test_expect_success 'clean pack garbage with gc' '
 	>.git/objects/pack/fake2.keep &&
 	>.git/objects/pack/fake2.idx &&
 	>.git/objects/pack/fake3.keep &&
-	git gc &&
+	git gc --no-cruft &&
 	git count-objects -v 2>stderr &&
 	grep "^warning:" stderr | sort >actual &&
 	cat >expected <<\EOF &&
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 05/10] t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (3 preceding siblings ...)
  2023-04-17 20:54 ` [PATCH 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-18 10:43   ` Jeff King
  2023-04-17 20:54 ` [PATCH 06/10] t/t6500-gc.sh: refactor cruft pack tests Taylor Blau
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

In a similar fashion as the previous commit, adjust the fast-import
tests to prepare for "git gc" generating a cruft pack by default.

This adjustment is slightly different, however. Instead of relying on us
writing out the objects loose, and then calling `git prune` to remove
them, t9300 needs to be prepared to drop objects that would be moved
into cruft packs.

To do this, we can combine the `git gc` invocation with `git prune` into
one `git gc --prune`, which handles pruning both loose objects, and
objects that would otherwise be written to a cruft pack.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t9300-fast-import.sh | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index aa55b41b9a..ac237a1f90 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -388,9 +388,7 @@ test_expect_success 'B: accept branch name "TEMP_TAG"' '
 
 	INPUT_END
 
-	test_when_finished "rm -f .git/TEMP_TAG
-		git gc
-		git prune" &&
+	test_when_finished "rm -f .git/TEMP_TAG && git gc --prune=now" &&
 	git fast-import <input &&
 	test $(test-tool ref-store main resolve-ref TEMP_TAG 0 | cut -f1 -d " " ) != "$ZERO_OID" &&
 	test $(git rev-parse main) = $(git rev-parse TEMP_TAG^)
@@ -406,8 +404,7 @@ test_expect_success 'B: accept empty committer' '
 	INPUT_END
 
 	test_when_finished "git update-ref -d refs/heads/empty-committer-1
-		git gc
-		git prune" &&
+		git gc --prune=now" &&
 	git fast-import <input &&
 	out=$(git fsck) &&
 	echo "$out" &&
@@ -452,8 +449,7 @@ test_expect_success 'B: accept and fixup committer with no name' '
 	INPUT_END
 
 	test_when_finished "git update-ref -d refs/heads/empty-committer-2
-		git gc
-		git prune" &&
+		git gc --prune=now" &&
 	git fast-import <input &&
 	out=$(git fsck) &&
 	echo "$out" &&
@@ -1778,8 +1774,7 @@ test_expect_success 'P: verbatim SHA gitlinks' '
 	INPUT_END
 
 	git branch -D sub &&
-	git gc &&
-	git prune &&
+	git gc --prune=now &&
 	git fast-import <input &&
 	test $(git rev-parse --verify subuse2) = $(git rev-parse --verify subuse1)
 '
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 06/10] t/t6500-gc.sh: refactor cruft pack tests
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (4 preceding siblings ...)
  2023-04-17 20:54 ` [PATCH 05/10] t/t9300-fast-import.sh: " Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-17 20:54 ` [PATCH 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
                   ` (5 subsequent siblings)
  11 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

In 12253ab6d0 (gc: add tests for --cruft and friends, 2022-10-26), we
added a handful of tests to t6500 to ensure that `git gc` respected the
value of `--cruft` and `gc.cruftPacks`.

Then, in c695592850 (config: let feature.experimental imply
gc.cruftPacks=true, 2022-10-26), another set of similar tests was added
to ensure that `feature.experimental` correctly implied enabling cruft
pack generation (or not).

These tests are similar and could be consolidated. Do so in this patch
to prepare for expanding the set of command-line invocations that enable
or disable writing cruft packs. This makes it possible to easily test
more combinations of arguments without being overly repetitive.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t6500-gc.sh | 126 ++++++++++++++++++--------------------------------
 1 file changed, 44 insertions(+), 82 deletions(-)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index df6f2e6e52..a2f988c5c2 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -210,93 +210,55 @@ prepare_cruft_history () {
 	git reset HEAD^^
 }
 
-assert_cruft_packs () {
-	find .git/objects/pack -name "*.mtimes" >mtimes &&
-	sed -e 's/\.mtimes$/\.pack/g' mtimes >packs &&
-
-	test_file_not_empty packs &&
-	while read pack
-	do
-		test_path_is_file "$pack" || return 1
-	done <packs
-}
-
 assert_no_cruft_packs () {
 	find .git/objects/pack -name "*.mtimes" >mtimes &&
 	test_must_be_empty mtimes
 }
 
-test_expect_success 'gc --cruft generates a cruft pack' '
-	test_when_finished "rm -fr crufts" &&
-	git init crufts &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git gc --cruft &&
-		assert_cruft_packs
-	)
-'
-
-test_expect_success 'gc.cruftPacks=true generates a cruft pack' '
-	test_when_finished "rm -fr crufts" &&
-	git init crufts &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c gc.cruftPacks=true gc &&
-		assert_cruft_packs
-	)
-'
-
-test_expect_success 'feature.experimental=true generates a cruft pack' '
-	git init crufts &&
-	test_when_finished "rm -fr crufts" &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c feature.experimental=true gc &&
-		assert_cruft_packs
-	)
-'
-
-test_expect_success 'feature.experimental=false allows explicit cruft packs' '
-	git init crufts &&
-	test_when_finished "rm -fr crufts" &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c gc.cruftPacks=true -c feature.experimental=false gc &&
-		assert_cruft_packs
-	)
-'
-
-test_expect_success 'feature.experimental=true can be overridden' '
-	git init crufts &&
-	test_when_finished "rm -fr crufts" &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c feature.expiremental=true -c gc.cruftPacks=false gc &&
-		assert_no_cruft_packs
-	)
-'
-
-test_expect_success 'feature.experimental=false avoids cruft packs by default' '
-	git init crufts &&
-	test_when_finished "rm -fr crufts" &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c feature.experimental=false gc &&
-		assert_no_cruft_packs
-	)
-'
+for argv in \
+	"gc --cruft" \
+	"-c gc.cruftPacks=true gc" \
+	"-c feature.experimental=true gc" \
+	"-c gc.cruftPacks=true -c feature.experimental=false gc"
+do
+	test_expect_success "git $argv generates a cruft pack" '
+		test_when_finished "rm -fr repo" &&
+		git init repo &&
+		(
+			cd repo &&
+
+			prepare_cruft_history &&
+			git $argv &&
+
+			find .git/objects/pack -name "*.mtimes" >mtimes &&
+			sed -e 's/\.mtimes$/\.pack/g' mtimes >packs &&
+
+			test_file_not_empty packs &&
+			while read pack
+			do
+				test_path_is_file "$pack" || return 1
+			done <packs
+		)
+	'
+done
+
+for argv in \
+	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
+	"-c feature.experimental=false gc"
+do
+	test_expect_success "git $argv does not generate a cruft pack" '
+		test_when_finished "rm -fr repo" &&
+		git init repo &&
+		(
+			cd repo &&
+
+			prepare_cruft_history &&
+			git $argv &&
+
+			assert_no_cruft_packs
+		)
+	'
+done
 
 test_expect_success '--keep-largest-pack ignores cruft packs' '
 	test_when_finished "rm -fr repo" &&
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 07/10] t/t6500-gc.sh: add additional test cases
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (5 preceding siblings ...)
  2023-04-17 20:54 ` [PATCH 06/10] t/t6500-gc.sh: refactor cruft pack tests Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-18 10:48   ` Jeff King
  2023-04-17 20:54 ` [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default Taylor Blau
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

In the last commit, we refactored some of the tests in t6500 to make
clearer when cruft packs will and won't be generated by `git gc`.

Add the remaining cases not covered by the previous patch into this one,
which enumerates all possible combinations of arguments that will
produce (or not produce) a cruft pack.

This prepares us for the following commit, which will change the default
of `gc.cruftPacks` by ensuring that we understand which invocations do
and do not change as a result.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t6500-gc.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index a2f988c5c2..e7d3d1448f 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -218,6 +218,7 @@ assert_no_cruft_packs () {
 for argv in \
 	"gc --cruft" \
 	"-c gc.cruftPacks=true gc" \
+	"-c gc.cruftPacks=false gc --cruft" \
 	"-c feature.experimental=true gc" \
 	"-c gc.cruftPacks=true -c feature.experimental=false gc"
 do
@@ -243,6 +244,9 @@ do
 done
 
 for argv in \
+	"gc --no-cruft" \
+	"-c gc.cruftPacks=false gc" \
+	"-c gc.cruftPacks=true gc --no-cruft" \
 	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
 	"-c feature.experimental=false gc"
 do
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (6 preceding siblings ...)
  2023-04-17 20:54 ` [PATCH 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-18 10:56   ` Jeff King
  2023-04-17 20:54 ` [PATCH 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

In a similar spirit as previous commits, prepare for `gc --cruft`
becoming the default by ensuring that the tests in t6501 explicitly
cover the case of freshening loose objects not using cruft packs.

We could run this test twice, once with `--cruft` and once with
`--no-cruft`, but doing so is unnecessary, since the object rescuing and
freshening behavior is already extensively tested via t5329.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t6501-freshen-objects.sh | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
index 3968b47ed5..dbfa8a4d4c 100755
--- a/t/t6501-freshen-objects.sh
+++ b/t/t6501-freshen-objects.sh
@@ -101,7 +101,7 @@ do
 	'
 
 	test_expect_success "simultaneous gc ($title)" '
-		git gc --prune=12.hours.ago
+		git gc --no-cruft --prune=12.hours.ago
 	'
 
 	test_expect_success "finish writing out commit ($title)" '
@@ -131,7 +131,7 @@ do
 	'
 
 	test_expect_success "simultaneous gc ($title)" '
-		git gc --prune=12.hours.ago
+		git gc --no-cruft --prune=12.hours.ago
 	'
 
 	# tree should have been refreshed by write-tree
@@ -151,7 +151,7 @@ test_expect_success 'do not complain about existing broken links (commit)' '
 	some message
 	EOF
 	commit=$(git hash-object -t commit -w broken-commit) &&
-	git gc -q 2>stderr &&
+	git gc --no-cruft -q 2>stderr &&
 	verbose git cat-file -e $commit &&
 	test_must_be_empty stderr
 '
@@ -161,7 +161,7 @@ test_expect_success 'do not complain about existing broken links (tree)' '
 	100644 blob $(test_oid 003)	foo
 	EOF
 	tree=$(git mktree --missing <broken-tree) &&
-	git gc -q 2>stderr &&
+	git gc --no-cruft -q 2>stderr &&
 	git cat-file -e $tree &&
 	test_must_be_empty stderr
 '
@@ -176,7 +176,7 @@ test_expect_success 'do not complain about existing broken links (tag)' '
 	this is a broken tag
 	EOF
 	tag=$(git hash-object -t tag -w broken-tag) &&
-	git gc -q 2>stderr &&
+	git gc --no-cruft -q 2>stderr &&
 	git cat-file -e $tag &&
 	test_must_be_empty stderr
 '
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 09/10] builtin/gc.c: make `gc.cruftPacks` enabled by default
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (7 preceding siblings ...)
  2023-04-17 20:54 ` [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-18 11:00   ` Jeff King
  2023-04-17 20:54 ` [PATCH 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

Back in 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects
via loose, 2022-05-20), `git gc` learned the `--cruft` option and
`gc.cruftPacks` configuration to opt-in to writing cruft packs when
collecting or pruning unreachable objects.

Cruft packs were introduced with the merge in a50036da1a (Merge branch
'tb/cruft-packs', 2022-06-03). They address the problem of "loose object
explosions", where Git will write out many individual loose objects when
there is a large number of unreachable objects that have not yet aged
past `--prune=<date>`.

Instead of keeping track of those unreachable yet recent objects via
their loose object file's mtime, cruft packs collect all unreachable
objects into a single pack with a corresponding `*.mtimes` file that
acts as a table to store the mtimes of all unreachable objects. This
prevents the need to store unreachable objects as loose as they age out
of the repository, and avoids the problem of loose object explosions.

Beyond avoiding loose object explosions, cruft packs also act as a more
efficient mechanism to store unreachable objects as they age out of a
repository. This is because pairs of similar unreachable objects serve
as delta bases for one another.

In 5b92477f89, the feature was introduced as experimental. Since then,
GitHub has been running these patches in every repository generating
hundreds of millions of cruft packs along the way. The feature is
battle-tested, and avoids many pathological cases such as above. Users
who either run `git gc` manually, or via `git maintenance` can benefit
from having cruft packs.

As such, enable cruft pack generation to take place by default (by
making `gc.cruftPacks` have the default of "true" rather than "false).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config/feature.txt | 3 ---
 Documentation/config/gc.txt      | 2 +-
 Documentation/git-gc.txt         | 5 +++--
 Documentation/gitformat-pack.txt | 4 ++--
 builtin/gc.c                     | 6 +-----
 t/t6500-gc.sh                    | 9 +++------
 6 files changed, 10 insertions(+), 19 deletions(-)

diff --git a/Documentation/config/feature.txt b/Documentation/config/feature.txt
index e52bc6b858..17b4d39f89 100644
--- a/Documentation/config/feature.txt
+++ b/Documentation/config/feature.txt
@@ -14,9 +14,6 @@ feature.experimental::
 +
 * `fetch.negotiationAlgorithm=skipping` may improve fetch negotiation times by
 skipping more commits at a time, reducing the number of round trips.
-+
-* `gc.cruftPacks=true` reduces disk space used by unreachable objects during
-garbage collection, preventing loose object explosions.
 
 feature.manyFiles::
 	Enable config options that optimize for repos with many files in the
diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 8d5353e9e0..7f95c866e1 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -84,7 +84,7 @@ gc.packRefs::
 gc.cruftPacks::
 	Store unreachable objects in a cruft pack (see
 	linkgit:git-repack[1]) instead of as loose objects. The default
-	is `false`.
+	is `true`.
 
 gc.pruneExpire::
 	When 'git gc' is run, it will call 'prune --expire 2.weeks.ago'
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index 2427478314..0b4e7ba882 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -54,9 +54,10 @@ other housekeeping tasks (e.g. rerere, working trees, reflog...) will
 be performed as well.
 
 
---cruft::
+--[no-]cruft::
 	When expiring unreachable objects, pack them separately into a
-	cruft pack instead of storing them as loose objects.
+	cruft pack instead of storing them as loose objects. `--cruft`
+	is on by default.
 
 --prune=<date>::
 	Prune loose objects older than date (default is 2 weeks ago,
diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt
index e06af02f21..0c1be2dbe8 100644
--- a/Documentation/gitformat-pack.txt
+++ b/Documentation/gitformat-pack.txt
@@ -611,8 +611,8 @@ result of repeatedly resetting the objects' mtimes to the present time.
 
 If you are GC-ing repositories in a mixed version environment, consider omitting
 the `--cruft` option when using linkgit:git-repack[1] and linkgit:git-gc[1], and
-leaving the `gc.cruftPacks` configuration unset until all writers understand
-cruft packs.
+setting the `gc.cruftPacks` configuration to "false" until all writers
+understand cruft packs.
 
 === Alternatives
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 53ef137e1d..ece01e966f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -48,7 +48,7 @@ static const char * const builtin_gc_usage[] = {
 
 static int pack_refs = 1;
 static int prune_reflogs = 1;
-static int cruft_packs = -1;
+static int cruft_packs = 1;
 static int aggressive_depth = 50;
 static int aggressive_window = 250;
 static int gc_auto_threshold = 6700;
@@ -608,10 +608,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (prune_expire && parse_expiry_date(prune_expire, &dummy))
 		die(_("failed to parse prune expiry value %s"), prune_expire);
 
-	prepare_repo_settings(the_repository);
-	if (cruft_packs < 0)
-		cruft_packs = the_repository->settings.gc_cruft_packs;
-
 	if (aggressive) {
 		strvec_push(&repack, "-f");
 		if (aggressive_depth > 0)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index e7d3d1448f..75760866b4 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -216,11 +216,10 @@ assert_no_cruft_packs () {
 }
 
 for argv in \
+	"gc" \
 	"gc --cruft" \
 	"-c gc.cruftPacks=true gc" \
-	"-c gc.cruftPacks=false gc --cruft" \
-	"-c feature.experimental=true gc" \
-	"-c gc.cruftPacks=true -c feature.experimental=false gc"
+	"-c gc.cruftPacks=false gc --cruft"
 do
 	test_expect_success "git $argv generates a cruft pack" '
 		test_when_finished "rm -fr repo" &&
@@ -246,9 +245,7 @@ done
 for argv in \
 	"gc --no-cruft" \
 	"-c gc.cruftPacks=false gc" \
-	"-c gc.cruftPacks=true gc --no-cruft" \
-	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
-	"-c feature.experimental=false gc"
+	"-c gc.cruftPacks=true gc --no-cruft"
 do
 	test_expect_success "git $argv does not generate a cruft pack" '
 		test_when_finished "rm -fr repo" &&
-- 
2.38.1


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH 10/10] repository.h: drop unused `gc_cruft_packs`
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (8 preceding siblings ...)
  2023-04-17 20:54 ` [PATCH 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
@ 2023-04-17 20:54 ` Taylor Blau
  2023-04-18 11:02   ` Jeff King
  2023-04-18 11:04 ` [PATCH 00/10] gc: enable cruft packs by default Jeff King
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
  11 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 20:54 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

As of the previous commit, all callers that need to read the value of
`gc.cruftPacks` do so outside without using the `repo_settings` struct,
making its `gc_cruft_packs` unused. Drop it accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 repo-settings.c | 4 +---
 repository.h    | 1 -
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/repo-settings.c b/repo-settings.c
index 0a6c0b381f..a8d0b98794 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -41,10 +41,8 @@ void prepare_repo_settings(struct repository *r)
 	repo_cfg_bool(r, "feature.experimental", &experimental, 0);
 
 	/* Defaults modified by feature.* */
-	if (experimental) {
+	if (experimental)
 		r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
-		r->settings.gc_cruft_packs = 1;
-	}
 	if (manyfiles) {
 		r->settings.index_version = 4;
 		r->settings.index_skip_hash = 1;
diff --git a/repository.h b/repository.h
index 15a8afc5fb..50eb0ce391 100644
--- a/repository.h
+++ b/repository.h
@@ -33,7 +33,6 @@ struct repo_settings {
 	int commit_graph_generation_version;
 	int commit_graph_read_changed_paths;
 	int gc_write_commit_graph;
-	int gc_cruft_packs;
 	int fetch_write_commit_graph;
 	int command_requires_full_index;
 	int sparse_index;
-- 
2.38.1

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  2023-04-17 20:54 ` [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
@ 2023-04-17 22:54   ` Junio C Hamano
  2023-04-17 23:03     ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2023-04-17 22:54 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Derrick Stolee

Taylor Blau <me@ttaylorr.com> writes:

> When cruft packs were implemented, we never adjusted the code for `git
> gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft
> packs. This option and configuration option share a common
> implementation, but including cruft packs is wrong in both cases:
>
>   - Running `git gc --keep-largest-pack` in a repository where the
>     largest pack is the cruft pack itself will make it impossible for
>     `git gc` to prune objects, since the cruft pack itself is kept.

Makes sense.  We want to keep the largest pack that is actually in
use, and we want to consolidate other non-cruft packs into one.

>   - The same is true for `gc.bigPackThreshold`, if the size of the cruft
>     pack exceeds the limit set by the caller.

This is not as cut-and-dried clear as the previous one.  "This pack
is so large that it is not worth rewriting it only to expunge a
handful of objects that are no longer reachable from it" is the main
motivation to use this configuration, but doesn't some part of the
same reasoning apply equally to a large cruft pack?  But let's
assume that the configuration is totally irrelevant to cruft packs
and read on.

>  --keep-largest-pack::
> -	All packs except the largest pack and those marked with a
> -	`.keep` files are consolidated into a single pack. When this
> -	option is used, `gc.bigPackThreshold` is ignored.
> +	All packs except the largest pack, any packs marked with a
> +	`.keep` file, and any cruft pack(s) are consolidated into a
> +	single pack. When this option is used, `gc.bigPackThreshold` is
> +	ignored.

"except the largest pack" -> "except the largest, non-cruft pack"


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  2023-04-17 22:54   ` Junio C Hamano
@ 2023-04-17 23:03     ` Taylor Blau
  2023-04-18 10:39       ` Jeff King
  0 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-17 23:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Derrick Stolee

On Mon, Apr 17, 2023 at 03:54:35PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >   - The same is true for `gc.bigPackThreshold`, if the size of the cruft
> >     pack exceeds the limit set by the caller.
>
> This is not as cut-and-dried clear as the previous one.  "This pack
> is so large that it is not worth rewriting it only to expunge a
> handful of objects that are no longer reachable from it" is the main
> motivation to use this configuration, but doesn't some part of the
> same reasoning apply equally to a large cruft pack?  But let's
> assume that the configuration is totally irrelevant to cruft packs
> and read on.

This is an inherent design trade-off. I imagine that callers who want to
avoid rewriting their (large) cruft packs would prefer to generate a new
cruft pack on top with just the recently accumulated unreachable
objects.

That kind of works, except if you need to prune objects that are packed
in an earlier cruft pack. If you have `gc.bigPackThreshold`, there is no
way to do this if you need to expire objects that are in cruft packs
above that threshold.

A user may find themselves frustrated when trying to `git gc --prune`
some sensitive object(s) from their repository doesn't appear to work,
only to discover that `gc.bigPackThreshold` is set somewhere in their
configuration.

Writing (largely) the same cruft pack to expunge a few objects isn't
ideal, but it is better than the status quo. And if you have so many
unreachable objects that this is a concern, it is probably time to prune
anyway.

It is possible that in the future we could support writing multiple
cruft packs (we already handle the presence of multiple cruft packs
fine, just don't expose an easy way for the user to write >1 of them).
And at that point we would be able to relax this patch a bit and allow
`gc.bigPackThreshold` to cover cruft packs, too. But in the meantime,
the benefit of avoiding loose object explosions outweighs the possible
drawbacks here, IMHO.

> >  --keep-largest-pack::
> > -	All packs except the largest pack and those marked with a
> > -	`.keep` files are consolidated into a single pack. When this
> > -	option is used, `gc.bigPackThreshold` is ignored.
> > +	All packs except the largest pack, any packs marked with a
> > +	`.keep` file, and any cruft pack(s) are consolidated into a
> > +	single pack. When this option is used, `gc.bigPackThreshold` is
> > +	ignored.
>
> "except the largest pack" -> "except the largest, non-cruft pack"

Indeed, good eyes.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles()
  2023-04-17 20:54 ` [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
@ 2023-04-18 10:30   ` Jeff King
  2023-04-18 19:40     ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2023-04-18 10:30 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Junio C Hamano

On Mon, Apr 17, 2023 at 04:54:15PM -0400, Taylor Blau wrote:

> The function `stage_tmp_packfiles()` generates a filename to use for
> staging the contents of what will become the pack's ".mtimes" file.
> 
> The name is generated in `write_mtimes_file()` and the result is
> returned back to `stage_tmp_packfiles()` which uses it to rename the
> temporary file into place via `rename_tmp_packfiles()`.
> 
> `write_mtimes_file()` returns a `const char *`, indicating that callers
> are not expected to free its result (similar to, e.g., `oid_to_hex()`).
> But callers are expected to free its result, so this return type is
> incorrect.

Makes sense. I do think in the long run that it might make sense for all
of these pack-write tmpfiles to tempfile.[ch] (either directly, or via
register_tempfile() after creating the file). And then it would be safe
to pass around the tempfile struct itself, which contains the filename,
without worrying so much about ownership issues.

But that's a much bigger change that shouldn't be part of your series.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  2023-04-17 23:03     ` Taylor Blau
@ 2023-04-18 10:39       ` Jeff King
  2023-04-18 14:54         ` Derrick Stolee
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2023-04-18 10:39 UTC (permalink / raw)
  To: Taylor Blau; +Cc: Junio C Hamano, git, Derrick Stolee

On Mon, Apr 17, 2023 at 07:03:08PM -0400, Taylor Blau wrote:

> On Mon, Apr 17, 2023 at 03:54:35PM -0700, Junio C Hamano wrote:
> > Taylor Blau <me@ttaylorr.com> writes:
> >
> > >   - The same is true for `gc.bigPackThreshold`, if the size of the cruft
> > >     pack exceeds the limit set by the caller.
> >
> > This is not as cut-and-dried clear as the previous one.  "This pack
> > is so large that it is not worth rewriting it only to expunge a
> > handful of objects that are no longer reachable from it" is the main
> > motivation to use this configuration, but doesn't some part of the
> > same reasoning apply equally to a large cruft pack?  But let's
> > assume that the configuration is totally irrelevant to cruft packs
> > and read on.
> 
> This is an inherent design trade-off. I imagine that callers who want to
> avoid rewriting their (large) cruft packs would prefer to generate a new
> cruft pack on top with just the recently accumulated unreachable
> objects.
> 
> That kind of works, except if you need to prune objects that are packed
> in an earlier cruft pack. If you have `gc.bigPackThreshold`, there is no
> way to do this if you need to expire objects that are in cruft packs
> above that threshold.
> 
> A user may find themselves frustrated when trying to `git gc --prune`
> some sensitive object(s) from their repository doesn't appear to work,
> only to discover that `gc.bigPackThreshold` is set somewhere in their
> configuration.
> 
> Writing (largely) the same cruft pack to expunge a few objects isn't
> ideal, but it is better than the status quo. And if you have so many
> unreachable objects that this is a concern, it is probably time to prune
> anyway.

Yeah, what your patch does makes sense to me as a default behavior. In a
pre-cruft-pack world, those objects would all be left alone by
gc.bigPackThreshol (because they're loose), and the essence of
cruft-packs is creating a parallel universe where those ejected-to-loose
objects just happen to be stored in a more efficient format.

> It is possible that in the future we could support writing multiple
> cruft packs (we already handle the presence of multiple cruft packs
> fine, just don't expose an easy way for the user to write >1 of them).
> And at that point we would be able to relax this patch a bit and allow
> `gc.bigPackThreshold` to cover cruft packs, too. But in the meantime,
> the benefit of avoiding loose object explosions outweighs the possible
> drawbacks here, IMHO.

I wondered if that interface might be an option to say "hey, I have a
gigantic cruft file I want to carry forward, please leave it alone".

But if you have a giant cruft pack that is making your "git gc" too
slow, it will eventually age out on its own. And if you're impatient,
then "git gc --prune=now" is probably the right tool.

And If you really did want to keep rolling it forward for some reason,
then I'd think marking it with ".keep" would be the best thing (and
maybe even dropping the mtimes file? I'm not sure a how a kept-cruft
pack does or should behave).

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 05/10] t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  2023-04-17 20:54 ` [PATCH 05/10] t/t9300-fast-import.sh: " Taylor Blau
@ 2023-04-18 10:43   ` Jeff King
  2023-04-18 19:44     ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2023-04-18 10:43 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Junio C Hamano

On Mon, Apr 17, 2023 at 04:54:27PM -0400, Taylor Blau wrote:

> In a similar fashion as the previous commit, adjust the fast-import
> tests to prepare for "git gc" generating a cruft pack by default.
> 
> This adjustment is slightly different, however. Instead of relying on us
> writing out the objects loose, and then calling `git prune` to remove
> them, t9300 needs to be prepared to drop objects that would be moved
> into cruft packs.
> 
> To do this, we can combine the `git gc` invocation with `git prune` into
> one `git gc --prune`, which handles pruning both loose objects, and
> objects that would otherwise be written to a cruft pack.

Good. This is more efficient anyway. It probably does not matter for our
tiny test repository, but it is always good for our tests to model the
best option when possible. :)

>  t/t9300-fast-import.sh | 13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)

The patch itself looks good to me.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/10] t/t6500-gc.sh: add additional test cases
  2023-04-17 20:54 ` [PATCH 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
@ 2023-04-18 10:48   ` Jeff King
  2023-04-18 19:48     ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2023-04-18 10:48 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Junio C Hamano

On Mon, Apr 17, 2023 at 04:54:33PM -0400, Taylor Blau wrote:

> In the last commit, we refactored some of the tests in t6500 to make
> clearer when cruft packs will and won't be generated by `git gc`.
> 
> Add the remaining cases not covered by the previous patch into this one,
> which enumerates all possible combinations of arguments that will
> produce (or not produce) a cruft pack.
> 
> This prepares us for the following commit, which will change the default
> of `gc.cruftPacks` by ensuring that we understand which invocations do
> and do not change as a result.

I think "the following commit" is really patch 9. Patch 8 adjusts t6501
(and likewise says "like the previous commits"). It would probably make
more sense to put patch 8 before patches 6 and 7.

Probably not worth a re-roll on its own, but if you're doing one anyway,
it should be easy to do.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  2023-04-17 20:54 ` [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default Taylor Blau
@ 2023-04-18 10:56   ` Jeff King
  2023-04-18 19:50     ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2023-04-18 10:56 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Junio C Hamano

On Mon, Apr 17, 2023 at 04:54:36PM -0400, Taylor Blau wrote:

> We could run this test twice, once with `--cruft` and once with
> `--no-cruft`, but doing so is unnecessary, since the object rescuing and
> freshening behavior is already extensively tested via t5329.

That's doubtless true for the general case of freshening (after all,
that's the point of cruft packs). I do wonder about these "broken links"
cases:

> @@ -151,7 +151,7 @@ test_expect_success 'do not complain about existing broken links (commit)' '
>  	some message
>  	EOF
>  	commit=$(git hash-object -t commit -w broken-commit) &&
> -	git gc -q 2>stderr &&
> +	git gc --no-cruft -q 2>stderr &&
>  	verbose git cat-file -e $commit &&
>  	test_must_be_empty stderr
>  '

The idea is that we don't complain when repacking unreachable-but-broken
segments of history. Which could perhaps behave differently for objects
that are going into a cruft pack versus being turned loose. So maybe
it's worth covering for the --cruft case, too. I dunno.

Certainly your patch is not making the test coverage worse, but it might
be highlighting an existing blind-spot (and one that will become the
default behavior in the next patch).

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 09/10] builtin/gc.c: make `gc.cruftPacks` enabled by default
  2023-04-17 20:54 ` [PATCH 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
@ 2023-04-18 11:00   ` Jeff King
  2023-04-18 19:52     ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2023-04-18 11:00 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Junio C Hamano

On Mon, Apr 17, 2023 at 04:54:39PM -0400, Taylor Blau wrote:

>  for argv in \
> +	"gc" \
>  	"gc --cruft" \
>  	"-c gc.cruftPacks=true gc" \
> -	"-c gc.cruftPacks=false gc --cruft" \
> -	"-c feature.experimental=true gc" \
> -	"-c gc.cruftPacks=true -c feature.experimental=false gc"
> +	"-c gc.cruftPacks=false gc --cruft"
>  do

Oh good. I was a little sad to see the increase in the size of this loop
in the earlier patches, so now reducing the number of combinations is a
welcome change.

The set you have here looks fine, though isn't "gc --cruft" redundant
with "gc" now?

> @@ -246,9 +245,7 @@ done
>  for argv in \
>  	"gc --no-cruft" \
>  	"-c gc.cruftPacks=false gc" \
> -	"-c gc.cruftPacks=true gc --no-cruft" \
> -	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
> -	"-c feature.experimental=false gc"
> +	"-c gc.cruftPacks=true gc --no-cruft"

Likewise here, "gc --no-cruft" would have been redundant with "gc"
before this patch, but we did not even bother with it (so no need to
remove it here!).

The rest of the patch looks good, and I am quite on board with the
overall goal. It's been a long time coming. :)

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 10/10] repository.h: drop unused `gc_cruft_packs`
  2023-04-17 20:54 ` [PATCH 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
@ 2023-04-18 11:02   ` Jeff King
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2023-04-18 11:02 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Junio C Hamano

On Mon, Apr 17, 2023 at 04:54:43PM -0400, Taylor Blau wrote:

> As of the previous commit, all callers that need to read the value of
> `gc.cruftPacks` do so outside without using the `repo_settings` struct,
> making its `gc_cruft_packs` unused. Drop it accordingly.

Nice cleanup (I had to peek forward to understand the mystery
repo_settings hunk in the previous patch; it is a little weird to me
that this "settings" struct exists independent of the config, but that
is not your doing).

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 00/10] gc: enable cruft packs by default
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (9 preceding siblings ...)
  2023-04-17 20:54 ` [PATCH 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
@ 2023-04-18 11:04 ` Jeff King
  2023-04-18 19:53   ` Taylor Blau
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
  11 siblings, 1 reply; 48+ messages in thread
From: Jeff King @ 2023-04-18 11:04 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Junio C Hamano

On Mon, Apr 17, 2023 at 04:54:11PM -0400, Taylor Blau wrote:

> The series is structured as follows:
> 
>   - The first two patches are cleanup.
>   - The third patches fixes an oversight where the code for `git gc`'s
>     `--keep-largest-pack` option would incorrectly consider cruft packs.
>   - The next five patches are test preparation.
>   - Then the substantive patch, to actually graduate `gc.cruftPacks` and
>     enable it by default.
>   - The final patch is some cleanup that can only take place towards the
>     end of the series.

Thanks, this was a pleasant read. Besides Junio's small fixup in the
docs, everything I pointed out was minor enough that it wouldn't be a
big deal to go in as-is.

> Cruft packs have been a success for us at GitHub, and I am really
> excited to get it in the hands of more users by default.

Me too. :)

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  2023-04-18 10:39       ` Jeff King
@ 2023-04-18 14:54         ` Derrick Stolee
  0 siblings, 0 replies; 48+ messages in thread
From: Derrick Stolee @ 2023-04-18 14:54 UTC (permalink / raw)
  To: Jeff King, Taylor Blau; +Cc: Junio C Hamano, git

On 4/18/2023 6:39 AM, Jeff King wrote:
> On Mon, Apr 17, 2023 at 07:03:08PM -0400, Taylor Blau wrote:

I agree with the prior discussion that gc.bigPackThreshold is
currently misbehaving and stopping it from caring about cruft packs
is the best way to fix that behavior in this series.

>> It is possible that in the future we could support writing multiple
>> cruft packs (we already handle the presence of multiple cruft packs
>> fine, just don't expose an easy way for the user to write >1 of them).
>> And at that point we would be able to relax this patch a bit and allow
>> `gc.bigPackThreshold` to cover cruft packs, too. But in the meantime,
>> the benefit of avoiding loose object explosions outweighs the possible
>> drawbacks here, IMHO.
> 
> I wondered if that interface might be an option to say "hey, I have a
> gigantic cruft file I want to carry forward, please leave it alone".
> 
> But if you have a giant cruft pack that is making your "git gc" too
> slow, it will eventually age out on its own. And if you're impatient,
> then "git gc --prune=now" is probably the right tool.
> 
> And If you really did want to keep rolling it forward for some reason,
> then I'd think marking it with ".keep" would be the best thing (and
> maybe even dropping the mtimes file? I'm not sure a how a kept-cruft
> pack does or should behave).

Generally, it's probably a good idea to (later) create a separate knob
for "don't rewrite the objects in a 'big' cruft pack unless you need
to". For situations where cruft objects are being collected and not
regularly pruned, this helps avoid repacking all unreachable objects
into a giant single pack, even though only a small number of objects
were discovered unreachable this time.

The important times where we'd want to consider a 'big' cruft pack
for rewrite are:

 1. Some objects in the cruft pack are being pruned.
 2. Some objects in the cruft pack need updated mtimes.

However, in the typical case that we are adding new cruft objects
and not changing the mtimes of existing unreachable objects, we could
create a sensible limit on the size of a cruft pack to be rewritten
during normal maintenance.

My personal preference would be something between 2GB and 10GB, which
seems like a decent balance between "size of cruft pack" and "number of
cruft packs" for most repositories. Since none of the objects are
reachable, we don't really care about them having good deltas for things
like fetches and clones. The benefit of reducing the time spent in 'git
repack --cruft' outweighs the slight disk space savings by having a
single cruft pack, in my opinion.

Thanks,
-Stolee

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles()
  2023-04-18 10:30   ` Jeff King
@ 2023-04-18 19:40     ` Taylor Blau
  0 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 19:40 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Derrick Stolee, Junio C Hamano

On Tue, Apr 18, 2023 at 06:30:05AM -0400, Jeff King wrote:
> On Mon, Apr 17, 2023 at 04:54:15PM -0400, Taylor Blau wrote:
>
> > The function `stage_tmp_packfiles()` generates a filename to use for
> > staging the contents of what will become the pack's ".mtimes" file.
> >
> > The name is generated in `write_mtimes_file()` and the result is
> > returned back to `stage_tmp_packfiles()` which uses it to rename the
> > temporary file into place via `rename_tmp_packfiles()`.
> >
> > `write_mtimes_file()` returns a `const char *`, indicating that callers
> > are not expected to free its result (similar to, e.g., `oid_to_hex()`).
> > But callers are expected to free its result, so this return type is
> > incorrect.
>
> Makes sense. I do think in the long run that it might make sense for all
> of these pack-write tmpfiles to tempfile.[ch] (either directly, or via
> register_tempfile() after creating the file). And then it would be safe
> to pass around the tempfile struct itself, which contains the filename,
> without worrying so much about ownership issues.
>
> But that's a much bigger change that shouldn't be part of your series.

Agreed on both fronts. I'll put investigating something like this on my
list of things to-do.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 05/10] t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  2023-04-18 10:43   ` Jeff King
@ 2023-04-18 19:44     ` Taylor Blau
  0 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 19:44 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Derrick Stolee, Junio C Hamano

On Tue, Apr 18, 2023 at 06:43:31AM -0400, Jeff King wrote:
> On Mon, Apr 17, 2023 at 04:54:27PM -0400, Taylor Blau wrote:
>
> > In a similar fashion as the previous commit, adjust the fast-import
> > tests to prepare for "git gc" generating a cruft pack by default.
> >
> > This adjustment is slightly different, however. Instead of relying on us
> > writing out the objects loose, and then calling `git prune` to remove
> > them, t9300 needs to be prepared to drop objects that would be moved
> > into cruft packs.
> >
> > To do this, we can combine the `git gc` invocation with `git prune` into
> > one `git gc --prune`, which handles pruning both loose objects, and
> > objects that would otherwise be written to a cruft pack.
>
> Good. This is more efficient anyway. It probably does not matter for our
> tiny test repository, but it is always good for our tests to model the
> best option when possible. :)

I spent more time than I'd willingly admit second-guessing myself
wondering if there was something I'm missing why we'd want to call `git
gc` and `git prune` separately.

As best as I can tell, this pattern started all the way back in
03db4525d3 (Support gitlinks in fast-import., 2008-07-19), which
happened after deprecating `git gc --prune` in 9e7d501990 (builtin-gc.c:
deprecate --prune, it now really has no effect, 2008-05-09).

After `--prune` was un-deprecated in 58e9d9d472 (gc: make --prune useful
again by accepting an optional parameter, 2009-02-14), we got a handful
of new uses in this script via 4cedb78cb5 (fast-import: add input format
tests, 2011-08-11), which could have been `git gc --prune`, but weren't
(likely taking after 03db4525d3).

I don't know if any of that detail is interesting enough to rise to the
level of needing to be in the patch itself, those were just my notes I
took while wandering through this part of the suite.

> >  t/t9300-fast-import.sh | 13 ++++---------
> >  1 file changed, 4 insertions(+), 9 deletions(-)
>
> The patch itself looks good to me.

Thanks!

-Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 07/10] t/t6500-gc.sh: add additional test cases
  2023-04-18 10:48   ` Jeff King
@ 2023-04-18 19:48     ` Taylor Blau
  0 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 19:48 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Derrick Stolee, Junio C Hamano

On Tue, Apr 18, 2023 at 06:48:13AM -0400, Jeff King wrote:
> On Mon, Apr 17, 2023 at 04:54:33PM -0400, Taylor Blau wrote:
>
> > In the last commit, we refactored some of the tests in t6500 to make
> > clearer when cruft packs will and won't be generated by `git gc`.
> >
> > Add the remaining cases not covered by the previous patch into this one,
> > which enumerates all possible combinations of arguments that will
> > produce (or not produce) a cruft pack.
> >
> > This prepares us for the following commit, which will change the default
> > of `gc.cruftPacks` by ensuring that we understand which invocations do
> > and do not change as a result.
>
> I think "the following commit" is really patch 9. Patch 8 adjusts t6501
> (and likewise says "like the previous commits"). It would probably make
> more sense to put patch 8 before patches 6 and 7.

I'm glad you said something, because shortly after I sent this series I
was annoyed with myself for not organizing the preparatory patches in
order of their test number.

I reordered them like so:

    c02f80e1e2 t/t5304-prune.sh: prepare for `gc --cruft` by default
    2e0fb1382b t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
    c916e6d356 t/t6500-gc.sh: refactor cruft pack tests
    f9f7b2137b t/t6500-gc.sh: add additional test cases
    5a74520b35 t/t9300-fast-import.sh: prepare for `gc --cruft` by default

Which feels much better to me.

> Probably not worth a re-roll on its own, but if you're doing one anyway,
> it should be easy to do.

Agreed on both. I'm rerolling a few tiny things anyways, so I'll squash
this in, too.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  2023-04-18 10:56   ` Jeff King
@ 2023-04-18 19:50     ` Taylor Blau
  2023-04-22 11:23       ` Jeff King
  0 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 19:50 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Derrick Stolee, Junio C Hamano

On Tue, Apr 18, 2023 at 06:56:22AM -0400, Jeff King wrote:
> On Mon, Apr 17, 2023 at 04:54:36PM -0400, Taylor Blau wrote:
>
> > We could run this test twice, once with `--cruft` and once with
> > `--no-cruft`, but doing so is unnecessary, since the object rescuing and
> > freshening behavior is already extensively tested via t5329.
>
> That's doubtless true for the general case of freshening (after all,
> that's the point of cruft packs). I do wonder about these "broken links"
> cases:
>
> > @@ -151,7 +151,7 @@ test_expect_success 'do not complain about existing broken links (commit)' '
> >  	some message
> >  	EOF
> >  	commit=$(git hash-object -t commit -w broken-commit) &&
> > -	git gc -q 2>stderr &&
> > +	git gc --no-cruft -q 2>stderr &&
> >  	verbose git cat-file -e $commit &&
> >  	test_must_be_empty stderr
> >  '
>
> The idea is that we don't complain when repacking unreachable-but-broken
> segments of history. Which could perhaps behave differently for objects
> that are going into a cruft pack versus being turned loose. So maybe
> it's worth covering for the --cruft case, too. I dunno.

I think we already have coverage of those cases in t5329, specifically
in the tests:

  - cruft packs tolerate missing trees (expire $expire)
  - cruft packs tolerate missing blobs (expire $expire)

which are tested for both the pruning and non-pruning implementations
(by setting $expire to "2.weeks.ago", and "never", respectively).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 09/10] builtin/gc.c: make `gc.cruftPacks` enabled by default
  2023-04-18 11:00   ` Jeff King
@ 2023-04-18 19:52     ` Taylor Blau
  0 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 19:52 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Derrick Stolee, Junio C Hamano

On Tue, Apr 18, 2023 at 07:00:49AM -0400, Jeff King wrote:
> On Mon, Apr 17, 2023 at 04:54:39PM -0400, Taylor Blau wrote:
>
> >  for argv in \
> > +	"gc" \
> >  	"gc --cruft" \
> >  	"-c gc.cruftPacks=true gc" \
> > -	"-c gc.cruftPacks=false gc --cruft" \
> > -	"-c feature.experimental=true gc" \
> > -	"-c gc.cruftPacks=true -c feature.experimental=false gc"
> > +	"-c gc.cruftPacks=false gc --cruft"
> >  do
>
> Oh good. I was a little sad to see the increase in the size of this loop
> in the earlier patches, so now reducing the number of combinations is a
> welcome change.

Sorry for the roller-coaster of emotions ;-).

> The set you have here looks fine, though isn't "gc --cruft" redundant
> with "gc" now?

Yeah, these are redundant. I figured that it might be good to test all
cases, but I think this matrix-style of testing only gets you so far,
and can often come out wasteful.

"gc --cruft" isn't testing anything meaningful, so let's drop it.

> > @@ -246,9 +245,7 @@ done
> >  for argv in \
> >  	"gc --no-cruft" \
> >  	"-c gc.cruftPacks=false gc" \
> > -	"-c gc.cruftPacks=true gc --no-cruft" \
> > -	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
> > -	"-c feature.experimental=false gc"
> > +	"-c gc.cruftPacks=true gc --no-cruft"
>
> Likewise here, "gc --no-cruft" would have been redundant with "gc"
> before this patch, but we did not even bother with it (so no need to
> remove it here!).

Yeah, this one we definitely want to keep. We probably *could* have
dropped it in the previous patch and brought it back here, but I think
that doing so would have been unnecessary churn for reviewers reading
this series.

> The rest of the patch looks good, and I am quite on board with the
> overall goal. It's been a long time coming. :)

Thanks, I agree ;-).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 00/10] gc: enable cruft packs by default
  2023-04-18 11:04 ` [PATCH 00/10] gc: enable cruft packs by default Jeff King
@ 2023-04-18 19:53   ` Taylor Blau
  0 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 19:53 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Derrick Stolee, Junio C Hamano

On Tue, Apr 18, 2023 at 07:04:24AM -0400, Jeff King wrote:
> On Mon, Apr 17, 2023 at 04:54:11PM -0400, Taylor Blau wrote:
>
> > The series is structured as follows:
> >
> >   - The first two patches are cleanup.
> >   - The third patches fixes an oversight where the code for `git gc`'s
> >     `--keep-largest-pack` option would incorrectly consider cruft packs.
> >   - The next five patches are test preparation.
> >   - Then the substantive patch, to actually graduate `gc.cruftPacks` and
> >     enable it by default.
> >   - The final patch is some cleanup that can only take place towards the
> >     end of the series.
>
> Thanks, this was a pleasant read. Besides Junio's small fixup in the
> docs, everything I pointed out was minor enough that it wouldn't be a
> big deal to go in as-is.

Thanks, both, for the review. Most (all?) of the clean-up was all pretty
minor, but I would have rerolled here regardless. New version incoming
just as soon as CI says it's all good.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v2 00/10] gc: enable cruft packs by default
  2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
                   ` (10 preceding siblings ...)
  2023-04-18 11:04 ` [PATCH 00/10] gc: enable cruft packs by default Jeff King
@ 2023-04-18 20:40 ` Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
                     ` (9 more replies)
  11 siblings, 10 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

Here is a very tiny reroll of my series to graduate `gc.cruftPacks` out
of `feature.experimental` and enables it by default.

A complete summary of the topic is available in the original cover
letter[1], and the changes since last time are limited to test clean-up,
patch reorganization, and some touch-ups on the patch messages
themselves.

As always, a range-diff is below for convenience.

Thanks for all of the review thus far, and in advance for any review on
this round, too.

[1]: https://lore.kernel.org/git/cover.1681764848.git.me@ttaylorr.com/

Taylor Blau (10):
  pack-write.c: plug a leak in stage_tmp_packfiles()
  builtin/repack.c: fix incorrect reference to '-C'
  builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  t/t5304-prune.sh: prepare for `gc --cruft` by default
  t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  t/t6500-gc.sh: refactor cruft pack tests
  t/t6500-gc.sh: add additional test cases
  t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  builtin/gc.c: make `gc.cruftPacks` enabled by default
  repository.h: drop unused `gc_cruft_packs`

 Documentation/config/feature.txt |   3 -
 Documentation/config/gc.txt      |  12 +--
 Documentation/git-gc.txt         |  12 +--
 Documentation/gitformat-pack.txt |   4 +-
 builtin/gc.c                     |   8 +-
 builtin/repack.c                 |   2 +-
 pack-write.c                     |  14 ++--
 repo-settings.c                  |   4 +-
 repository.h                     |   1 -
 t/t5304-prune.sh                 |  28 +++----
 t/t6500-gc.sh                    | 135 ++++++++++++++++---------------
 t/t6501-freshen-objects.sh       |  10 +--
 t/t9300-fast-import.sh           |  13 +--
 13 files changed, 120 insertions(+), 126 deletions(-)

Range-diff against v1:
 1:  65ac7ed9b8 =  1:  c477b754e7 pack-write.c: plug a leak in stage_tmp_packfiles()
 2:  fbc8d15032 =  2:  52fb61fa9c builtin/repack.c: fix incorrect reference to '-C'
 3:  796df920ad !  3:  63b9ee8e2e builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
    @@ Commit message
           - The same is true for `gc.bigPackThreshold`, if the size of the cruft
             pack exceeds the limit set by the caller.
     
    -    Ignore cruft packs in the common implementation for both of these
    -    options, and add a pair of tests to prevent any future regressions here.
    +    In the future, it is possible that `gc.bigPackThreshold` could be used
    +    to write a separate cruft pack containing any new unreachable objects
    +    that entered the repository since the last time a cruft pack was
    +    written.
    +
    +    There are some complexities to doing so, mainly around handling
    +    pruning objects that are in an existing cruft pack that is above the
    +    threshold (which would either need to be rewritten, or else delay
    +    pruning). Rewriting a substantially similar cruft pack isn't ideal, but
    +    it is significantly better than the status-quo.
    +
    +    If users have large cruft packs that they don't want to rewrite, they
    +    can mark them as `*.keep` packs. But in general, if a repository has a
    +    cruft pack that is so large it is slowing down GC's, it should probably
    +    be pruned anyway.
    +
    +    In the meantime, ignore cruft packs in the common implementation for
    +    both of these options, and add a pair of tests to prevent any future
    +    regressions here.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
    @@ Documentation/git-gc.txt: be performed as well.
     -	All packs except the largest pack and those marked with a
     -	`.keep` files are consolidated into a single pack. When this
     -	option is used, `gc.bigPackThreshold` is ignored.
    -+	All packs except the largest pack, any packs marked with a
    -+	`.keep` file, and any cruft pack(s) are consolidated into a
    -+	single pack. When this option is used, `gc.bigPackThreshold` is
    -+	ignored.
    ++	All packs except the largest non-cruft pack, any packs marked
    ++	with a `.keep` file, and any cruft pack(s) are consolidated into
    ++	a single pack. When this option is used, `gc.bigPackThreshold`
    ++	is ignored.
      
      AGGRESSIVE
      ----------
 4:  44006da959 =  4:  905eeb9027 t/t5304-prune.sh: prepare for `gc --cruft` by default
 8:  4ccc525c39 !  5:  fa6eafb1fe t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
    @@ Commit message
         cover the case of freshening loose objects not using cruft packs.
     
         We could run this test twice, once with `--cruft` and once with
    -    `--no-cruft`, but doing so is unnecessary, since the object rescuing and
    -    freshening behavior is already extensively tested via t5329.
    +    `--no-cruft`, but doing so is unnecessary, since we already test object
    +    rescuing, freshening, and dealing with corrupt parts of the unreachable
    +    object graph extensively via t5329.
     
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
 6:  56a965e517 =  6:  e6270d75fa t/t6500-gc.sh: refactor cruft pack tests
 7:  6957e54f51 !  7:  9db3fa9e36 t/t6500-gc.sh: add additional test cases
    @@ Commit message
         which enumerates all possible combinations of arguments that will
         produce (or not produce) a cruft pack.
     
    -    This prepares us for the following commit, which will change the default
    +    This prepares us for a future commit which will change the default value
         of `gc.cruftPacks` by ensuring that we understand which invocations do
         and do not change as a result.
     
    @@ t/t6500-gc.sh: do
      done
      
      for argv in \
    -+	"gc --no-cruft" \
    ++	"gc" \
     +	"-c gc.cruftPacks=false gc" \
     +	"-c gc.cruftPacks=true gc --no-cruft" \
      	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
 5:  1b07eb83fe !  8:  894cf176ea t/t9300-fast-import.sh: prepare for `gc --cruft` by default
    @@ Metadata
      ## Commit message ##
         t/t9300-fast-import.sh: prepare for `gc --cruft` by default
     
    -    In a similar fashion as the previous commit, adjust the fast-import
    -    tests to prepare for "git gc" generating a cruft pack by default.
    +    In a similar fashion as previous commits, adjust the fast-import tests
    +    to prepare for "git gc" generating a cruft pack by default.
     
         This adjustment is slightly different, however. Instead of relying on us
         writing out the objects loose, and then calling `git prune` to remove
    @@ Commit message
         one `git gc --prune`, which handles pruning both loose objects, and
         objects that would otherwise be written to a cruft pack.
     
    +    Likely this pattern of "git gc && git prune" started all the way back in
    +    03db4525d3 (Support gitlinks in fast-import., 2008-07-19), which
    +    happened after deprecating `git gc --prune` in 9e7d501990 (builtin-gc.c:
    +    deprecate --prune, it now really has no effect, 2008-05-09).
    +
    +    After `--prune` was un-deprecated in 58e9d9d472 (gc: make --prune useful
    +    again by accepting an optional parameter, 2009-02-14), this script got a
    +    handful of new "git gc && git prune" instances via via 4cedb78cb5
    +    (fast-import: add input format tests, 2011-08-11). These could have been
    +    `git gc --prune`, but weren't (likely taking after 03db4525d3).
    +
         Signed-off-by: Taylor Blau <me@ttaylorr.com>
     
      ## t/t9300-fast-import.sh ##
 9:  bfda40a21d !  9:  b6784ddfe2 builtin/gc.c: make `gc.cruftPacks` enabled by default
    @@ t/t6500-gc.sh: assert_no_cruft_packs () {
      }
      
      for argv in \
    +-	"gc --cruft" \
     +	"gc" \
    - 	"gc --cruft" \
      	"-c gc.cruftPacks=true gc" \
     -	"-c gc.cruftPacks=false gc --cruft" \
     -	"-c feature.experimental=true gc" \
    @@ t/t6500-gc.sh: assert_no_cruft_packs () {
      do
      	test_expect_success "git $argv generates a cruft pack" '
      		test_when_finished "rm -fr repo" &&
    -@@ t/t6500-gc.sh: done
    +@@ t/t6500-gc.sh: do
    + done
    + 
      for argv in \
    - 	"gc --no-cruft" \
    +-	"gc" \
    ++	"gc --no-cruft" \
      	"-c gc.cruftPacks=false gc" \
     -	"-c gc.cruftPacks=true gc --no-cruft" \
     -	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
10:  fa15125454 = 10:  c67ee7c2ff repository.h: drop unused `gc_cruft_packs`
-- 
2.40.0.362.gc67ee7c2ff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles()
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-19 22:00     ` Junio C Hamano
  2023-04-18 20:40   ` [PATCH v2 02/10] builtin/repack.c: fix incorrect reference to '-C' Taylor Blau
                     ` (8 subsequent siblings)
  9 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

The function `stage_tmp_packfiles()` generates a filename to use for
staging the contents of what will become the pack's ".mtimes" file.

The name is generated in `write_mtimes_file()` and the result is
returned back to `stage_tmp_packfiles()` which uses it to rename the
temporary file into place via `rename_tmp_packfiles()`.

`write_mtimes_file()` returns a `const char *`, indicating that callers
are not expected to free its result (similar to, e.g., `oid_to_hex()`).
But callers are expected to free its result, so this return type is
incorrect.

Change the function's signature to return a non-const `char *`, and free
it at the end of `stage_tmp_packfiles()`.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 pack-write.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/pack-write.c b/pack-write.c
index f171405495..4da0ccc5f5 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -312,13 +312,13 @@ static void write_mtimes_trailer(struct hashfile *f, const unsigned char *hash)
 	hashwrite(f, hash, the_hash_algo->rawsz);
 }
 
-static const char *write_mtimes_file(struct packing_data *to_pack,
-				     struct pack_idx_entry **objects,
-				     uint32_t nr_objects,
-				     const unsigned char *hash)
+static char *write_mtimes_file(struct packing_data *to_pack,
+			       struct pack_idx_entry **objects,
+			       uint32_t nr_objects,
+			       const unsigned char *hash)
 {
 	struct strbuf tmp_file = STRBUF_INIT;
-	const char *mtimes_name;
+	char *mtimes_name;
 	struct hashfile *f;
 	int fd;
 
@@ -544,7 +544,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 			 char **idx_tmp_name)
 {
 	const char *rev_tmp_name = NULL;
-	const char *mtimes_tmp_name = NULL;
+	char *mtimes_tmp_name = NULL;
 
 	if (adjust_shared_perm(pack_tmp_name))
 		die_errno("unable to make temporary pack file readable");
@@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
 		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
 	if (mtimes_tmp_name)
 		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
+
+	free(mtimes_tmp_name);
 }
 
 void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 02/10] builtin/repack.c: fix incorrect reference to '-C'
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
                     ` (7 subsequent siblings)
  9 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

When cruft packs were originally being developed, `-C` was designated as
the short-form for `--cruft` (as in `git repack -C`).

This was dropped due to confusion with Git's top-level `-C` option
before submitting to the list. But the reference to it in
`--cruft-expiration`'s help text was never updated. Fix that dangling
reference in this patch.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 builtin/repack.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index df4d8e0f0b..d9eee15c2f 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -774,7 +774,7 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 				N_("same as -a, pack unreachable cruft objects separately"),
 				   PACK_CRUFT),
 		OPT_STRING(0, "cruft-expiration", &cruft_expiration, N_("approxidate"),
-				N_("with -C, expire objects older than this")),
+				N_("with --cruft, expire objects older than this")),
 		OPT_BOOL('d', NULL, &delete_redundant,
 				N_("remove redundant packs, and run git-prune-packed")),
 		OPT_BOOL('f', NULL, &po_args.no_reuse_delta,
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack`
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 02/10] builtin/repack.c: fix incorrect reference to '-C' Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default Taylor Blau
                     ` (6 subsequent siblings)
  9 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

When cruft packs were implemented, we never adjusted the code for `git
gc`'s `--keep-largest-pack` and `gc.bigPackThreshold` to ignore cruft
packs. This option and configuration option share a common
implementation, but including cruft packs is wrong in both cases:

  - Running `git gc --keep-largest-pack` in a repository where the
    largest pack is the cruft pack itself will make it impossible for
    `git gc` to prune objects, since the cruft pack itself is kept.

  - The same is true for `gc.bigPackThreshold`, if the size of the cruft
    pack exceeds the limit set by the caller.

In the future, it is possible that `gc.bigPackThreshold` could be used
to write a separate cruft pack containing any new unreachable objects
that entered the repository since the last time a cruft pack was
written.

There are some complexities to doing so, mainly around handling
pruning objects that are in an existing cruft pack that is above the
threshold (which would either need to be rewritten, or else delay
pruning). Rewriting a substantially similar cruft pack isn't ideal, but
it is significantly better than the status-quo.

If users have large cruft packs that they don't want to rewrite, they
can mark them as `*.keep` packs. But in general, if a repository has a
cruft pack that is so large it is slowing down GC's, it should probably
be pruned anyway.

In the meantime, ignore cruft packs in the common implementation for
both of these options, and add a pair of tests to prevent any future
regressions here.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config/gc.txt | 10 ++++-----
 Documentation/git-gc.txt    |  7 +++---
 builtin/gc.c                |  2 +-
 t/t6500-gc.sh               | 43 +++++++++++++++++++++++++++++++++++++
 4 files changed, 53 insertions(+), 9 deletions(-)

diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 38fea076a2..8d5353e9e0 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -43,11 +43,11 @@ gc.autoDetach::
 	if the system supports it. Default is true.
 
 gc.bigPackThreshold::
-	If non-zero, all packs larger than this limit are kept when
-	`git gc` is run. This is very similar to `--keep-largest-pack`
-	except that all packs that meet the threshold are kept, not
-	just the largest pack. Defaults to zero. Common unit suffixes of
-	'k', 'm', or 'g' are supported.
+	If non-zero, all non-cruft packs larger than this limit are kept
+	when `git gc` is run. This is very similar to
+	`--keep-largest-pack` except that all non-cruft packs that meet
+	the threshold are kept, not just the largest pack. Defaults to
+	zero. Common unit suffixes of 'k', 'm', or 'g' are supported.
 +
 Note that if the number of kept packs is more than gc.autoPackLimit,
 this configuration variable is ignored, all packs except the base pack
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index a65c9aa62d..fef382a70f 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -77,9 +77,10 @@ be performed as well.
 	instance running on this repository.
 
 --keep-largest-pack::
-	All packs except the largest pack and those marked with a
-	`.keep` files are consolidated into a single pack. When this
-	option is used, `gc.bigPackThreshold` is ignored.
+	All packs except the largest non-cruft pack, any packs marked
+	with a `.keep` file, and any cruft pack(s) are consolidated into
+	a single pack. When this option is used, `gc.bigPackThreshold`
+	is ignored.
 
 AGGRESSIVE
 ----------
diff --git a/builtin/gc.c b/builtin/gc.c
index edd98d35a5..53ef137e1d 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -219,7 +219,7 @@ static struct packed_git *find_base_packs(struct string_list *packs,
 	struct packed_git *p, *base = NULL;
 
 	for (p = get_all_packs(the_repository); p; p = p->next) {
-		if (!p->pack_local)
+		if (!p->pack_local || p->is_cruft)
 			continue;
 		if (limit) {
 			if (p->pack_size >= limit)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index d9acb63951..df6f2e6e52 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -298,6 +298,49 @@ test_expect_success 'feature.experimental=false avoids cruft packs by default' '
 	)
 '
 
+test_expect_success '--keep-largest-pack ignores cruft packs' '
+	test_when_finished "rm -fr repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Generate a pack for reachable objects (of which there
+		# are 3), and one for unreachable objects (of which
+		# there are 6).
+		prepare_cruft_history &&
+		git gc --cruft &&
+
+		mtimes="$(find .git/objects/pack -type f -name "pack-*.mtimes")" &&
+		sz="$(test_file_size "${mtimes%.mtimes}.pack")" &&
+
+		# Ensure that the cruft pack gets removed (due to
+		# `--prune=now`) despite it being the largest pack.
+		git -c gc.bigPackThreshold=$sz gc --cruft --prune=now &&
+
+		assert_no_cruft_packs
+	)
+'
+
+test_expect_success 'gc.bigPackThreshold ignores cruft packs' '
+	test_when_finished "rm -fr repo" &&
+	git init repo &&
+	(
+		cd repo &&
+
+		# Generate a pack for reachable objects (of which there
+		# are 3), and one for unreachable objects (of which
+		# there are 6).
+		prepare_cruft_history &&
+		git gc --cruft &&
+
+		# Ensure that the cruft pack gets removed (due to
+		# `--prune=now`) despite it being the largest pack.
+		git gc --cruft --prune=now --keep-largest-pack &&
+
+		assert_no_cruft_packs
+	)
+'
+
 run_and_wait_for_auto_gc () {
 	# We read stdout from gc for the side effect of waiting until the
 	# background gc process exits, closing its fd 9.  Furthermore, the
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
                     ` (2 preceding siblings ...)
  2023-04-18 20:40   ` [PATCH v2 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 05/10] t/t6501-freshen-objects.sh: " Taylor Blau
                     ` (5 subsequent siblings)
  9 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

Many of the tests in t5304 run `git gc`, and rely on its behavior that
unreachable-but-recent objects are written out loose. This is sensible,
since t5304 deals specifically with this kind of pruning.

If left unattended, however, this test would break when the default
behavior of a bare "git gc" is adjusted to generate a cruft pack by
default.

Ensure that these tests continue to work as-is (and continue to provide
coverage of loose object pruning) by passing `--no-cruft` explicitly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t5304-prune.sh | 28 ++++++++++++++--------------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 5500dd0842..662ae9b152 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -62,11 +62,11 @@ test_expect_success 'prune --expire' '
 test_expect_success 'gc: implicit prune --expire' '
 	add_blob &&
 	test-tool chmtime =-$((2*$week-30)) $BLOB_FILE &&
-	git gc &&
+	git gc --no-cruft &&
 	verbose test $((1 + $before)) = $(git count-objects | sed "s/ .*//") &&
 	test_path_is_file $BLOB_FILE &&
 	test-tool chmtime =-$((2*$week+1)) $BLOB_FILE &&
-	git gc &&
+	git gc --no-cruft &&
 	verbose test $before = $(git count-objects | sed "s/ .*//") &&
 	test_path_is_missing $BLOB_FILE
 '
@@ -86,7 +86,7 @@ test_expect_success 'gc: refuse to start with invalid gc.pruneExpire' '
 
 test_expect_success 'gc: start with ok gc.pruneExpire' '
 	git config gc.pruneExpire 2.days.ago &&
-	git gc
+	git gc --no-cruft
 '
 
 test_expect_success 'prune: prune nonsense parameters' '
@@ -137,44 +137,44 @@ test_expect_success 'gc --no-prune' '
 	add_blob &&
 	test-tool chmtime =-$((5001*$day)) $BLOB_FILE &&
 	git config gc.pruneExpire 2.days.ago &&
-	git gc --no-prune &&
+	git gc --no-prune --no-cruft &&
 	verbose test 1 = $(git count-objects | sed "s/ .*//") &&
 	test_path_is_file $BLOB_FILE
 '
 
 test_expect_success 'gc respects gc.pruneExpire' '
 	git config gc.pruneExpire 5002.days.ago &&
-	git gc &&
+	git gc --no-cruft &&
 	test_path_is_file $BLOB_FILE &&
 	git config gc.pruneExpire 5000.days.ago &&
-	git gc &&
+	git gc --no-cruft &&
 	test_path_is_missing $BLOB_FILE
 '
 
 test_expect_success 'gc --prune=<date>' '
 	add_blob &&
 	test-tool chmtime =-$((5001*$day)) $BLOB_FILE &&
-	git gc --prune=5002.days.ago &&
+	git gc --prune=5002.days.ago --no-cruft &&
 	test_path_is_file $BLOB_FILE &&
-	git gc --prune=5000.days.ago &&
+	git gc --prune=5000.days.ago --no-cruft &&
 	test_path_is_missing $BLOB_FILE
 '
 
 test_expect_success 'gc --prune=never' '
 	add_blob &&
-	git gc --prune=never &&
+	git gc --prune=never --no-cruft &&
 	test_path_is_file $BLOB_FILE &&
-	git gc --prune=now &&
+	git gc --prune=now --no-cruft &&
 	test_path_is_missing $BLOB_FILE
 '
 
 test_expect_success 'gc respects gc.pruneExpire=never' '
 	git config gc.pruneExpire never &&
 	add_blob &&
-	git gc &&
+	git gc --no-cruft &&
 	test_path_is_file $BLOB_FILE &&
 	git config gc.pruneExpire now &&
-	git gc &&
+	git gc --no-cruft &&
 	test_path_is_missing $BLOB_FILE
 '
 
@@ -194,7 +194,7 @@ test_expect_success 'gc: prune old objects after local clone' '
 		cd aclone &&
 		verbose test 1 = $(git count-objects | sed "s/ .*//") &&
 		test_path_is_file $BLOB_FILE &&
-		git gc --prune &&
+		git gc --prune --no-cruft &&
 		verbose test 0 = $(git count-objects | sed "s/ .*//") &&
 		test_path_is_missing $BLOB_FILE
 	)
@@ -237,7 +237,7 @@ test_expect_success 'clean pack garbage with gc' '
 	>.git/objects/pack/fake2.keep &&
 	>.git/objects/pack/fake2.idx &&
 	>.git/objects/pack/fake3.keep &&
-	git gc &&
+	git gc --no-cruft &&
 	git count-objects -v 2>stderr &&
 	grep "^warning:" stderr | sort >actual &&
 	cat >expected <<\EOF &&
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 05/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
                     ` (3 preceding siblings ...)
  2023-04-18 20:40   ` [PATCH v2 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 06/10] t/t6500-gc.sh: refactor cruft pack tests Taylor Blau
                     ` (4 subsequent siblings)
  9 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

In a similar spirit as previous commits, prepare for `gc --cruft`
becoming the default by ensuring that the tests in t6501 explicitly
cover the case of freshening loose objects not using cruft packs.

We could run this test twice, once with `--cruft` and once with
`--no-cruft`, but doing so is unnecessary, since we already test object
rescuing, freshening, and dealing with corrupt parts of the unreachable
object graph extensively via t5329.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t6501-freshen-objects.sh | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
index 3968b47ed5..dbfa8a4d4c 100755
--- a/t/t6501-freshen-objects.sh
+++ b/t/t6501-freshen-objects.sh
@@ -101,7 +101,7 @@ do
 	'
 
 	test_expect_success "simultaneous gc ($title)" '
-		git gc --prune=12.hours.ago
+		git gc --no-cruft --prune=12.hours.ago
 	'
 
 	test_expect_success "finish writing out commit ($title)" '
@@ -131,7 +131,7 @@ do
 	'
 
 	test_expect_success "simultaneous gc ($title)" '
-		git gc --prune=12.hours.ago
+		git gc --no-cruft --prune=12.hours.ago
 	'
 
 	# tree should have been refreshed by write-tree
@@ -151,7 +151,7 @@ test_expect_success 'do not complain about existing broken links (commit)' '
 	some message
 	EOF
 	commit=$(git hash-object -t commit -w broken-commit) &&
-	git gc -q 2>stderr &&
+	git gc --no-cruft -q 2>stderr &&
 	verbose git cat-file -e $commit &&
 	test_must_be_empty stderr
 '
@@ -161,7 +161,7 @@ test_expect_success 'do not complain about existing broken links (tree)' '
 	100644 blob $(test_oid 003)	foo
 	EOF
 	tree=$(git mktree --missing <broken-tree) &&
-	git gc -q 2>stderr &&
+	git gc --no-cruft -q 2>stderr &&
 	git cat-file -e $tree &&
 	test_must_be_empty stderr
 '
@@ -176,7 +176,7 @@ test_expect_success 'do not complain about existing broken links (tag)' '
 	this is a broken tag
 	EOF
 	tag=$(git hash-object -t tag -w broken-tag) &&
-	git gc -q 2>stderr &&
+	git gc --no-cruft -q 2>stderr &&
 	git cat-file -e $tag &&
 	test_must_be_empty stderr
 '
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 06/10] t/t6500-gc.sh: refactor cruft pack tests
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
                     ` (4 preceding siblings ...)
  2023-04-18 20:40   ` [PATCH v2 05/10] t/t6501-freshen-objects.sh: " Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
                     ` (3 subsequent siblings)
  9 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

In 12253ab6d0 (gc: add tests for --cruft and friends, 2022-10-26), we
added a handful of tests to t6500 to ensure that `git gc` respected the
value of `--cruft` and `gc.cruftPacks`.

Then, in c695592850 (config: let feature.experimental imply
gc.cruftPacks=true, 2022-10-26), another set of similar tests was added
to ensure that `feature.experimental` correctly implied enabling cruft
pack generation (or not).

These tests are similar and could be consolidated. Do so in this patch
to prepare for expanding the set of command-line invocations that enable
or disable writing cruft packs. This makes it possible to easily test
more combinations of arguments without being overly repetitive.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t6500-gc.sh | 126 ++++++++++++++++++--------------------------------
 1 file changed, 44 insertions(+), 82 deletions(-)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index df6f2e6e52..a2f988c5c2 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -210,93 +210,55 @@ prepare_cruft_history () {
 	git reset HEAD^^
 }
 
-assert_cruft_packs () {
-	find .git/objects/pack -name "*.mtimes" >mtimes &&
-	sed -e 's/\.mtimes$/\.pack/g' mtimes >packs &&
-
-	test_file_not_empty packs &&
-	while read pack
-	do
-		test_path_is_file "$pack" || return 1
-	done <packs
-}
-
 assert_no_cruft_packs () {
 	find .git/objects/pack -name "*.mtimes" >mtimes &&
 	test_must_be_empty mtimes
 }
 
-test_expect_success 'gc --cruft generates a cruft pack' '
-	test_when_finished "rm -fr crufts" &&
-	git init crufts &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git gc --cruft &&
-		assert_cruft_packs
-	)
-'
-
-test_expect_success 'gc.cruftPacks=true generates a cruft pack' '
-	test_when_finished "rm -fr crufts" &&
-	git init crufts &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c gc.cruftPacks=true gc &&
-		assert_cruft_packs
-	)
-'
-
-test_expect_success 'feature.experimental=true generates a cruft pack' '
-	git init crufts &&
-	test_when_finished "rm -fr crufts" &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c feature.experimental=true gc &&
-		assert_cruft_packs
-	)
-'
-
-test_expect_success 'feature.experimental=false allows explicit cruft packs' '
-	git init crufts &&
-	test_when_finished "rm -fr crufts" &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c gc.cruftPacks=true -c feature.experimental=false gc &&
-		assert_cruft_packs
-	)
-'
-
-test_expect_success 'feature.experimental=true can be overridden' '
-	git init crufts &&
-	test_when_finished "rm -fr crufts" &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c feature.expiremental=true -c gc.cruftPacks=false gc &&
-		assert_no_cruft_packs
-	)
-'
-
-test_expect_success 'feature.experimental=false avoids cruft packs by default' '
-	git init crufts &&
-	test_when_finished "rm -fr crufts" &&
-	(
-		cd crufts &&
-
-		prepare_cruft_history &&
-		git -c feature.experimental=false gc &&
-		assert_no_cruft_packs
-	)
-'
+for argv in \
+	"gc --cruft" \
+	"-c gc.cruftPacks=true gc" \
+	"-c feature.experimental=true gc" \
+	"-c gc.cruftPacks=true -c feature.experimental=false gc"
+do
+	test_expect_success "git $argv generates a cruft pack" '
+		test_when_finished "rm -fr repo" &&
+		git init repo &&
+		(
+			cd repo &&
+
+			prepare_cruft_history &&
+			git $argv &&
+
+			find .git/objects/pack -name "*.mtimes" >mtimes &&
+			sed -e 's/\.mtimes$/\.pack/g' mtimes >packs &&
+
+			test_file_not_empty packs &&
+			while read pack
+			do
+				test_path_is_file "$pack" || return 1
+			done <packs
+		)
+	'
+done
+
+for argv in \
+	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
+	"-c feature.experimental=false gc"
+do
+	test_expect_success "git $argv does not generate a cruft pack" '
+		test_when_finished "rm -fr repo" &&
+		git init repo &&
+		(
+			cd repo &&
+
+			prepare_cruft_history &&
+			git $argv &&
+
+			assert_no_cruft_packs
+		)
+	'
+done
 
 test_expect_success '--keep-largest-pack ignores cruft packs' '
 	test_when_finished "rm -fr repo" &&
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 07/10] t/t6500-gc.sh: add additional test cases
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
                     ` (5 preceding siblings ...)
  2023-04-18 20:40   ` [PATCH v2 06/10] t/t6500-gc.sh: refactor cruft pack tests Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 08/10] t/t9300-fast-import.sh: prepare for `gc --cruft` by default Taylor Blau
                     ` (2 subsequent siblings)
  9 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

In the last commit, we refactored some of the tests in t6500 to make
clearer when cruft packs will and won't be generated by `git gc`.

Add the remaining cases not covered by the previous patch into this one,
which enumerates all possible combinations of arguments that will
produce (or not produce) a cruft pack.

This prepares us for a future commit which will change the default value
of `gc.cruftPacks` by ensuring that we understand which invocations do
and do not change as a result.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t6500-gc.sh | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index a2f988c5c2..3ba2ae5140 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -218,6 +218,7 @@ assert_no_cruft_packs () {
 for argv in \
 	"gc --cruft" \
 	"-c gc.cruftPacks=true gc" \
+	"-c gc.cruftPacks=false gc --cruft" \
 	"-c feature.experimental=true gc" \
 	"-c gc.cruftPacks=true -c feature.experimental=false gc"
 do
@@ -243,6 +244,9 @@ do
 done
 
 for argv in \
+	"gc" \
+	"-c gc.cruftPacks=false gc" \
+	"-c gc.cruftPacks=true gc --no-cruft" \
 	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
 	"-c feature.experimental=false gc"
 do
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 08/10] t/t9300-fast-import.sh: prepare for `gc --cruft` by default
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
                     ` (6 preceding siblings ...)
  2023-04-18 20:40   ` [PATCH v2 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-18 20:40   ` [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
  2023-04-18 20:41   ` [PATCH v2 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
  9 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

In a similar fashion as previous commits, adjust the fast-import tests
to prepare for "git gc" generating a cruft pack by default.

This adjustment is slightly different, however. Instead of relying on us
writing out the objects loose, and then calling `git prune` to remove
them, t9300 needs to be prepared to drop objects that would be moved
into cruft packs.

To do this, we can combine the `git gc` invocation with `git prune` into
one `git gc --prune`, which handles pruning both loose objects, and
objects that would otherwise be written to a cruft pack.

Likely this pattern of "git gc && git prune" started all the way back in
03db4525d3 (Support gitlinks in fast-import., 2008-07-19), which
happened after deprecating `git gc --prune` in 9e7d501990 (builtin-gc.c:
deprecate --prune, it now really has no effect, 2008-05-09).

After `--prune` was un-deprecated in 58e9d9d472 (gc: make --prune useful
again by accepting an optional parameter, 2009-02-14), this script got a
handful of new "git gc && git prune" instances via via 4cedb78cb5
(fast-import: add input format tests, 2011-08-11). These could have been
`git gc --prune`, but weren't (likely taking after 03db4525d3).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 t/t9300-fast-import.sh | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/t/t9300-fast-import.sh b/t/t9300-fast-import.sh
index aa55b41b9a..ac237a1f90 100755
--- a/t/t9300-fast-import.sh
+++ b/t/t9300-fast-import.sh
@@ -388,9 +388,7 @@ test_expect_success 'B: accept branch name "TEMP_TAG"' '
 
 	INPUT_END
 
-	test_when_finished "rm -f .git/TEMP_TAG
-		git gc
-		git prune" &&
+	test_when_finished "rm -f .git/TEMP_TAG && git gc --prune=now" &&
 	git fast-import <input &&
 	test $(test-tool ref-store main resolve-ref TEMP_TAG 0 | cut -f1 -d " " ) != "$ZERO_OID" &&
 	test $(git rev-parse main) = $(git rev-parse TEMP_TAG^)
@@ -406,8 +404,7 @@ test_expect_success 'B: accept empty committer' '
 	INPUT_END
 
 	test_when_finished "git update-ref -d refs/heads/empty-committer-1
-		git gc
-		git prune" &&
+		git gc --prune=now" &&
 	git fast-import <input &&
 	out=$(git fsck) &&
 	echo "$out" &&
@@ -452,8 +449,7 @@ test_expect_success 'B: accept and fixup committer with no name' '
 	INPUT_END
 
 	test_when_finished "git update-ref -d refs/heads/empty-committer-2
-		git gc
-		git prune" &&
+		git gc --prune=now" &&
 	git fast-import <input &&
 	out=$(git fsck) &&
 	echo "$out" &&
@@ -1778,8 +1774,7 @@ test_expect_success 'P: verbatim SHA gitlinks' '
 	INPUT_END
 
 	git branch -D sub &&
-	git gc &&
-	git prune &&
+	git gc --prune=now &&
 	git fast-import <input &&
 	test $(git rev-parse --verify subuse2) = $(git rev-parse --verify subuse1)
 '
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled by default
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
                     ` (7 preceding siblings ...)
  2023-04-18 20:40   ` [PATCH v2 08/10] t/t9300-fast-import.sh: prepare for `gc --cruft` by default Taylor Blau
@ 2023-04-18 20:40   ` Taylor Blau
  2023-04-19 22:22     ` Junio C Hamano
  2023-04-18 20:41   ` [PATCH v2 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
  9 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:40 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

Back in 5b92477f89 (builtin/gc.c: conditionally avoid pruning objects
via loose, 2022-05-20), `git gc` learned the `--cruft` option and
`gc.cruftPacks` configuration to opt-in to writing cruft packs when
collecting or pruning unreachable objects.

Cruft packs were introduced with the merge in a50036da1a (Merge branch
'tb/cruft-packs', 2022-06-03). They address the problem of "loose object
explosions", where Git will write out many individual loose objects when
there is a large number of unreachable objects that have not yet aged
past `--prune=<date>`.

Instead of keeping track of those unreachable yet recent objects via
their loose object file's mtime, cruft packs collect all unreachable
objects into a single pack with a corresponding `*.mtimes` file that
acts as a table to store the mtimes of all unreachable objects. This
prevents the need to store unreachable objects as loose as they age out
of the repository, and avoids the problem of loose object explosions.

Beyond avoiding loose object explosions, cruft packs also act as a more
efficient mechanism to store unreachable objects as they age out of a
repository. This is because pairs of similar unreachable objects serve
as delta bases for one another.

In 5b92477f89, the feature was introduced as experimental. Since then,
GitHub has been running these patches in every repository generating
hundreds of millions of cruft packs along the way. The feature is
battle-tested, and avoids many pathological cases such as above. Users
who either run `git gc` manually, or via `git maintenance` can benefit
from having cruft packs.

As such, enable cruft pack generation to take place by default (by
making `gc.cruftPacks` have the default of "true" rather than "false).

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 Documentation/config/feature.txt |  3 ---
 Documentation/config/gc.txt      |  2 +-
 Documentation/git-gc.txt         |  5 +++--
 Documentation/gitformat-pack.txt |  4 ++--
 builtin/gc.c                     |  6 +-----
 t/t6500-gc.sh                    | 12 ++++--------
 6 files changed, 11 insertions(+), 21 deletions(-)

diff --git a/Documentation/config/feature.txt b/Documentation/config/feature.txt
index e52bc6b858..17b4d39f89 100644
--- a/Documentation/config/feature.txt
+++ b/Documentation/config/feature.txt
@@ -14,9 +14,6 @@ feature.experimental::
 +
 * `fetch.negotiationAlgorithm=skipping` may improve fetch negotiation times by
 skipping more commits at a time, reducing the number of round trips.
-+
-* `gc.cruftPacks=true` reduces disk space used by unreachable objects during
-garbage collection, preventing loose object explosions.
 
 feature.manyFiles::
 	Enable config options that optimize for repos with many files in the
diff --git a/Documentation/config/gc.txt b/Documentation/config/gc.txt
index 8d5353e9e0..7f95c866e1 100644
--- a/Documentation/config/gc.txt
+++ b/Documentation/config/gc.txt
@@ -84,7 +84,7 @@ gc.packRefs::
 gc.cruftPacks::
 	Store unreachable objects in a cruft pack (see
 	linkgit:git-repack[1]) instead of as loose objects. The default
-	is `false`.
+	is `true`.
 
 gc.pruneExpire::
 	When 'git gc' is run, it will call 'prune --expire 2.weeks.ago'
diff --git a/Documentation/git-gc.txt b/Documentation/git-gc.txt
index fef382a70f..90806fd26a 100644
--- a/Documentation/git-gc.txt
+++ b/Documentation/git-gc.txt
@@ -54,9 +54,10 @@ other housekeeping tasks (e.g. rerere, working trees, reflog...) will
 be performed as well.
 
 
---cruft::
+--[no-]cruft::
 	When expiring unreachable objects, pack them separately into a
-	cruft pack instead of storing them as loose objects.
+	cruft pack instead of storing them as loose objects. `--cruft`
+	is on by default.
 
 --prune=<date>::
 	Prune loose objects older than date (default is 2 weeks ago,
diff --git a/Documentation/gitformat-pack.txt b/Documentation/gitformat-pack.txt
index e06af02f21..0c1be2dbe8 100644
--- a/Documentation/gitformat-pack.txt
+++ b/Documentation/gitformat-pack.txt
@@ -611,8 +611,8 @@ result of repeatedly resetting the objects' mtimes to the present time.
 
 If you are GC-ing repositories in a mixed version environment, consider omitting
 the `--cruft` option when using linkgit:git-repack[1] and linkgit:git-gc[1], and
-leaving the `gc.cruftPacks` configuration unset until all writers understand
-cruft packs.
+setting the `gc.cruftPacks` configuration to "false" until all writers
+understand cruft packs.
 
 === Alternatives
 
diff --git a/builtin/gc.c b/builtin/gc.c
index 53ef137e1d..ece01e966f 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -48,7 +48,7 @@ static const char * const builtin_gc_usage[] = {
 
 static int pack_refs = 1;
 static int prune_reflogs = 1;
-static int cruft_packs = -1;
+static int cruft_packs = 1;
 static int aggressive_depth = 50;
 static int aggressive_window = 250;
 static int gc_auto_threshold = 6700;
@@ -608,10 +608,6 @@ int cmd_gc(int argc, const char **argv, const char *prefix)
 	if (prune_expire && parse_expiry_date(prune_expire, &dummy))
 		die(_("failed to parse prune expiry value %s"), prune_expire);
 
-	prepare_repo_settings(the_repository);
-	if (cruft_packs < 0)
-		cruft_packs = the_repository->settings.gc_cruft_packs;
-
 	if (aggressive) {
 		strvec_push(&repack, "-f");
 		if (aggressive_depth > 0)
diff --git a/t/t6500-gc.sh b/t/t6500-gc.sh
index 3ba2ae5140..69509d0c11 100755
--- a/t/t6500-gc.sh
+++ b/t/t6500-gc.sh
@@ -216,11 +216,9 @@ assert_no_cruft_packs () {
 }
 
 for argv in \
-	"gc --cruft" \
+	"gc" \
 	"-c gc.cruftPacks=true gc" \
-	"-c gc.cruftPacks=false gc --cruft" \
-	"-c feature.experimental=true gc" \
-	"-c gc.cruftPacks=true -c feature.experimental=false gc"
+	"-c gc.cruftPacks=false gc --cruft"
 do
 	test_expect_success "git $argv generates a cruft pack" '
 		test_when_finished "rm -fr repo" &&
@@ -244,11 +242,9 @@ do
 done
 
 for argv in \
-	"gc" \
+	"gc --no-cruft" \
 	"-c gc.cruftPacks=false gc" \
-	"-c gc.cruftPacks=true gc --no-cruft" \
-	"-c feature.expiremental=true -c gc.cruftPacks=false gc" \
-	"-c feature.experimental=false gc"
+	"-c gc.cruftPacks=true gc --no-cruft"
 do
 	test_expect_success "git $argv does not generate a cruft pack" '
 		test_when_finished "rm -fr repo" &&
-- 
2.40.0.362.gc67ee7c2ff


^ permalink raw reply related	[flat|nested] 48+ messages in thread

* [PATCH v2 10/10] repository.h: drop unused `gc_cruft_packs`
  2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
                     ` (8 preceding siblings ...)
  2023-04-18 20:40   ` [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
@ 2023-04-18 20:41   ` Taylor Blau
  2023-04-19 22:19     ` Junio C Hamano
  9 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-18 20:41 UTC (permalink / raw)
  To: git; +Cc: Jeff King, Derrick Stolee, Junio C Hamano

As of the previous commit, all callers that need to read the value of
`gc.cruftPacks` do so outside without using the `repo_settings` struct,
making its `gc_cruft_packs` unused. Drop it accordingly.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
---
 repo-settings.c | 4 +---
 repository.h    | 1 -
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/repo-settings.c b/repo-settings.c
index 0a6c0b381f..a8d0b98794 100644
--- a/repo-settings.c
+++ b/repo-settings.c
@@ -41,10 +41,8 @@ void prepare_repo_settings(struct repository *r)
 	repo_cfg_bool(r, "feature.experimental", &experimental, 0);
 
 	/* Defaults modified by feature.* */
-	if (experimental) {
+	if (experimental)
 		r->settings.fetch_negotiation_algorithm = FETCH_NEGOTIATION_SKIPPING;
-		r->settings.gc_cruft_packs = 1;
-	}
 	if (manyfiles) {
 		r->settings.index_version = 4;
 		r->settings.index_skip_hash = 1;
diff --git a/repository.h b/repository.h
index 15a8afc5fb..50eb0ce391 100644
--- a/repository.h
+++ b/repository.h
@@ -33,7 +33,6 @@ struct repo_settings {
 	int commit_graph_generation_version;
 	int commit_graph_read_changed_paths;
 	int gc_write_commit_graph;
-	int gc_cruft_packs;
 	int fetch_write_commit_graph;
 	int command_requires_full_index;
 	int sparse_index;
-- 
2.40.0.362.gc67ee7c2ff

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles()
  2023-04-18 20:40   ` [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
@ 2023-04-19 22:00     ` Junio C Hamano
  2023-04-20 16:31       ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2023-04-19 22:00 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Derrick Stolee

Taylor Blau <me@ttaylorr.com> writes:

> The function `stage_tmp_packfiles()` generates a filename to use for
> staging the contents of what will become the pack's ".mtimes" file.
>
> The name is generated in `write_mtimes_file()` and the result is
> returned back to `stage_tmp_packfiles()` which uses it to rename the
> temporary file into place via `rename_tmp_packfiles()`.
>
> `write_mtimes_file()` returns a `const char *`, indicating that callers
> are not expected to free its result (similar to, e.g., `oid_to_hex()`).
> But callers are expected to free its result, so this return type is
> incorrect.

Indeed the string that holds the name of the file returned by
write_mtimes_file() is leaking.  Does the same logic apply to the
returned filename from write_rev_file() and stored in rev_tmp_name
that is not freed in stage_tmp_packfiles() in another topic?

> @@ -544,7 +544,7 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
>  			 char **idx_tmp_name)
>  {
>  	const char *rev_tmp_name = NULL;
> -	const char *mtimes_tmp_name = NULL;
> +	char *mtimes_tmp_name = NULL;
>  
>  	if (adjust_shared_perm(pack_tmp_name))
>  		die_errno("unable to make temporary pack file readable");
> @@ -568,6 +568,8 @@ void stage_tmp_packfiles(struct strbuf *name_buffer,
>  		rename_tmp_packfile(name_buffer, rev_tmp_name, "rev");
>  	if (mtimes_tmp_name)
>  		rename_tmp_packfile(name_buffer, mtimes_tmp_name, "mtimes");
> +
> +	free(mtimes_tmp_name);
>  }
>  
>  void write_promisor_file(const char *promisor_name, struct ref **sought, int nr_sought)

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 10/10] repository.h: drop unused `gc_cruft_packs`
  2023-04-18 20:41   ` [PATCH v2 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
@ 2023-04-19 22:19     ` Junio C Hamano
  0 siblings, 0 replies; 48+ messages in thread
From: Junio C Hamano @ 2023-04-19 22:19 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Derrick Stolee

Taylor Blau <me@ttaylorr.com> writes:

> As of the previous commit, all callers that need to read the value of
> `gc.cruftPacks` do so outside without using the `repo_settings` struct,
> making its `gc_cruft_packs` unused. Drop it accordingly.
>
> Signed-off-by: Taylor Blau <me@ttaylorr.com>
> ---
>  repo-settings.c | 4 +---
>  repository.h    | 1 -
>  2 files changed, 1 insertion(+), 4 deletions(-)

This is in line with the current design of repo-settings, and it is
a total tangent to this topic, but in retrospect, if we structured
how "what's experimental" is determined in a more distributed way,
we didn't have to have this patch.  That is, instead of saying "ok
we are in a repository with experimental setting turned on, and the
set of features to be enabled are centrally described in this
settings structure", each opt-in features (like the code that reads
"gc.cruftPacks" by calling git_config_get_bool() to override its
default value) can select their default/fallback values by seeing if
the experimental bit is set.

In any case, this patch looks good and we have one fewer such
centrally managed bit.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled by default
  2023-04-18 20:40   ` [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
@ 2023-04-19 22:22     ` Junio C Hamano
  2023-04-20 17:24       ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2023-04-19 22:22 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Derrick Stolee

Taylor Blau <me@ttaylorr.com> writes:

> diff --git a/Documentation/config/feature.txt b/Documentation/config/feature.txt
> index e52bc6b858..17b4d39f89 100644
> --- a/Documentation/config/feature.txt
> +++ b/Documentation/config/feature.txt
> @@ -14,9 +14,6 @@ feature.experimental::
>  +
>  * `fetch.negotiationAlgorithm=skipping` may improve fetch negotiation times by
>  skipping more commits at a time, reducing the number of round trips.
> -+
> -* `gc.cruftPacks=true` reduces disk space used by unreachable objects during
> -garbage collection, preventing loose object explosions.

Being listed here is a definite sign that a feature behind a
configuration variable is considered experimental.  Do we have (and
if not, do we want to establish) a procedure to mark and announce a
feature that used to be experimental no longer is?  If it is enough
to mention it in the release notes, then I can take care of it, of
course.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles()
  2023-04-19 22:00     ` Junio C Hamano
@ 2023-04-20 16:31       ` Taylor Blau
  2023-04-20 16:57         ` Junio C Hamano
  0 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-20 16:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Derrick Stolee

On Wed, Apr 19, 2023 at 03:00:48PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > The function `stage_tmp_packfiles()` generates a filename to use for
> > staging the contents of what will become the pack's ".mtimes" file.
> >
> > The name is generated in `write_mtimes_file()` and the result is
> > returned back to `stage_tmp_packfiles()` which uses it to rename the
> > temporary file into place via `rename_tmp_packfiles()`.
> >
> > `write_mtimes_file()` returns a `const char *`, indicating that callers
> > are not expected to free its result (similar to, e.g., `oid_to_hex()`).
> > But callers are expected to free its result, so this return type is
> > incorrect.
>
> Indeed the string that holds the name of the file returned by
> write_mtimes_file() is leaking.  Does the same logic apply to the
> returned filename from write_rev_file() and stored in rev_tmp_name
> that is not freed in stage_tmp_packfiles() in another topic?

They are similar, but unfortunately different.

Here, our temporary name is generated by `write_mtimes_file()` which
*always* comes up with a new name from scratch. So we know that it
should always be free()'d at the end of `stage_tmp_packfiles()`.

But in the case of `write_rev_file()`, it only *sometimes* generates a
new name from scratch. The first parameter can be NULL, in which case
`write_rev_file()` will generate a new name. Or it can be non-NULL, in
which case that name will be used instead.

So in that topic, it's less clear on what the ultimate right path
forward is. But in this topic, changing `mtimes_tmp_name` and the return
type of `write_mtimes_file()` to be a non-const `char *` (and free()ing
it, of course ;-)) is the right thing to do.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles()
  2023-04-20 16:31       ` Taylor Blau
@ 2023-04-20 16:57         ` Junio C Hamano
  0 siblings, 0 replies; 48+ messages in thread
From: Junio C Hamano @ 2023-04-20 16:57 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Derrick Stolee

Taylor Blau <me@ttaylorr.com> writes:

>> Indeed the string that holds the name of the file returned by
>> write_mtimes_file() is leaking.  Does the same logic apply to the
>> returned filename from write_rev_file() and stored in rev_tmp_name
>> that is not freed in stage_tmp_packfiles() in another topic?
>
> They are similar, but unfortunately different.

I hoped "fortunately different and there is no leak", but it seems
what you said below is a bit different X-<.

> But in the case of `write_rev_file()`, it only *sometimes* generates a
> new name from scratch. The first parameter can be NULL, in which case
> `write_rev_file()` will generate a new name. Or it can be non-NULL, in
> which case that name will be used instead.

IOW, it is a mess.

But I think the topic to introduce that variable is being rerolled
and has no interaction with the codepath we are discussing here, so
we are still in good shape wrt this series ;-)

Thanks.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled by default
  2023-04-19 22:22     ` Junio C Hamano
@ 2023-04-20 17:24       ` Taylor Blau
  2023-04-20 17:31         ` Junio C Hamano
  0 siblings, 1 reply; 48+ messages in thread
From: Taylor Blau @ 2023-04-20 17:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Derrick Stolee

On Wed, Apr 19, 2023 at 03:22:13PM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> > diff --git a/Documentation/config/feature.txt b/Documentation/config/feature.txt
> > index e52bc6b858..17b4d39f89 100644
> > --- a/Documentation/config/feature.txt
> > +++ b/Documentation/config/feature.txt
> > @@ -14,9 +14,6 @@ feature.experimental::
> >  +
> >  * `fetch.negotiationAlgorithm=skipping` may improve fetch negotiation times by
> >  skipping more commits at a time, reducing the number of round trips.
> > -+
> > -* `gc.cruftPacks=true` reduces disk space used by unreachable objects during
> > -garbage collection, preventing loose object explosions.
>
> Being listed here is a definite sign that a feature behind a
> configuration variable is considered experimental.  Do we have (and
> if not, do we want to establish) a procedure to mark and announce a
> feature that used to be experimental no longer is?  If it is enough
> to mention it in the release notes, then I can take care of it, of
> course.

I am not aware of such a procedure. But personally I think it would be
fine to mention it in the release notes for the next release.

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled by default
  2023-04-20 17:24       ` Taylor Blau
@ 2023-04-20 17:31         ` Junio C Hamano
  2023-04-20 19:19           ` Taylor Blau
  0 siblings, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2023-04-20 17:31 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Jeff King, Derrick Stolee

Taylor Blau <me@ttaylorr.com> writes:

>> Being listed here is a definite sign that a feature behind a
>> configuration variable is considered experimental.  Do we have (and
>> if not, do we want to establish) a procedure to mark and announce a
>> feature that used to be experimental no longer is?  If it is enough
>> to mention it in the release notes, then I can take care of it, of
>> course.
>
> I am not aware of such a procedure. But personally I think it would be
> fine to mention it in the release notes for the next release.

OK.  I updated the entry for the topic in the draft "What's cooking"
report to read:

    * tb/enable-cruft-packs-by-default (2023-04-18) 10 commits
     - ...

     When "gc" needs to retain unreachable objects, packing them into
     cruft packs (instead of exploding them into loose object files) has
     been offered as a more efficient option for some time.  Now the use
     of cruft packs has been made the default and no longer considered
     an experimental feature.

     Will merge to 'next'.
     source: <cover.1681850424.git.me@ttaylorr.com>

and the per-topic description text is usually what is copied
verbatim to the release notes, so even if I forget, the procedure
will remember it for us ;-)

Thanks.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled by default
  2023-04-20 17:31         ` Junio C Hamano
@ 2023-04-20 19:19           ` Taylor Blau
  0 siblings, 0 replies; 48+ messages in thread
From: Taylor Blau @ 2023-04-20 19:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Jeff King, Derrick Stolee

On Thu, Apr 20, 2023 at 10:31:00AM -0700, Junio C Hamano wrote:
> Taylor Blau <me@ttaylorr.com> writes:
>
> >> Being listed here is a definite sign that a feature behind a
> >> configuration variable is considered experimental.  Do we have (and
> >> if not, do we want to establish) a procedure to mark and announce a
> >> feature that used to be experimental no longer is?  If it is enough
> >> to mention it in the release notes, then I can take care of it, of
> >> course.
> >
> > I am not aware of such a procedure. But personally I think it would be
> > fine to mention it in the release notes for the next release.
>
> OK.  I updated the entry for the topic in the draft "What's cooking"
> report to read:
>
>     * tb/enable-cruft-packs-by-default (2023-04-18) 10 commits
>      - ...
>
>      When "gc" needs to retain unreachable objects, packing them into
>      cruft packs (instead of exploding them into loose object files) has
>      been offered as a more efficient option for some time.  Now the use
>      of cruft packs has been made the default and no longer considered
>      an experimental feature.
>
>      Will merge to 'next'.
>      source: <cover.1681850424.git.me@ttaylorr.com>

Thanks, I think that summarizes/announces the change well. I always
appreciate your effort into concisely summarizing the topics being
queued.

> and the per-topic description text is usually what is copied
> verbatim to the release notes, so even if I forget, the procedure
> will remember it for us ;-)

;-).

Thanks,
Taylor

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default
  2023-04-18 19:50     ` Taylor Blau
@ 2023-04-22 11:23       ` Jeff King
  0 siblings, 0 replies; 48+ messages in thread
From: Jeff King @ 2023-04-22 11:23 UTC (permalink / raw)
  To: Taylor Blau; +Cc: git, Derrick Stolee, Junio C Hamano

On Tue, Apr 18, 2023 at 03:50:33PM -0400, Taylor Blau wrote:

> > > @@ -151,7 +151,7 @@ test_expect_success 'do not complain about existing broken links (commit)' '
> > >  	some message
> > >  	EOF
> > >  	commit=$(git hash-object -t commit -w broken-commit) &&
> > > -	git gc -q 2>stderr &&
> > > +	git gc --no-cruft -q 2>stderr &&
> > >  	verbose git cat-file -e $commit &&
> > >  	test_must_be_empty stderr
> > >  '
> >
> > The idea is that we don't complain when repacking unreachable-but-broken
> > segments of history. Which could perhaps behave differently for objects
> > that are going into a cruft pack versus being turned loose. So maybe
> > it's worth covering for the --cruft case, too. I dunno.
> 
> I think we already have coverage of those cases in t5329, specifically
> in the tests:
> 
>   - cruft packs tolerate missing trees (expire $expire)
>   - cruft packs tolerate missing blobs (expire $expire)
> 
> which are tested for both the pruning and non-pruning implementations
> (by setting $expire to "2.weeks.ago", and "never", respectively).

Ah, perfect. I only gave a quick look over t5329, and didn't see those.
So yeah, I think it is fine to just have this whole script run in
no-cruft mode.

-Peff

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2023-04-22 11:23 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-17 20:54 [PATCH 00/10] gc: enable cruft packs by default Taylor Blau
2023-04-17 20:54 ` [PATCH 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
2023-04-18 10:30   ` Jeff King
2023-04-18 19:40     ` Taylor Blau
2023-04-17 20:54 ` [PATCH 02/10] builtin/repack.c: fix incorrect reference to '-C' Taylor Blau
2023-04-17 20:54 ` [PATCH 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
2023-04-17 22:54   ` Junio C Hamano
2023-04-17 23:03     ` Taylor Blau
2023-04-18 10:39       ` Jeff King
2023-04-18 14:54         ` Derrick Stolee
2023-04-17 20:54 ` [PATCH 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default Taylor Blau
2023-04-17 20:54 ` [PATCH 05/10] t/t9300-fast-import.sh: " Taylor Blau
2023-04-18 10:43   ` Jeff King
2023-04-18 19:44     ` Taylor Blau
2023-04-17 20:54 ` [PATCH 06/10] t/t6500-gc.sh: refactor cruft pack tests Taylor Blau
2023-04-17 20:54 ` [PATCH 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
2023-04-18 10:48   ` Jeff King
2023-04-18 19:48     ` Taylor Blau
2023-04-17 20:54 ` [PATCH 08/10] t/t6501-freshen-objects.sh: prepare for `gc --cruft` by default Taylor Blau
2023-04-18 10:56   ` Jeff King
2023-04-18 19:50     ` Taylor Blau
2023-04-22 11:23       ` Jeff King
2023-04-17 20:54 ` [PATCH 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
2023-04-18 11:00   ` Jeff King
2023-04-18 19:52     ` Taylor Blau
2023-04-17 20:54 ` [PATCH 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
2023-04-18 11:02   ` Jeff King
2023-04-18 11:04 ` [PATCH 00/10] gc: enable cruft packs by default Jeff King
2023-04-18 19:53   ` Taylor Blau
2023-04-18 20:40 ` [PATCH v2 " Taylor Blau
2023-04-18 20:40   ` [PATCH v2 01/10] pack-write.c: plug a leak in stage_tmp_packfiles() Taylor Blau
2023-04-19 22:00     ` Junio C Hamano
2023-04-20 16:31       ` Taylor Blau
2023-04-20 16:57         ` Junio C Hamano
2023-04-18 20:40   ` [PATCH v2 02/10] builtin/repack.c: fix incorrect reference to '-C' Taylor Blau
2023-04-18 20:40   ` [PATCH v2 03/10] builtin/gc.c: ignore cruft packs with `--keep-largest-pack` Taylor Blau
2023-04-18 20:40   ` [PATCH v2 04/10] t/t5304-prune.sh: prepare for `gc --cruft` by default Taylor Blau
2023-04-18 20:40   ` [PATCH v2 05/10] t/t6501-freshen-objects.sh: " Taylor Blau
2023-04-18 20:40   ` [PATCH v2 06/10] t/t6500-gc.sh: refactor cruft pack tests Taylor Blau
2023-04-18 20:40   ` [PATCH v2 07/10] t/t6500-gc.sh: add additional test cases Taylor Blau
2023-04-18 20:40   ` [PATCH v2 08/10] t/t9300-fast-import.sh: prepare for `gc --cruft` by default Taylor Blau
2023-04-18 20:40   ` [PATCH v2 09/10] builtin/gc.c: make `gc.cruftPacks` enabled " Taylor Blau
2023-04-19 22:22     ` Junio C Hamano
2023-04-20 17:24       ` Taylor Blau
2023-04-20 17:31         ` Junio C Hamano
2023-04-20 19:19           ` Taylor Blau
2023-04-18 20:41   ` [PATCH v2 10/10] repository.h: drop unused `gc_cruft_packs` Taylor Blau
2023-04-19 22:19     ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.