All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/3] Add cleanup for garbage .bitmap files
@ 2015-11-14  0:10 Doug Kelly
  2015-11-14  0:10 ` [PATCH 1/3] prepare_packed_git(): find more garbage Doug Kelly
                   ` (2 more replies)
  0 siblings, 3 replies; 33+ messages in thread
From: Doug Kelly @ 2015-11-14  0:10 UTC (permalink / raw)
  To: git

Following Peff and Junio's comments when adding support for cleaning garbage
.idx files left in the pack directory, this patch introduces the ability to
detect garbage .bitmap files.  Additionally, .keep files are still reported,
but no action is taken to clean them.

This includes some refactor around count-objects' report_pack_garbage handler,
to make it more extensible when adding understanding for different file types.
Testing shows this working, but it may be a section to provide extra scrutiny
to.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-11-14  0:10 [PATCH 0/3] Add cleanup for garbage .bitmap files Doug Kelly
@ 2015-11-14  0:10 ` Doug Kelly
  2015-11-14  0:43   ` Stefan Beller
  2015-11-14  0:10 ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
  2015-11-14  0:10 ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  2 siblings, 1 reply; 33+ messages in thread
From: Doug Kelly @ 2015-11-14  0:10 UTC (permalink / raw)
  To: git; +Cc: Doug Kelly

.bitmap and .keep files without .idx/.pack don't make much sense, so
make sure these are reported as garbage as well.  At the same time,
refactoring report_garbage to handle extra bits.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 builtin/count-objects.c | 16 ++++++----------
 cache.h                 |  4 +++-
 sha1_file.c             | 17 +++++++++++++++--
 t/t5304-prune.sh        |  2 ++
 4 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index ba92919..1637037 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -17,19 +17,15 @@ static off_t loose_size;
 
 static const char *bits_to_msg(unsigned seen_bits)
 {
-	switch (seen_bits) {
-	case 0:
-		return "no corresponding .idx or .pack";
-	case PACKDIR_FILE_GARBAGE:
+	if (seen_bits ==  PACKDIR_FILE_GARBAGE)
 		return "garbage found";
-	case PACKDIR_FILE_PACK:
+	else if (seen_bits & PACKDIR_FILE_PACK && seen_bits ^ ~PACKDIR_FILE_IDX)
 		return "no corresponding .idx";
-	case PACKDIR_FILE_IDX:
+	else if (seen_bits & PACKDIR_FILE_IDX && seen_bits ^ ~PACKDIR_FILE_PACK)
 		return "no corresponding .pack";
-	case PACKDIR_FILE_PACK|PACKDIR_FILE_IDX:
-	default:
-		return NULL;
-	}
+	else if (seen_bits == 0 || seen_bits ^ ~(PACKDIR_FILE_IDX|PACKDIR_FILE_PACK))
+		return "no corresponding .idx or .pack";
+	return NULL;
 }
 
 static void real_report_garbage(unsigned seen_bits, const char *path)
diff --git a/cache.h b/cache.h
index 736abc0..5b9d791 100644
--- a/cache.h
+++ b/cache.h
@@ -1292,7 +1292,9 @@ extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
-#define PACKDIR_FILE_GARBAGE 4
+#define PACKDIR_FILE_BITMAP 4
+#define PACKDIR_FILE_KEEP 8
+#define PACKDIR_FILE_GARBAGE 16
 extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
 extern void prepare_packed_git(void);
diff --git a/sha1_file.c b/sha1_file.c
index 3d56746..5f939e4 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1225,6 +1225,15 @@ static void report_helper(const struct string_list *list,
 	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
 		return;
 
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP))
+		return;
+
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_KEEP))
+		return;
+
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP|PACKDIR_FILE_KEEP))
+		return;
+
 	for (; first < last; first++)
 		report_garbage(seen_bits, list->items[first].string);
 }
@@ -1256,9 +1265,13 @@ static void report_pack_garbage(struct string_list *list)
 			first = i;
 		}
 		if (!strcmp(path + baselen, "pack"))
-			seen_bits |= 1;
+			seen_bits |= PACKDIR_FILE_PACK;
 		else if (!strcmp(path + baselen, "idx"))
-			seen_bits |= 2;
+			seen_bits |= PACKDIR_FILE_IDX;
+		else if (!strcmp(path + baselen, "bitmap"))
+			seen_bits |= PACKDIR_FILE_BITMAP;
+		else if (!strcmp(path + baselen, "keep"))
+			seen_bits |= PACKDIR_FILE_KEEP;
 	}
 	report_helper(list, seen_bits, first, list->nr);
 }
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index def203c..1ea8279 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -261,6 +261,8 @@ test_expect_success 'clean pack garbage with gc' '
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
+warning: no corresponding .pack: .git/objects/pack/fake2.idx
+warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/3] t5304: Add test for .bitmap garbage files
  2015-11-14  0:10 [PATCH 0/3] Add cleanup for garbage .bitmap files Doug Kelly
  2015-11-14  0:10 ` [PATCH 1/3] prepare_packed_git(): find more garbage Doug Kelly
@ 2015-11-14  0:10 ` Doug Kelly
  2015-11-14  0:47   ` Stefan Beller
  2015-11-14  0:10 ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  2 siblings, 1 reply; 33+ messages in thread
From: Doug Kelly @ 2015-11-14  0:10 UTC (permalink / raw)
  To: git; +Cc: Doug Kelly

When checking for pack garbage, .bitmap files are now detected as
garbage when not associated with another .pack/.idx file.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 t/t5304-prune.sh | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 1ea8279..4fa6e7a 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -230,6 +230,12 @@ test_expect_success 'garbage report in count-objects -v' '
 	: >.git/objects/pack/fake.idx &&
 	: >.git/objects/pack/fake2.keep &&
 	: >.git/objects/pack/fake3.idx &&
+	: >.git/objects/pack/fake4.bitmap &&
+	: >.git/objects/pack/fake5.bitmap &&
+	: >.git/objects/pack/fake5.idx &&
+	: >.git/objects/pack/fake6.keep &&
+	: >.git/objects/pack/fake6.bitmap &&
+	: >.git/objects/pack/fake6.idx &&
 	git count-objects -v 2>stderr &&
 	grep "index file .git/objects/pack/fake.idx is too small" stderr &&
 	grep "^warning:" stderr | sort >actual &&
@@ -238,14 +244,20 @@ warning: garbage found: .git/objects/pack/fake.bar
 warning: garbage found: .git/objects/pack/foo
 warning: garbage found: .git/objects/pack/foo.bar
 warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
+warning: no corresponding .idx or .pack: .git/objects/pack/fake4.bitmap
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
 warning: no corresponding .pack: .git/objects/pack/fake3.idx
+warning: no corresponding .pack: .git/objects/pack/fake5.bitmap
+warning: no corresponding .pack: .git/objects/pack/fake5.idx
+warning: no corresponding .pack: .git/objects/pack/fake6.bitmap
+warning: no corresponding .pack: .git/objects/pack/fake6.idx
+warning: no corresponding .pack: .git/objects/pack/fake6.keep
 EOF
 	test_cmp expected actual
 '
 
-test_expect_success 'clean pack garbage with gc' '
+test_expect_failure 'clean pack garbage with gc' '
 	test_when_finished "rm -f .git/objects/pack/fake*" &&
 	test_when_finished "rm -f .git/objects/pack/foo*" &&
 	: >.git/objects/pack/foo.keep &&
@@ -254,15 +266,21 @@ test_expect_success 'clean pack garbage with gc' '
 	: >.git/objects/pack/fake2.keep &&
 	: >.git/objects/pack/fake2.idx &&
 	: >.git/objects/pack/fake3.keep &&
+	: >.git/objects/pack/fake4.bitmap &&
+	: >.git/objects/pack/fake5.bitmap &&
+	: >.git/objects/pack/fake5.idx &&
+	: >.git/objects/pack/fake6.keep &&
+	: >.git/objects/pack/fake6.bitmap &&
+	: >.git/objects/pack/fake6.idx &&
 	git gc &&
 	git count-objects -v 2>stderr &&
 	grep "^warning:" stderr | sort >actual &&
 	cat >expected <<\EOF &&
+warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
+warning: no corresponding .idx or .pack: .git/objects/pack/fake6.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
-warning: no corresponding .pack: .git/objects/pack/fake2.idx
-warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir
  2015-11-14  0:10 [PATCH 0/3] Add cleanup for garbage .bitmap files Doug Kelly
  2015-11-14  0:10 ` [PATCH 1/3] prepare_packed_git(): find more garbage Doug Kelly
  2015-11-14  0:10 ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
@ 2015-11-14  0:10 ` Doug Kelly
  2 siblings, 0 replies; 33+ messages in thread
From: Doug Kelly @ 2015-11-14  0:10 UTC (permalink / raw)
  To: git; +Cc: Doug Kelly

Similar to cleaning up excess .idx files, clean any garbage .bitmap
files that are not otherwise associated with any .idx/.pack files.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 builtin/gc.c     | 12 ++++++++++--
 t/t5304-prune.sh |  2 +-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index c583aad..7ddf071 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -58,8 +58,16 @@ static void clean_pack_garbage(void)
 
 static void report_pack_garbage(unsigned seen_bits, const char *path)
 {
-	if (seen_bits == PACKDIR_FILE_IDX)
-		string_list_append(&pack_garbage, path);
+	if (seen_bits & PACKDIR_FILE_IDX ||
+	    seen_bits & PACKDIR_FILE_BITMAP) {
+		const char *dot = strrchr(path, '.');
+		if (dot) {
+			int baselen = dot - path + 1;
+			if (!strcmp(path+baselen, "idx") ||
+				!strcmp(path+baselen, "bitmap"))
+				string_list_append(&pack_garbage, path);
+		}
+	}
 }
 
 static void git_config_date_string(const char *key, const char **output)
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 4fa6e7a..9f9f263 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -257,7 +257,7 @@ EOF
 	test_cmp expected actual
 '
 
-test_expect_failure 'clean pack garbage with gc' '
+test_expect_success 'clean pack garbage with gc' '
 	test_when_finished "rm -f .git/objects/pack/fake*" &&
 	test_when_finished "rm -f .git/objects/pack/foo*" &&
 	: >.git/objects/pack/foo.keep &&
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-11-14  0:10 ` [PATCH 1/3] prepare_packed_git(): find more garbage Doug Kelly
@ 2015-11-14  0:43   ` Stefan Beller
  2015-11-14  0:46     ` Doug Kelly
  2015-11-14  0:47     ` Doug Kelly
  0 siblings, 2 replies; 33+ messages in thread
From: Stefan Beller @ 2015-11-14  0:43 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git

> +       else if (seen_bits & PACKDIR_FILE_PACK && seen_bits ^ ~PACKDIR_FILE_IDX)

as just talked about: did you mention && !(seen_bits & FILE_IDX)
>
> +       if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP))
> +               return;
> +
> +       if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_KEEP))
> +               return;
> +
> +       if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP|PACKDIR_FILE_KEEP))
> +               return;

I wonder if this should be rewritten as
    if (seen_bits & FILE_PACK && seen_bits & FILE_IDX
        && (seen_bits & FILE_KEEP || seen_bits & BITMAP))
            return;

to dense it a bit. ;)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-11-14  0:43   ` Stefan Beller
@ 2015-11-14  0:46     ` Doug Kelly
  2015-11-14  0:46       ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
                         ` (2 more replies)
  2015-11-14  0:47     ` Doug Kelly
  1 sibling, 3 replies; 33+ messages in thread
From: Doug Kelly @ 2015-11-14  0:46 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, Doug Kelly

.bitmap and .keep files without .idx/.pack don't make much sense, so
make sure these are reported as garbage as well.  At the same time,
refactoring report_garbage to handle extra bits.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 builtin/count-objects.c | 16 ++++++----------
 cache.h                 |  4 +++-
 sha1_file.c             | 17 +++++++++++++++--
 t/t5304-prune.sh        |  2 ++
 4 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index ba92919..1637037 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -17,19 +17,15 @@ static off_t loose_size;
 
 static const char *bits_to_msg(unsigned seen_bits)
 {
-	switch (seen_bits) {
-	case 0:
-		return "no corresponding .idx or .pack";
-	case PACKDIR_FILE_GARBAGE:
+	if (seen_bits ==  PACKDIR_FILE_GARBAGE)
 		return "garbage found";
-	case PACKDIR_FILE_PACK:
+	else if (seen_bits & PACKDIR_FILE_PACK && seen_bits ^ ~PACKDIR_FILE_IDX)
 		return "no corresponding .idx";
-	case PACKDIR_FILE_IDX:
+	else if (seen_bits & PACKDIR_FILE_IDX && seen_bits ^ ~PACKDIR_FILE_PACK)
 		return "no corresponding .pack";
-	case PACKDIR_FILE_PACK|PACKDIR_FILE_IDX:
-	default:
-		return NULL;
-	}
+	else if (seen_bits == 0 || seen_bits ^ ~(PACKDIR_FILE_IDX|PACKDIR_FILE_PACK))
+		return "no corresponding .idx or .pack";
+	return NULL;
 }
 
 static void real_report_garbage(unsigned seen_bits, const char *path)
diff --git a/cache.h b/cache.h
index 736abc0..5b9d791 100644
--- a/cache.h
+++ b/cache.h
@@ -1292,7 +1292,9 @@ extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
-#define PACKDIR_FILE_GARBAGE 4
+#define PACKDIR_FILE_BITMAP 4
+#define PACKDIR_FILE_KEEP 8
+#define PACKDIR_FILE_GARBAGE 16
 extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
 extern void prepare_packed_git(void);
diff --git a/sha1_file.c b/sha1_file.c
index 3d56746..5f939e4 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1225,6 +1225,15 @@ static void report_helper(const struct string_list *list,
 	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
 		return;
 
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP))
+		return;
+
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_KEEP))
+		return;
+
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP|PACKDIR_FILE_KEEP))
+		return;
+
 	for (; first < last; first++)
 		report_garbage(seen_bits, list->items[first].string);
 }
@@ -1256,9 +1265,13 @@ static void report_pack_garbage(struct string_list *list)
 			first = i;
 		}
 		if (!strcmp(path + baselen, "pack"))
-			seen_bits |= 1;
+			seen_bits |= PACKDIR_FILE_PACK;
 		else if (!strcmp(path + baselen, "idx"))
-			seen_bits |= 2;
+			seen_bits |= PACKDIR_FILE_IDX;
+		else if (!strcmp(path + baselen, "bitmap"))
+			seen_bits |= PACKDIR_FILE_BITMAP;
+		else if (!strcmp(path + baselen, "keep"))
+			seen_bits |= PACKDIR_FILE_KEEP;
 	}
 	report_helper(list, seen_bits, first, list->nr);
 }
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index def203c..1ea8279 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -261,6 +261,8 @@ test_expect_success 'clean pack garbage with gc' '
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
+warning: no corresponding .pack: .git/objects/pack/fake2.idx
+warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/3] t5304: Add test for .bitmap garbage files
  2015-11-14  0:46     ` Doug Kelly
@ 2015-11-14  0:46       ` Doug Kelly
  2015-11-14  0:46       ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  2015-11-25 18:43       ` [PATCH 1/3] prepare_packed_git(): find more garbage Stefan Beller
  2 siblings, 0 replies; 33+ messages in thread
From: Doug Kelly @ 2015-11-14  0:46 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, Doug Kelly

When checking for pack garbage, .bitmap files are now detected as
garbage when not associated with another .pack/.idx file.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 t/t5304-prune.sh | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 1ea8279..4fa6e7a 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -230,6 +230,12 @@ test_expect_success 'garbage report in count-objects -v' '
 	: >.git/objects/pack/fake.idx &&
 	: >.git/objects/pack/fake2.keep &&
 	: >.git/objects/pack/fake3.idx &&
+	: >.git/objects/pack/fake4.bitmap &&
+	: >.git/objects/pack/fake5.bitmap &&
+	: >.git/objects/pack/fake5.idx &&
+	: >.git/objects/pack/fake6.keep &&
+	: >.git/objects/pack/fake6.bitmap &&
+	: >.git/objects/pack/fake6.idx &&
 	git count-objects -v 2>stderr &&
 	grep "index file .git/objects/pack/fake.idx is too small" stderr &&
 	grep "^warning:" stderr | sort >actual &&
@@ -238,14 +244,20 @@ warning: garbage found: .git/objects/pack/fake.bar
 warning: garbage found: .git/objects/pack/foo
 warning: garbage found: .git/objects/pack/foo.bar
 warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
+warning: no corresponding .idx or .pack: .git/objects/pack/fake4.bitmap
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
 warning: no corresponding .pack: .git/objects/pack/fake3.idx
+warning: no corresponding .pack: .git/objects/pack/fake5.bitmap
+warning: no corresponding .pack: .git/objects/pack/fake5.idx
+warning: no corresponding .pack: .git/objects/pack/fake6.bitmap
+warning: no corresponding .pack: .git/objects/pack/fake6.idx
+warning: no corresponding .pack: .git/objects/pack/fake6.keep
 EOF
 	test_cmp expected actual
 '
 
-test_expect_success 'clean pack garbage with gc' '
+test_expect_failure 'clean pack garbage with gc' '
 	test_when_finished "rm -f .git/objects/pack/fake*" &&
 	test_when_finished "rm -f .git/objects/pack/foo*" &&
 	: >.git/objects/pack/foo.keep &&
@@ -254,15 +266,21 @@ test_expect_success 'clean pack garbage with gc' '
 	: >.git/objects/pack/fake2.keep &&
 	: >.git/objects/pack/fake2.idx &&
 	: >.git/objects/pack/fake3.keep &&
+	: >.git/objects/pack/fake4.bitmap &&
+	: >.git/objects/pack/fake5.bitmap &&
+	: >.git/objects/pack/fake5.idx &&
+	: >.git/objects/pack/fake6.keep &&
+	: >.git/objects/pack/fake6.bitmap &&
+	: >.git/objects/pack/fake6.idx &&
 	git gc &&
 	git count-objects -v 2>stderr &&
 	grep "^warning:" stderr | sort >actual &&
 	cat >expected <<\EOF &&
+warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
+warning: no corresponding .idx or .pack: .git/objects/pack/fake6.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
-warning: no corresponding .pack: .git/objects/pack/fake2.idx
-warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir
  2015-11-14  0:46     ` Doug Kelly
  2015-11-14  0:46       ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
@ 2015-11-14  0:46       ` Doug Kelly
  2015-12-15 23:23         ` Jeff King
  2015-11-25 18:43       ` [PATCH 1/3] prepare_packed_git(): find more garbage Stefan Beller
  2 siblings, 1 reply; 33+ messages in thread
From: Doug Kelly @ 2015-11-14  0:46 UTC (permalink / raw)
  To: git; +Cc: peff, gitster, Doug Kelly

Similar to cleaning up excess .idx files, clean any garbage .bitmap
files that are not otherwise associated with any .idx/.pack files.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 builtin/gc.c     | 12 ++++++++++--
 t/t5304-prune.sh |  2 +-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index c583aad..7ddf071 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -58,8 +58,16 @@ static void clean_pack_garbage(void)
 
 static void report_pack_garbage(unsigned seen_bits, const char *path)
 {
-	if (seen_bits == PACKDIR_FILE_IDX)
-		string_list_append(&pack_garbage, path);
+	if (seen_bits & PACKDIR_FILE_IDX ||
+	    seen_bits & PACKDIR_FILE_BITMAP) {
+		const char *dot = strrchr(path, '.');
+		if (dot) {
+			int baselen = dot - path + 1;
+			if (!strcmp(path+baselen, "idx") ||
+				!strcmp(path+baselen, "bitmap"))
+				string_list_append(&pack_garbage, path);
+		}
+	}
 }
 
 static void git_config_date_string(const char *key, const char **output)
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 4fa6e7a..9f9f263 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -257,7 +257,7 @@ EOF
 	test_cmp expected actual
 '
 
-test_expect_failure 'clean pack garbage with gc' '
+test_expect_success 'clean pack garbage with gc' '
 	test_when_finished "rm -f .git/objects/pack/fake*" &&
 	test_when_finished "rm -f .git/objects/pack/foo*" &&
 	: >.git/objects/pack/foo.keep &&
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-11-14  0:43   ` Stefan Beller
  2015-11-14  0:46     ` Doug Kelly
@ 2015-11-14  0:47     ` Doug Kelly
  1 sibling, 0 replies; 33+ messages in thread
From: Doug Kelly @ 2015-11-14  0:47 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git

Yes, without a doubt.  I think I'm blaming this one on being late on a
Friday afternoon, and really not thinking out the logic clearly. :)

On Fri, Nov 13, 2015 at 4:43 PM, Stefan Beller <sbeller@google.com> wrote:
>> +       else if (seen_bits & PACKDIR_FILE_PACK && seen_bits ^ ~PACKDIR_FILE_IDX)
>
> as just talked about: did you mention && !(seen_bits & FILE_IDX)
>>
>> +       if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP))
>> +               return;
>> +
>> +       if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_KEEP))
>> +               return;
>> +
>> +       if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP|PACKDIR_FILE_KEEP))
>> +               return;
>
> I wonder if this should be rewritten as
>     if (seen_bits & FILE_PACK && seen_bits & FILE_IDX
>         && (seen_bits & FILE_KEEP || seen_bits & BITMAP))
>             return;
>
> to dense it a bit. ;)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/3] t5304: Add test for .bitmap garbage files
  2015-11-14  0:10 ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
@ 2015-11-14  0:47   ` Stefan Beller
  0 siblings, 0 replies; 33+ messages in thread
From: Stefan Beller @ 2015-11-14  0:47 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git

On Fri, Nov 13, 2015 at 4:10 PM, Doug Kelly <dougk.ff7@gmail.com> wrote:
> When checking for pack garbage, .bitmap files are now detected as
> garbage when not associated with another .pack/.idx file.
>
> Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
> ---
>  t/t5304-prune.sh | 24 +++++++++++++++++++++---
>  1 file changed, 21 insertions(+), 3 deletions(-)
>
> diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
> index 1ea8279..4fa6e7a 100755
> --- a/t/t5304-prune.sh
> +++ b/t/t5304-prune.sh
> @@ -230,6 +230,12 @@ test_expect_success 'garbage report in count-objects -v' '
>         : >.git/objects/pack/fake.idx &&
>         : >.git/objects/pack/fake2.keep &&
>         : >.git/objects/pack/fake3.idx &&
> +       : >.git/objects/pack/fake4.bitmap &&
> +       : >.git/objects/pack/fake5.bitmap &&
> +       : >.git/objects/pack/fake5.idx &&
> +       : >.git/objects/pack/fake6.keep &&
> +       : >.git/objects/pack/fake6.bitmap &&
> +       : >.git/objects/pack/fake6.idx &&
>         git count-objects -v 2>stderr &&
>         grep "index file .git/objects/pack/fake.idx is too small" stderr &&
>         grep "^warning:" stderr | sort >actual &&
> @@ -238,14 +244,20 @@ warning: garbage found: .git/objects/pack/fake.bar
>  warning: garbage found: .git/objects/pack/foo
>  warning: garbage found: .git/objects/pack/foo.bar
>  warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
> +warning: no corresponding .idx or .pack: .git/objects/pack/fake4.bitmap

Do we want to split that up further, into

    no corresponding .idx and .pack:...

to tell that actually both files are missing and we know it?

>  warning: no corresponding .idx: .git/objects/pack/foo.keep
>  warning: no corresponding .idx: .git/objects/pack/foo.pack
>  warning: no corresponding .pack: .git/objects/pack/fake3.idx
> +warning: no corresponding .pack: .git/objects/pack/fake5.bitmap
> +warning: no corresponding .pack: .git/objects/pack/fake5.idx

Wondering if we can condense this into one message (because only
one pack is missing).

> +warning: no corresponding .pack: .git/objects/pack/fake6.bitmap
> +warning: no corresponding .pack: .git/objects/pack/fake6.idx
> +warning: no corresponding .pack: .git/objects/pack/fake6.keep

same here.
    no corresponding .pack: .git/objects/pack/fake6.{keep,idx, bitmap}
would look nice and be shell compatible. (rm on that multi path just works,
in case you expect the pack to be gone)

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-11-14  0:46     ` Doug Kelly
  2015-11-14  0:46       ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
  2015-11-14  0:46       ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
@ 2015-11-25 18:43       ` Stefan Beller
  2015-11-26  6:15         ` Doug Kelly
  2015-11-26  6:18         ` [PATCH 1/3] prepare_packed_git(): find more garbage Doug Kelly
  2 siblings, 2 replies; 33+ messages in thread
From: Stefan Beller @ 2015-11-25 18:43 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, peff, Junio C Hamano

On Fri, Nov 13, 2015 at 4:46 PM, Doug Kelly <dougk.ff7@gmail.com> wrote:
>                 return "no corresponding .idx";
> -       case PACKDIR_FILE_IDX:
> +       else if (seen_bits & PACKDIR_FILE_IDX && seen_bits ^ ~PACKDIR_FILE_PACK)

Did you intend to use
    (seen_bits & PACKDIR_FILE_IDX && !(seen_bits & PACKDIR_FILE_PACK))
here?

I was just looking at the state in peff/pu and it still has the xor
variant, which exposes more
than just the selected bit to the decision IIRC.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-11-25 18:43       ` [PATCH 1/3] prepare_packed_git(): find more garbage Stefan Beller
@ 2015-11-26  6:15         ` Doug Kelly
  2015-12-15 23:09           ` Jeff King
  2015-11-26  6:18         ` [PATCH 1/3] prepare_packed_git(): find more garbage Doug Kelly
  1 sibling, 1 reply; 33+ messages in thread
From: Doug Kelly @ 2015-11-26  6:15 UTC (permalink / raw)
  To: git; +Cc: sbeller, peff, gitster, Doug Kelly

.bitmap and .keep files without .idx/.pack don't make much sense, so
make sure these are reported as garbage as well.  At the same time,
refactoring report_garbage to handle extra bits.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 builtin/count-objects.c | 16 ++++++----------
 cache.h                 |  4 +++-
 sha1_file.c             | 17 +++++++++++++++--
 t/t5304-prune.sh        |  2 ++
 4 files changed, 26 insertions(+), 13 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index ba92919..5197b57 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -17,19 +17,15 @@ static off_t loose_size;
 
 static const char *bits_to_msg(unsigned seen_bits)
 {
-	switch (seen_bits) {
-	case 0:
-		return "no corresponding .idx or .pack";
-	case PACKDIR_FILE_GARBAGE:
+	if (seen_bits ==  PACKDIR_FILE_GARBAGE)
 		return "garbage found";
-	case PACKDIR_FILE_PACK:
+	else if (seen_bits & PACKDIR_FILE_PACK && !(seen_bits & PACKDIR_FILE_IDX))
 		return "no corresponding .idx";
-	case PACKDIR_FILE_IDX:
+	else if (seen_bits & PACKDIR_FILE_IDX && !(seen_bits & PACKDIR_FILE_PACK))
 		return "no corresponding .pack";
-	case PACKDIR_FILE_PACK|PACKDIR_FILE_IDX:
-	default:
-		return NULL;
-	}
+	else if (seen_bits == 0 || !(seen_bits & (PACKDIR_FILE_IDX|PACKDIR_FILE_PACK)))
+		return "no corresponding .idx or .pack";
+	return NULL;
 }
 
 static void real_report_garbage(unsigned seen_bits, const char *path)
diff --git a/cache.h b/cache.h
index 736abc0..5b9d791 100644
--- a/cache.h
+++ b/cache.h
@@ -1292,7 +1292,9 @@ extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
-#define PACKDIR_FILE_GARBAGE 4
+#define PACKDIR_FILE_BITMAP 4
+#define PACKDIR_FILE_KEEP 8
+#define PACKDIR_FILE_GARBAGE 16
 extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
 extern void prepare_packed_git(void);
diff --git a/sha1_file.c b/sha1_file.c
index 3d56746..5f939e4 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1225,6 +1225,15 @@ static void report_helper(const struct string_list *list,
 	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
 		return;
 
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP))
+		return;
+
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_KEEP))
+		return;
+
+	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP|PACKDIR_FILE_KEEP))
+		return;
+
 	for (; first < last; first++)
 		report_garbage(seen_bits, list->items[first].string);
 }
@@ -1256,9 +1265,13 @@ static void report_pack_garbage(struct string_list *list)
 			first = i;
 		}
 		if (!strcmp(path + baselen, "pack"))
-			seen_bits |= 1;
+			seen_bits |= PACKDIR_FILE_PACK;
 		else if (!strcmp(path + baselen, "idx"))
-			seen_bits |= 2;
+			seen_bits |= PACKDIR_FILE_IDX;
+		else if (!strcmp(path + baselen, "bitmap"))
+			seen_bits |= PACKDIR_FILE_BITMAP;
+		else if (!strcmp(path + baselen, "keep"))
+			seen_bits |= PACKDIR_FILE_KEEP;
 	}
 	report_helper(list, seen_bits, first, list->nr);
 }
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index def203c..1ea8279 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -261,6 +261,8 @@ test_expect_success 'clean pack garbage with gc' '
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
+warning: no corresponding .pack: .git/objects/pack/fake2.idx
+warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-11-25 18:43       ` [PATCH 1/3] prepare_packed_git(): find more garbage Stefan Beller
  2015-11-26  6:15         ` Doug Kelly
@ 2015-11-26  6:18         ` Doug Kelly
  1 sibling, 0 replies; 33+ messages in thread
From: Doug Kelly @ 2015-11-26  6:18 UTC (permalink / raw)
  To: Stefan Beller; +Cc: git, Jeff King, Junio C Hamano

Apparently, I fixed this and forgot to re-run format-patch, so I sent
out the same patch the second time... My fault on that one.  I've at
least checked what I sent this time around, and it seems to match
what's in my current tree. :) The second and third patches should be
unmodified.

Thanks for catching that, Stefan!

On Wed, Nov 25, 2015 at 12:43 PM, Stefan Beller <sbeller@google.com> wrote:
> On Fri, Nov 13, 2015 at 4:46 PM, Doug Kelly <dougk.ff7@gmail.com> wrote:
>>                 return "no corresponding .idx";
>> -       case PACKDIR_FILE_IDX:
>> +       else if (seen_bits & PACKDIR_FILE_IDX && seen_bits ^ ~PACKDIR_FILE_PACK)
>
> Did you intend to use
>     (seen_bits & PACKDIR_FILE_IDX && !(seen_bits & PACKDIR_FILE_PACK))
> here?
>
> I was just looking at the state in peff/pu and it still has the xor
> variant, which exposes more
> than just the selected bit to the decision IIRC.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-11-26  6:15         ` Doug Kelly
@ 2015-12-15 23:09           ` Jeff King
  2015-12-15 23:25             ` Jeff King
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff King @ 2015-12-15 23:09 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, sbeller, gitster

On Thu, Nov 26, 2015 at 12:15:29AM -0600, Doug Kelly wrote:

> diff --git a/builtin/count-objects.c b/builtin/count-objects.c
> index ba92919..5197b57 100644
> --- a/builtin/count-objects.c
> +++ b/builtin/count-objects.c
> @@ -17,19 +17,15 @@ static off_t loose_size;
>  
>  static const char *bits_to_msg(unsigned seen_bits)
>  {
> -	switch (seen_bits) {
> -	case 0:
> -		return "no corresponding .idx or .pack";
> -	case PACKDIR_FILE_GARBAGE:
> +	if (seen_bits ==  PACKDIR_FILE_GARBAGE)
>  		return "garbage found";

It seems weird to use "==" on a bitfield. I think it is the case now
that we would never see GARBAGE alongside anything else, but I wonder if
we should future-proof that as:

  if (seen_bits & PACKDIR_FILE_GARBAGE)

Specifically, I am wondering what would happen if we had "foo.pack" and
"foo.bogus", where we do not know about the latter at all.

> -	case PACKDIR_FILE_PACK:
> +	else if (seen_bits & PACKDIR_FILE_PACK && !(seen_bits & PACKDIR_FILE_IDX))
>  		return "no corresponding .idx";
> -	case PACKDIR_FILE_IDX:
> +	else if (seen_bits & PACKDIR_FILE_IDX && !(seen_bits & PACKDIR_FILE_PACK))
>  		return "no corresponding .pack";
> -	case PACKDIR_FILE_PACK|PACKDIR_FILE_IDX:
> -	default:
> -		return NULL;
> -	}
> +	else if (seen_bits == 0 || !(seen_bits & (PACKDIR_FILE_IDX|PACKDIR_FILE_PACK)))
> +		return "no corresponding .idx or .pack";
> +	return NULL;

This bottom conditional is interesting. I understand the second half: we
saw something pack-like, but there is not matching .idx or .pack at all
(if we saw one but not the other, we would have caught it above).

But when will we get an empty seen_bits? What did we see that triggered
this function, but didn't trigger a bit (even GARBAGE)?

I don't mind if the answer is "nothing, this is future-proofing", but am
mostly curious.

> diff --git a/sha1_file.c b/sha1_file.c
> index 3d56746..5f939e4 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1225,6 +1225,15 @@ static void report_helper(const struct string_list *list,
>  	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
>  		return;
>  
> +	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP))
> +		return;
> +
> +	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_KEEP))
> +		return;
> +
> +	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX|PACKDIR_FILE_BITMAP|PACKDIR_FILE_KEEP))
> +		return;
> +

It seems like we're enumerating a lot of cases here that will explode if
we get even one more file type (e.g., we add "pack-XXX.foo" in the
future). If I understand this function correctly, we're just trying to
get rid of "boring" cases that do not need to be reported.

Isn't any case that has both a pack and an idx boring (no matter if it
has a .bitmap or .keep)?

IOW, can these four conditionals just become:

  unsigned pack_with_idx = PACKDIR_FILE_PACK | PACKDIR_FILE_IDX;

  if ((seen_bits & pack_with_idx) == pack_with_idx)
	return;

?

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir
  2015-11-14  0:46       ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
@ 2015-12-15 23:23         ` Jeff King
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff King @ 2015-12-15 23:23 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, gitster

On Fri, Nov 13, 2015 at 04:46:27PM -0800, Doug Kelly wrote:

> Similar to cleaning up excess .idx files, clean any garbage .bitmap
> files that are not otherwise associated with any .idx/.pack files.
> 
> Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
> ---
>  builtin/gc.c     | 12 ++++++++++--
>  t/t5304-prune.sh |  2 +-
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index c583aad..7ddf071 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -58,8 +58,16 @@ static void clean_pack_garbage(void)
>  
>  static void report_pack_garbage(unsigned seen_bits, const char *path)
>  {
> -	if (seen_bits == PACKDIR_FILE_IDX)
> -		string_list_append(&pack_garbage, path);
> +	if (seen_bits & PACKDIR_FILE_IDX ||
> +	    seen_bits & PACKDIR_FILE_BITMAP) {

So here we're relying on report_helper to have culled the boring cases,
right? (Sorry if that is totally obvious; I'm mostly just thinking out
loud). That makes sense, then.

> +		const char *dot = strrchr(path, '.');
> +		if (dot) {
> +			int baselen = dot - path + 1;
> +			if (!strcmp(path+baselen, "idx") ||
> +				!strcmp(path+baselen, "bitmap"))
> +				string_list_append(&pack_garbage, path);
> +		}
> +	}

I was confused at first why we couldn't just pass "path" here. But it's
because we will get a garbage report for each related file, and we want
to keep some of them (like .keep). Which I guess makes sense.

I wonder if this would be simpler to read as just:

  if (ends_with(path, ".idx") ||
      ends_with(path, ".bitmap"))
          string_list_append(&pack_garbage, path);

Technically it is less efficient because we will compute strlen(path)
twice, but that seems like premature optimization (not to mention that
ends_with is an inline, so a good compiler can probably optimize out the
second call anyway).

> -test_expect_failure 'clean pack garbage with gc' '
> +test_expect_success 'clean pack garbage with gc' '
>  	test_when_finished "rm -f .git/objects/pack/fake*" &&
>  	test_when_finished "rm -f .git/objects/pack/foo*" &&
>  	: >.git/objects/pack/foo.keep &&

Should we be checking at the end of this test that "*.keep" didn't get
blown away? It might be nice to just test_cmp the results of "ls" on the
pack directory to confirm exactly what got deleted and what didn't.

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-12-15 23:09           ` Jeff King
@ 2015-12-15 23:25             ` Jeff King
  2015-12-19  0:06               ` [PATCH v3 0/3] " Doug Kelly
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff King @ 2015-12-15 23:25 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, sbeller, gitster

On Tue, Dec 15, 2015 at 06:09:57PM -0500, Jeff King wrote:

> > @@ -1225,6 +1225,15 @@ static void report_helper(const struct string_list *list,
> [...]
> If I understand this function correctly, we're just trying to
> get rid of "boring" cases that do not need to be reported.

BTW, I wondered if this should perhaps just be calling bits_to_msg() and
seeing if it returns NULL. It seems like the logic for which cases are
"interesting" ends up duplicated. But maybe I am missing something.

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v3 0/3] prepare_packed_git(): find more garbage
  2015-12-15 23:25             ` Jeff King
@ 2015-12-19  0:06               ` Doug Kelly
  2015-12-19  0:06                 ` [PATCH 1/3] " Doug Kelly
                                   ` (4 more replies)
  0 siblings, 5 replies; 33+ messages in thread
From: Doug Kelly @ 2015-12-19  0:06 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, gitster, Doug Kelly

Corrects the issues found in review by Peff, including simplifying
the logic in report_helper().  bits_to_msg() would've been an alternate
solution to that change, however it'll get called by
real_report_garbage(), so there's no need to call it twice, especially
when the check we need within report_helper().

I think checking for seen_bits == 0 would be future-proofing should we
arrive at a file bit not otherwise match it (i.e. file.foo and
file.bar, but nothing else would cause seen_bits to be zero, but if
that's the case, we wouldn't have PACKDIR_FILE_IDX or
PACKDIR_FILE_PACK set, either, and the second half would also match.

Doug Kelly (3):
  prepare_packed_git(): find more garbage
  t5304: Add test for .bitmap garbage files
  gc: Clean garbage .bitmap files from pack dir

 builtin/count-objects.c | 16 ++++++----------
 builtin/gc.c            | 12 ++++++++++--
 cache.h                 |  4 +++-
 sha1_file.c             | 12 +++++++++---
 t/t5304-prune.sh        | 20 ++++++++++++++++++++
 5 files changed, 48 insertions(+), 16 deletions(-)

-- 
2.6.1

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/3] prepare_packed_git(): find more garbage
  2015-12-19  0:06               ` [PATCH v3 0/3] " Doug Kelly
@ 2015-12-19  0:06                 ` Doug Kelly
  2015-12-19  0:06                 ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
                                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Doug Kelly @ 2015-12-19  0:06 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, gitster, Doug Kelly

.bitmap and .keep files without .idx/.pack don't make much sense, so
make sure these are reported as garbage as well.  At the same time,
refactoring report_garbage to handle extra bits.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 builtin/count-objects.c | 16 ++++++----------
 cache.h                 |  4 +++-
 sha1_file.c             | 12 +++++++++---
 t/t5304-prune.sh        |  2 ++
 4 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index ba92919..ed103ae 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -17,19 +17,15 @@ static off_t loose_size;
 
 static const char *bits_to_msg(unsigned seen_bits)
 {
-	switch (seen_bits) {
-	case 0:
-		return "no corresponding .idx or .pack";
-	case PACKDIR_FILE_GARBAGE:
+	if (seen_bits & PACKDIR_FILE_GARBAGE)
 		return "garbage found";
-	case PACKDIR_FILE_PACK:
+	else if (seen_bits & PACKDIR_FILE_PACK && !(seen_bits & PACKDIR_FILE_IDX))
 		return "no corresponding .idx";
-	case PACKDIR_FILE_IDX:
+	else if (seen_bits & PACKDIR_FILE_IDX && !(seen_bits & PACKDIR_FILE_PACK))
 		return "no corresponding .pack";
-	case PACKDIR_FILE_PACK|PACKDIR_FILE_IDX:
-	default:
-		return NULL;
-	}
+	else if (!(seen_bits & (PACKDIR_FILE_IDX|PACKDIR_FILE_PACK)))
+		return "no corresponding .idx or .pack";
+	return NULL;
 }
 
 static void real_report_garbage(unsigned seen_bits, const char *path)
diff --git a/cache.h b/cache.h
index 736abc0..5b9d791 100644
--- a/cache.h
+++ b/cache.h
@@ -1292,7 +1292,9 @@ extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
-#define PACKDIR_FILE_GARBAGE 4
+#define PACKDIR_FILE_BITMAP 4
+#define PACKDIR_FILE_KEEP 8
+#define PACKDIR_FILE_GARBAGE 16
 extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
 extern void prepare_packed_git(void);
diff --git a/sha1_file.c b/sha1_file.c
index 3d56746..3524274 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1222,7 +1222,9 @@ void (*report_garbage)(unsigned seen_bits, const char *path);
 static void report_helper(const struct string_list *list,
 			  int seen_bits, int first, int last)
 {
-	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
+	static const int pack_and_index = PACKDIR_FILE_PACK|PACKDIR_FILE_IDX;
+
+	if ((seen_bits & pack_and_index) == pack_and_index)
 		return;
 
 	for (; first < last; first++)
@@ -1256,9 +1258,13 @@ static void report_pack_garbage(struct string_list *list)
 			first = i;
 		}
 		if (!strcmp(path + baselen, "pack"))
-			seen_bits |= 1;
+			seen_bits |= PACKDIR_FILE_PACK;
 		else if (!strcmp(path + baselen, "idx"))
-			seen_bits |= 2;
+			seen_bits |= PACKDIR_FILE_IDX;
+		else if (!strcmp(path + baselen, "bitmap"))
+			seen_bits |= PACKDIR_FILE_BITMAP;
+		else if (!strcmp(path + baselen, "keep"))
+			seen_bits |= PACKDIR_FILE_KEEP;
 	}
 	report_helper(list, seen_bits, first, list->nr);
 }
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index def203c..1ea8279 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -261,6 +261,8 @@ test_expect_success 'clean pack garbage with gc' '
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
+warning: no corresponding .pack: .git/objects/pack/fake2.idx
+warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/3] t5304: Add test for .bitmap garbage files
  2015-12-19  0:06               ` [PATCH v3 0/3] " Doug Kelly
  2015-12-19  0:06                 ` [PATCH 1/3] " Doug Kelly
@ 2015-12-19  0:06                 ` Doug Kelly
  2015-12-19  0:06                 ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
                                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 33+ messages in thread
From: Doug Kelly @ 2015-12-19  0:06 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, gitster, Doug Kelly

When checking for pack garbage, .bitmap files are now detected as
garbage when not associated with another .pack/.idx file.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 t/t5304-prune.sh | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 1ea8279..4fa6e7a 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -230,6 +230,12 @@ test_expect_success 'garbage report in count-objects -v' '
 	: >.git/objects/pack/fake.idx &&
 	: >.git/objects/pack/fake2.keep &&
 	: >.git/objects/pack/fake3.idx &&
+	: >.git/objects/pack/fake4.bitmap &&
+	: >.git/objects/pack/fake5.bitmap &&
+	: >.git/objects/pack/fake5.idx &&
+	: >.git/objects/pack/fake6.keep &&
+	: >.git/objects/pack/fake6.bitmap &&
+	: >.git/objects/pack/fake6.idx &&
 	git count-objects -v 2>stderr &&
 	grep "index file .git/objects/pack/fake.idx is too small" stderr &&
 	grep "^warning:" stderr | sort >actual &&
@@ -238,14 +244,20 @@ warning: garbage found: .git/objects/pack/fake.bar
 warning: garbage found: .git/objects/pack/foo
 warning: garbage found: .git/objects/pack/foo.bar
 warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
+warning: no corresponding .idx or .pack: .git/objects/pack/fake4.bitmap
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
 warning: no corresponding .pack: .git/objects/pack/fake3.idx
+warning: no corresponding .pack: .git/objects/pack/fake5.bitmap
+warning: no corresponding .pack: .git/objects/pack/fake5.idx
+warning: no corresponding .pack: .git/objects/pack/fake6.bitmap
+warning: no corresponding .pack: .git/objects/pack/fake6.idx
+warning: no corresponding .pack: .git/objects/pack/fake6.keep
 EOF
 	test_cmp expected actual
 '
 
-test_expect_success 'clean pack garbage with gc' '
+test_expect_failure 'clean pack garbage with gc' '
 	test_when_finished "rm -f .git/objects/pack/fake*" &&
 	test_when_finished "rm -f .git/objects/pack/foo*" &&
 	: >.git/objects/pack/foo.keep &&
@@ -254,15 +266,21 @@ test_expect_success 'clean pack garbage with gc' '
 	: >.git/objects/pack/fake2.keep &&
 	: >.git/objects/pack/fake2.idx &&
 	: >.git/objects/pack/fake3.keep &&
+	: >.git/objects/pack/fake4.bitmap &&
+	: >.git/objects/pack/fake5.bitmap &&
+	: >.git/objects/pack/fake5.idx &&
+	: >.git/objects/pack/fake6.keep &&
+	: >.git/objects/pack/fake6.bitmap &&
+	: >.git/objects/pack/fake6.idx &&
 	git gc &&
 	git count-objects -v 2>stderr &&
 	grep "^warning:" stderr | sort >actual &&
 	cat >expected <<\EOF &&
+warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
+warning: no corresponding .idx or .pack: .git/objects/pack/fake6.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
-warning: no corresponding .pack: .git/objects/pack/fake2.idx
-warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir
  2015-12-19  0:06               ` [PATCH v3 0/3] " Doug Kelly
  2015-12-19  0:06                 ` [PATCH 1/3] " Doug Kelly
  2015-12-19  0:06                 ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
@ 2015-12-19  0:06                 ` Doug Kelly
  2015-12-19  2:01                   ` Jeff King
  2015-12-19  2:02                 ` [PATCH v3 0/3] prepare_packed_git(): find more garbage Jeff King
  2016-01-13 17:07                 ` [PATCH v4 0/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  4 siblings, 1 reply; 33+ messages in thread
From: Doug Kelly @ 2015-12-19  0:06 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, gitster, Doug Kelly

Similar to cleaning up excess .idx files, clean any garbage .bitmap
files that are not otherwise associated with any .idx/.pack files.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 builtin/gc.c     | 12 ++++++++++--
 t/t5304-prune.sh |  2 +-
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index c583aad..7ddf071 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -58,8 +58,16 @@ static void clean_pack_garbage(void)
 
 static void report_pack_garbage(unsigned seen_bits, const char *path)
 {
-	if (seen_bits == PACKDIR_FILE_IDX)
-		string_list_append(&pack_garbage, path);
+	if (seen_bits & PACKDIR_FILE_IDX ||
+	    seen_bits & PACKDIR_FILE_BITMAP) {
+		const char *dot = strrchr(path, '.');
+		if (dot) {
+			int baselen = dot - path + 1;
+			if (!strcmp(path+baselen, "idx") ||
+				!strcmp(path+baselen, "bitmap"))
+				string_list_append(&pack_garbage, path);
+		}
+	}
 }
 
 static void git_config_date_string(const char *key, const char **output)
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 4fa6e7a..9f9f263 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -257,7 +257,7 @@ EOF
 	test_cmp expected actual
 '
 
-test_expect_failure 'clean pack garbage with gc' '
+test_expect_success 'clean pack garbage with gc' '
 	test_when_finished "rm -f .git/objects/pack/fake*" &&
 	test_when_finished "rm -f .git/objects/pack/foo*" &&
 	: >.git/objects/pack/foo.keep &&
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir
  2015-12-19  0:06                 ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
@ 2015-12-19  2:01                   ` Jeff King
  0 siblings, 0 replies; 33+ messages in thread
From: Jeff King @ 2015-12-19  2:01 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, sbeller, gitster

On Fri, Dec 18, 2015 at 06:06:40PM -0600, Doug Kelly wrote:

> Similar to cleaning up excess .idx files, clean any garbage .bitmap
> files that are not otherwise associated with any .idx/.pack files.
> 
> Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
> ---
>  builtin/gc.c     | 12 ++++++++++--
>  t/t5304-prune.sh |  2 +-
>  2 files changed, 11 insertions(+), 3 deletions(-)
> 
> diff --git a/builtin/gc.c b/builtin/gc.c
> index c583aad..7ddf071 100644
> --- a/builtin/gc.c
> +++ b/builtin/gc.c
> @@ -58,8 +58,16 @@ static void clean_pack_garbage(void)
>  
>  static void report_pack_garbage(unsigned seen_bits, const char *path)
>  {
> -	if (seen_bits == PACKDIR_FILE_IDX)
> -		string_list_append(&pack_garbage, path);
> +	if (seen_bits & PACKDIR_FILE_IDX ||
> +	    seen_bits & PACKDIR_FILE_BITMAP) {
> +		const char *dot = strrchr(path, '.');
> +		if (dot) {
> +			int baselen = dot - path + 1;
> +			if (!strcmp(path+baselen, "idx") ||
> +				!strcmp(path+baselen, "bitmap"))
> +				string_list_append(&pack_garbage, path);
> +		}
> +	}
>  }

Hmm. Thinking on this further, do we actually need to check seen_bits
here at all?

The original was trying to ask "is this a .idx file" by checking
seen_bits.  That was actually broken by the first patch in this series
for some cases, as we might send more bits. E.g., if you have "foo.idx"
and "foo.pack", this function will get called twice (once per file), but
with seen_bits set to IDX|BITMAP for both cases. And we would not match
the "==" above, and would therefore fail to trigger.

That case is re-fixed by this patch, which is good. But I think
seen_bits is not really telling us anything at this point. We know it's
a garbage case, or else report_helper wouldn't have passed it along to
us. But we care only about the extension in the path, which is what
distinguishes each individual call to this function.

So we can just check that.  I also think the logic may be clearer if we
handle each extension exhaustively, like:

  /* We know these are useless without the matching .pack */
  if (ends_with(path, ".bitmap") || ends_with(path, ".idx")) {
          string_list_append(&pack_garbage, path);
	  return;
  }

  /*
   * A pack without other files cannot be used, but should be saved,
   * as this is a recoverable situation (we may even see it racily
   * as new packs come into existence).
   */
  if (ends_with(path, ".pack"))
	  return;

  /*
   * A .keep file is useless without the matching pack, but it
   * _could_ contain information generated by the user. Let's keep it.
   * In the future, we may expand this to look for obvious leftover
   * receive-pack locks and drop them.
   */
  if (ends_with(path, ".keep"))
          return;

  /*
   * A totally unrelated garbage file should be kept, to err
   * on the conservative side.
   */
  if (seen_bits & PACKDIR_FILE_GARBAGE)
	return;

  /*
   * We have a file type that the garbage-reporting functions
   * know about but we don't. This function needs updating.
   */
  die("BUG: report_pack_garbage confused");

> -test_expect_failure 'clean pack garbage with gc' '
> +test_expect_success 'clean pack garbage with gc' '
>  	test_when_finished "rm -f .git/objects/pack/fake*" &&
>  	test_when_finished "rm -f .git/objects/pack/foo*" &&
>  	: >.git/objects/pack/foo.keep &&

And I think here we should make sure that we are covering the above
situations (and especially that we are keeping files that should be
kept).

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 0/3] prepare_packed_git(): find more garbage
  2015-12-19  0:06               ` [PATCH v3 0/3] " Doug Kelly
                                   ` (2 preceding siblings ...)
  2015-12-19  0:06                 ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
@ 2015-12-19  2:02                 ` Jeff King
  2015-12-19  2:03                   ` Jeff King
  2016-01-13 17:07                 ` [PATCH v4 0/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  4 siblings, 1 reply; 33+ messages in thread
From: Jeff King @ 2015-12-19  2:02 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, sbeller, gitster

On Fri, Dec 18, 2015 at 06:06:37PM -0600, Doug Kelly wrote:

> Corrects the issues found in review by Peff, including simplifying
> the logic in report_helper().  bits_to_msg() would've been an alternate
> solution to that change, however it'll get called by
> real_report_garbage(), so there's no need to call it twice, especially
> when the check we need within report_helper().

OK. The new logic in 1/3 looks fine to me.

> I think checking for seen_bits == 0 would be future-proofing should we
> arrive at a file bit not otherwise match it (i.e. file.foo and
> file.bar, but nothing else would cause seen_bits to be zero, but if
> that's the case, we wouldn't have PACKDIR_FILE_IDX or
> PACKDIR_FILE_PACK set, either, and the second half would also match.

Yeah, I think this is sound.

I left a few comments on 3/3. I don't think it's _wrong_, but I think we
can be a bit more thorough (and IMHO, a little more maintainable, but
others might disagree).

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 0/3] prepare_packed_git(): find more garbage
  2015-12-19  2:02                 ` [PATCH v3 0/3] prepare_packed_git(): find more garbage Jeff King
@ 2015-12-19  2:03                   ` Jeff King
  2016-01-11 16:35                     ` Stefan Beller
  0 siblings, 1 reply; 33+ messages in thread
From: Jeff King @ 2015-12-19  2:03 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, sbeller, gitster

On Fri, Dec 18, 2015 at 09:02:47PM -0500, Jeff King wrote:

> I left a few comments on 3/3. I don't think it's _wrong_, but I think we
> can be a bit more thorough (and IMHO, a little more maintainable, but
> others might disagree).

Oh, and I forgot to say thank you for working on this. :)

-Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH v3 0/3] prepare_packed_git(): find more garbage
  2015-12-19  2:03                   ` Jeff King
@ 2016-01-11 16:35                     ` Stefan Beller
  0 siblings, 0 replies; 33+ messages in thread
From: Stefan Beller @ 2016-01-11 16:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Doug Kelly, git, Junio C Hamano

On Fri, Dec 18, 2015 at 6:03 PM, Jeff King <peff@peff.net> wrote:
> On Fri, Dec 18, 2015 at 09:02:47PM -0500, Jeff King wrote:
>
>> I left a few comments on 3/3. I don't think it's _wrong_, but I think we
>> can be a bit more thorough (and IMHO, a little more maintainable, but
>> others might disagree).
>
> Oh, and I forgot to say thank you for working on this. :)

Thanks for working on this from me, too!
[PATCH 1/3] prepare_packed_git(): find more garbage looks good to me.


>
> -Peff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v4 0/4] gc: Clean garbage .bitmap files from pack dir
  2015-12-19  0:06               ` [PATCH v3 0/3] " Doug Kelly
                                   ` (3 preceding siblings ...)
  2015-12-19  2:02                 ` [PATCH v3 0/3] prepare_packed_git(): find more garbage Jeff King
@ 2016-01-13 17:07                 ` Doug Kelly
  2016-01-13 17:07                   ` [PATCH 1/4] prepare_packed_git(): find more garbage Doug Kelly
                                     ` (3 more replies)
  4 siblings, 4 replies; 33+ messages in thread
From: Doug Kelly @ 2016-01-13 17:07 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, Doug Kelly

Updated version to prepare_packed_git based on additional feedback,
including Peff's idea to explicitly call out each file we want to
handle in gc.  Also adds a test case in t5304 to ensure files we
explicitly don't wnat to touch are not modified.

Doug Kelly (4):
  prepare_packed_git(): find more garbage
  t5304: Add test for .bitmap garbage files
  t5304: Ensure wanted files are not deleted
  gc: Clean garbage .bitmap files from pack dir

 builtin/count-objects.c | 16 ++++++----------
 builtin/gc.c            | 35 ++++++++++++++++++++++++++++++++++-
 cache.h                 |  4 +++-
 sha1_file.c             | 12 +++++++++---
 t/t5304-prune.sh        | 37 +++++++++++++++++++++++++++++++++++++
 5 files changed, 89 insertions(+), 15 deletions(-)

-- 
2.6.1

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH 1/4] prepare_packed_git(): find more garbage
  2016-01-13 17:07                 ` [PATCH v4 0/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
@ 2016-01-13 17:07                   ` Doug Kelly
  2016-01-13 17:07                   ` [PATCH 2/4] t5304: Add test for .bitmap garbage files Doug Kelly
                                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 33+ messages in thread
From: Doug Kelly @ 2016-01-13 17:07 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, Doug Kelly

.bitmap and .keep files without .idx/.pack don't make much sense, so
make sure these are reported as garbage as well.  At the same time,
refactoring report_garbage to handle extra bits.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 builtin/count-objects.c | 16 ++++++----------
 cache.h                 |  4 +++-
 sha1_file.c             | 12 +++++++++---
 t/t5304-prune.sh        |  2 ++
 4 files changed, 20 insertions(+), 14 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index ba92919..ed103ae 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -17,19 +17,15 @@ static off_t loose_size;
 
 static const char *bits_to_msg(unsigned seen_bits)
 {
-	switch (seen_bits) {
-	case 0:
-		return "no corresponding .idx or .pack";
-	case PACKDIR_FILE_GARBAGE:
+	if (seen_bits & PACKDIR_FILE_GARBAGE)
 		return "garbage found";
-	case PACKDIR_FILE_PACK:
+	else if (seen_bits & PACKDIR_FILE_PACK && !(seen_bits & PACKDIR_FILE_IDX))
 		return "no corresponding .idx";
-	case PACKDIR_FILE_IDX:
+	else if (seen_bits & PACKDIR_FILE_IDX && !(seen_bits & PACKDIR_FILE_PACK))
 		return "no corresponding .pack";
-	case PACKDIR_FILE_PACK|PACKDIR_FILE_IDX:
-	default:
-		return NULL;
-	}
+	else if (!(seen_bits & (PACKDIR_FILE_IDX|PACKDIR_FILE_PACK)))
+		return "no corresponding .idx or .pack";
+	return NULL;
 }
 
 static void real_report_garbage(unsigned seen_bits, const char *path)
diff --git a/cache.h b/cache.h
index dfc459c..aee1d51 100644
--- a/cache.h
+++ b/cache.h
@@ -1317,7 +1317,9 @@ extern struct packed_git *parse_pack_index(unsigned char *sha1, const char *idx_
 /* A hook to report invalid files in pack directory */
 #define PACKDIR_FILE_PACK 1
 #define PACKDIR_FILE_IDX 2
-#define PACKDIR_FILE_GARBAGE 4
+#define PACKDIR_FILE_BITMAP 4
+#define PACKDIR_FILE_KEEP 8
+#define PACKDIR_FILE_GARBAGE 16
 extern void (*report_garbage)(unsigned seen_bits, const char *path);
 
 extern void prepare_packed_git(void);
diff --git a/sha1_file.c b/sha1_file.c
index 73ccd49..b21b2ba 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1223,7 +1223,9 @@ void (*report_garbage)(unsigned seen_bits, const char *path);
 static void report_helper(const struct string_list *list,
 			  int seen_bits, int first, int last)
 {
-	if (seen_bits == (PACKDIR_FILE_PACK|PACKDIR_FILE_IDX))
+	static const int pack_and_index = PACKDIR_FILE_PACK|PACKDIR_FILE_IDX;
+
+	if ((seen_bits & pack_and_index) == pack_and_index)
 		return;
 
 	for (; first < last; first++)
@@ -1257,9 +1259,13 @@ static void report_pack_garbage(struct string_list *list)
 			first = i;
 		}
 		if (!strcmp(path + baselen, "pack"))
-			seen_bits |= 1;
+			seen_bits |= PACKDIR_FILE_PACK;
 		else if (!strcmp(path + baselen, "idx"))
-			seen_bits |= 2;
+			seen_bits |= PACKDIR_FILE_IDX;
+		else if (!strcmp(path + baselen, "bitmap"))
+			seen_bits |= PACKDIR_FILE_BITMAP;
+		else if (!strcmp(path + baselen, "keep"))
+			seen_bits |= PACKDIR_FILE_KEEP;
 	}
 	report_helper(list, seen_bits, first, list->nr);
 }
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index def203c..1ea8279 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -261,6 +261,8 @@ test_expect_success 'clean pack garbage with gc' '
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
+warning: no corresponding .pack: .git/objects/pack/fake2.idx
+warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 2/4] t5304: Add test for .bitmap garbage files
  2016-01-13 17:07                 ` [PATCH v4 0/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  2016-01-13 17:07                   ` [PATCH 1/4] prepare_packed_git(): find more garbage Doug Kelly
@ 2016-01-13 17:07                   ` Doug Kelly
  2016-01-13 20:42                     ` Junio C Hamano
  2016-01-13 17:07                   ` [PATCH 3/4] t5304: Ensure wanted files are not deleted Doug Kelly
  2016-01-13 17:07                   ` [PATCH 4/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  3 siblings, 1 reply; 33+ messages in thread
From: Doug Kelly @ 2016-01-13 17:07 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, Doug Kelly

When checking for pack garbage, .bitmap files are now detected as
garbage when not associated with another .pack/.idx file.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 t/t5304-prune.sh | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 1ea8279..4fa6e7a 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -230,6 +230,12 @@ test_expect_success 'garbage report in count-objects -v' '
 	: >.git/objects/pack/fake.idx &&
 	: >.git/objects/pack/fake2.keep &&
 	: >.git/objects/pack/fake3.idx &&
+	: >.git/objects/pack/fake4.bitmap &&
+	: >.git/objects/pack/fake5.bitmap &&
+	: >.git/objects/pack/fake5.idx &&
+	: >.git/objects/pack/fake6.keep &&
+	: >.git/objects/pack/fake6.bitmap &&
+	: >.git/objects/pack/fake6.idx &&
 	git count-objects -v 2>stderr &&
 	grep "index file .git/objects/pack/fake.idx is too small" stderr &&
 	grep "^warning:" stderr | sort >actual &&
@@ -238,14 +244,20 @@ warning: garbage found: .git/objects/pack/fake.bar
 warning: garbage found: .git/objects/pack/foo
 warning: garbage found: .git/objects/pack/foo.bar
 warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
+warning: no corresponding .idx or .pack: .git/objects/pack/fake4.bitmap
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
 warning: no corresponding .pack: .git/objects/pack/fake3.idx
+warning: no corresponding .pack: .git/objects/pack/fake5.bitmap
+warning: no corresponding .pack: .git/objects/pack/fake5.idx
+warning: no corresponding .pack: .git/objects/pack/fake6.bitmap
+warning: no corresponding .pack: .git/objects/pack/fake6.idx
+warning: no corresponding .pack: .git/objects/pack/fake6.keep
 EOF
 	test_cmp expected actual
 '
 
-test_expect_success 'clean pack garbage with gc' '
+test_expect_failure 'clean pack garbage with gc' '
 	test_when_finished "rm -f .git/objects/pack/fake*" &&
 	test_when_finished "rm -f .git/objects/pack/foo*" &&
 	: >.git/objects/pack/foo.keep &&
@@ -254,15 +266,21 @@ test_expect_success 'clean pack garbage with gc' '
 	: >.git/objects/pack/fake2.keep &&
 	: >.git/objects/pack/fake2.idx &&
 	: >.git/objects/pack/fake3.keep &&
+	: >.git/objects/pack/fake4.bitmap &&
+	: >.git/objects/pack/fake5.bitmap &&
+	: >.git/objects/pack/fake5.idx &&
+	: >.git/objects/pack/fake6.keep &&
+	: >.git/objects/pack/fake6.bitmap &&
+	: >.git/objects/pack/fake6.idx &&
 	git gc &&
 	git count-objects -v 2>stderr &&
 	grep "^warning:" stderr | sort >actual &&
 	cat >expected <<\EOF &&
+warning: no corresponding .idx or .pack: .git/objects/pack/fake2.keep
 warning: no corresponding .idx or .pack: .git/objects/pack/fake3.keep
+warning: no corresponding .idx or .pack: .git/objects/pack/fake6.keep
 warning: no corresponding .idx: .git/objects/pack/foo.keep
 warning: no corresponding .idx: .git/objects/pack/foo.pack
-warning: no corresponding .pack: .git/objects/pack/fake2.idx
-warning: no corresponding .pack: .git/objects/pack/fake2.keep
 EOF
 	test_cmp expected actual
 '
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 3/4] t5304: Ensure wanted files are not deleted
  2016-01-13 17:07                 ` [PATCH v4 0/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  2016-01-13 17:07                   ` [PATCH 1/4] prepare_packed_git(): find more garbage Doug Kelly
  2016-01-13 17:07                   ` [PATCH 2/4] t5304: Add test for .bitmap garbage files Doug Kelly
@ 2016-01-13 17:07                   ` Doug Kelly
  2016-01-13 20:55                     ` Junio C Hamano
  2016-01-13 17:07                   ` [PATCH 4/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
  3 siblings, 1 reply; 33+ messages in thread
From: Doug Kelly @ 2016-01-13 17:07 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, Doug Kelly

Explicitly test for and ensure files that may be wanted are not
deleted during a gc operation.  These include .pack without .idx
(which may be in-flight), garbage in the directory, and .keep files
the user created.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
---
 t/t5304-prune.sh | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 4fa6e7a..f7c380c 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -285,6 +285,23 @@ EOF
 	test_cmp expected actual
 '
 
+test_expect_success 'ensure unknown garbage kept with gc' '
+	test_when_finished "rm -f .git/objects/pack/fake*" &&
+	test_when_finished "rm -f .git/objects/pack/foo*" &&
+	: >.git/objects/pack/foo.keep &&
+	: >.git/objects/pack/fake.pack &&
+	: >.git/objects/pack/fake2.foo &&
+	git gc &&
+	git count-objects -v 2>stderr &&
+	grep "^warning:" stderr | sort >actual &&
+	cat >expected <<\EOF &&
+warning: garbage found: .git/objects/pack/fake2.foo
+warning: no corresponding .idx or .pack: .git/objects/pack/foo.keep
+warning: no corresponding .idx: .git/objects/pack/fake.pack
+EOF
+	test_cmp expected actual
+'
+
 test_expect_success 'prune .git/shallow' '
 	SHA1=`echo hi|git commit-tree HEAD^{tree}` &&
 	echo $SHA1 >.git/shallow &&
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH 4/4] gc: Clean garbage .bitmap files from pack dir
  2016-01-13 17:07                 ` [PATCH v4 0/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
                                     ` (2 preceding siblings ...)
  2016-01-13 17:07                   ` [PATCH 3/4] t5304: Ensure wanted files are not deleted Doug Kelly
@ 2016-01-13 17:07                   ` Doug Kelly
  3 siblings, 0 replies; 33+ messages in thread
From: Doug Kelly @ 2016-01-13 17:07 UTC (permalink / raw)
  To: git; +Cc: peff, sbeller, Doug Kelly

Similar to cleaning up excess .idx files, clean any garbage .bitmap
files that are not otherwise associated with any .idx/.pack files.

Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
Suggested-by: Jeff King <peff@peff.net>
---
 builtin/gc.c     | 35 ++++++++++++++++++++++++++++++++++-
 t/t5304-prune.sh |  2 +-
 2 files changed, 35 insertions(+), 2 deletions(-)

diff --git a/builtin/gc.c b/builtin/gc.c
index c583aad..79e9886 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -58,8 +58,41 @@ static void clean_pack_garbage(void)
 
 static void report_pack_garbage(unsigned seen_bits, const char *path)
 {
-	if (seen_bits == PACKDIR_FILE_IDX)
+	/* We know these are useless without the matching .pack */
+	if (ends_with(path, ".bitmap") || ends_with(path, ".idx")) {
 		string_list_append(&pack_garbage, path);
+		return;
+	}
+
+	/*
+	 * A pack without other files cannot be used, but should be saved,
+	 * as this is a recoverable situation (we may even see it racily
+	 * as new packs come into existence).
+	 */
+	if (ends_with(path, ".pack"))
+		return;
+
+	/*
+	 * A .keep file is useless without the matching pack, but it
+	 * _could_ contain information generated by the user. Let's keep it.
+	 * In the future, we may expand this to look for obvious leftover
+	 * receive-pack locks and drop them.
+	 */
+	if (ends_with(path, ".keep"))
+		return;
+
+	/*
+	 * A totally unrelated garbage file should be kept, to err
+	 * on the conservative side.
+	 */
+	if (seen_bits & PACKDIR_FILE_GARBAGE)
+		return;
+
+	/*
+	 * We have a file type that the garbage-reporting functions
+	 * know about but we don't. This function needs updating.
+	 */
+	die("BUG: report_pack_garbage confused");
 }
 
 static void git_config_date_string(const char *key, const char **output)
diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index f7c380c..cbcc0c0 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -257,7 +257,7 @@ EOF
 	test_cmp expected actual
 '
 
-test_expect_failure 'clean pack garbage with gc' '
+test_expect_success 'clean pack garbage with gc' '
 	test_when_finished "rm -f .git/objects/pack/fake*" &&
 	test_when_finished "rm -f .git/objects/pack/foo*" &&
 	: >.git/objects/pack/foo.keep &&
-- 
2.6.1

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: [PATCH 2/4] t5304: Add test for .bitmap garbage files
  2016-01-13 17:07                   ` [PATCH 2/4] t5304: Add test for .bitmap garbage files Doug Kelly
@ 2016-01-13 20:42                     ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2016-01-13 20:42 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, peff, sbeller

Doug Kelly <dougk.ff7@gmail.com> writes:

> When checking for pack garbage, .bitmap files are now detected as
> garbage when not associated with another .pack/.idx file.

Probably the above would read better with s/are now/should now be/,
as the _expect_failure this step introduces will be corrected with 
Patch 4/4.

Also I'd suggest s/another/any/.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/4] t5304: Ensure wanted files are not deleted
  2016-01-13 17:07                   ` [PATCH 3/4] t5304: Ensure wanted files are not deleted Doug Kelly
@ 2016-01-13 20:55                     ` Junio C Hamano
  2016-01-18 16:54                       ` Doug Kelly
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2016-01-13 20:55 UTC (permalink / raw)
  To: Doug Kelly; +Cc: git, peff, sbeller

Doug Kelly <dougk.ff7@gmail.com> writes:

> Subject: Re: [PATCH 3/4] t5304: Ensure wanted files are not deleted

I'd suggest s/wanted/non-garbage/.

> Explicitly test for and ensure files that may be wanted are not
> deleted during a gc operation.  These include .pack without .idx
> (which may be in-flight), garbage in the directory, and .keep files
> the user created.

"garbage in the directory" is not well defined.  "files in the
directory that clearly are not related to packing" is probably what
you meant, but the definition of "related to packing" is still
fuzzy.  Please clarify.

The following is me thinking aloud about things that you would need
to think about while attempting to clarify this.

What should the code do if we find

    pack-b0a9d62a7471e58832a575a78d57f8fb26822125.frotz

in $GIT_OBJECT_DIRECTORY/pack/ directory?  Is it a "garbage in the
directory"?  The filename looks so similar to the usual things with
know suffixes .pack, .idx, .bitmap, and .keep, that we may want to
consider that it might be another file related to the packing left
by a future version of Git and if we do not see corresponding .pack
we would want to remove it?  Or do we want to do something else?

What should "gc" do if we find

    pack-frotz.idx

without corresponding ".pack"?  Wouldn't it be safer to consider it
a garbage unrelated to packing (because regular packing would have
given it with 40-hex name, not "frotz") and leave it undeleted?

Thanks.

> Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
> ---
>  t/t5304-prune.sh | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
> index 4fa6e7a..f7c380c 100755
> --- a/t/t5304-prune.sh
> +++ b/t/t5304-prune.sh
> @@ -285,6 +285,23 @@ EOF
>  	test_cmp expected actual
>  '
>  
> +test_expect_success 'ensure unknown garbage kept with gc' '
> +	test_when_finished "rm -f .git/objects/pack/fake*" &&
> +	test_when_finished "rm -f .git/objects/pack/foo*" &&
> +	: >.git/objects/pack/foo.keep &&
> +	: >.git/objects/pack/fake.pack &&
> +	: >.git/objects/pack/fake2.foo &&
> +	git gc &&
> +	git count-objects -v 2>stderr &&
> +	grep "^warning:" stderr | sort >actual &&
> +	cat >expected <<\EOF &&
> +warning: garbage found: .git/objects/pack/fake2.foo
> +warning: no corresponding .idx or .pack: .git/objects/pack/foo.keep
> +warning: no corresponding .idx: .git/objects/pack/fake.pack
> +EOF
> +	test_cmp expected actual
> +'
> +
>  test_expect_success 'prune .git/shallow' '
>  	SHA1=`echo hi|git commit-tree HEAD^{tree}` &&
>  	echo $SHA1 >.git/shallow &&

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/4] t5304: Ensure wanted files are not deleted
  2016-01-13 20:55                     ` Junio C Hamano
@ 2016-01-18 16:54                       ` Doug Kelly
  2016-01-19 18:36                         ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Doug Kelly @ 2016-01-18 16:54 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git List, Jeff King, Stefan Beller

On Wed, Jan 13, 2016 at 2:55 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Doug Kelly <dougk.ff7@gmail.com> writes:
>
>> Subject: Re: [PATCH 3/4] t5304: Ensure wanted files are not deleted
>
> I'd suggest s/wanted/non-garbage/.
>

I'm okay with this.

>> Explicitly test for and ensure files that may be wanted are not
>> deleted during a gc operation.  These include .pack without .idx
>> (which may be in-flight), garbage in the directory, and .keep files
>> the user created.
>
> "garbage in the directory" is not well defined.  "files in the
> directory that clearly are not related to packing" is probably what
> you meant, but the definition of "related to packing" is still
> fuzzy.  Please clarify.

This is probably a good point.  Perhaps a better way to think about it
would be by rewording the paragraph to something like this:

Explicitly test for and ensure files that may either be desired by the user
or are possibly not garbage are not deleted during a gc operation.
These include .pack files missing a corresponding .idx file (possibly due
to it being in-flight), .keep files created by the user, and other
unknown garbage in the pack directory.  These files will still be identified
by "git count-objects -v", but should not be removed automatically by
gc.  Only files we are absolutely sure are unnecessary will be removed
as a part of the gc process.

>
> The following is me thinking aloud about things that you would need
> to think about while attempting to clarify this.
>
> What should the code do if we find
>
>     pack-b0a9d62a7471e58832a575a78d57f8fb26822125.frotz
>
> in $GIT_OBJECT_DIRECTORY/pack/ directory?  Is it a "garbage in the
> directory"?  The filename looks so similar to the usual things with
> know suffixes .pack, .idx, .bitmap, and .keep, that we may want to
> consider that it might be another file related to the packing left
> by a future version of Git and if we do not see corresponding .pack
> we would want to remove it?  Or do we want to do something else?
>
> What should "gc" do if we find
>
>     pack-frotz.idx
>
> without corresponding ".pack"?  Wouldn't it be safer to consider it
> a garbage unrelated to packing (because regular packing would have
> given it with 40-hex name, not "frotz") and leave it undeleted?
>

I think the above paragraph helps explain what we're doing and why.
In your examples, a somewhat valid looking pack file with an unknown
extension may be flagged as "garbage," but should not be deleted
during the gc.  Similarly, we decided that an .idx file with no
corresponding .pack was safe to delete (since the pack is written before
idx, and the initial performance problem was related to scanning a large
number of idx files).

I'm not saying there's nothing to be said for the difference in the base
filename without extension.  Currently, the logic to remove pack garbage
doesn't look at that, though: it only considers the extension, and what
related files are found in the directory.  Whether this is good or bad, I'm
not sure.  It certainly does what I need at fairly low risk, though.

Does this help clarify the situation more?

> Thanks.
>
>> Signed-off-by: Doug Kelly <dougk.ff7@gmail.com>
>> ---
>>  t/t5304-prune.sh | 17 +++++++++++++++++
>>  1 file changed, 17 insertions(+)
>>
>> diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
>> index 4fa6e7a..f7c380c 100755
>> --- a/t/t5304-prune.sh
>> +++ b/t/t5304-prune.sh
>> @@ -285,6 +285,23 @@ EOF
>>       test_cmp expected actual
>>  '
>>
>> +test_expect_success 'ensure unknown garbage kept with gc' '
>> +     test_when_finished "rm -f .git/objects/pack/fake*" &&
>> +     test_when_finished "rm -f .git/objects/pack/foo*" &&
>> +     : >.git/objects/pack/foo.keep &&
>> +     : >.git/objects/pack/fake.pack &&
>> +     : >.git/objects/pack/fake2.foo &&
>> +     git gc &&
>> +     git count-objects -v 2>stderr &&
>> +     grep "^warning:" stderr | sort >actual &&
>> +     cat >expected <<\EOF &&
>> +warning: garbage found: .git/objects/pack/fake2.foo
>> +warning: no corresponding .idx or .pack: .git/objects/pack/foo.keep
>> +warning: no corresponding .idx: .git/objects/pack/fake.pack
>> +EOF
>> +     test_cmp expected actual
>> +'
>> +
>>  test_expect_success 'prune .git/shallow' '
>>       SHA1=`echo hi|git commit-tree HEAD^{tree}` &&
>>       echo $SHA1 >.git/shallow &&

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: [PATCH 3/4] t5304: Ensure wanted files are not deleted
  2016-01-18 16:54                       ` Doug Kelly
@ 2016-01-19 18:36                         ` Junio C Hamano
  0 siblings, 0 replies; 33+ messages in thread
From: Junio C Hamano @ 2016-01-19 18:36 UTC (permalink / raw)
  To: Doug Kelly; +Cc: Git List, Jeff King, Stefan Beller

Doug Kelly <dougk.ff7@gmail.com> writes:

> On Wed, Jan 13, 2016 at 2:55 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> Doug Kelly <dougk.ff7@gmail.com> writes:
>>
>>> Subject: Re: [PATCH 3/4] t5304: Ensure wanted files are not deleted
>>
>> I'd suggest s/wanted/non-garbage/.
>>
>
> I'm okay with this.
>
>>> Explicitly test for and ensure files that may be wanted are not
>>> deleted during a gc operation.  These include .pack without .idx
>>> (which may be in-flight), garbage in the directory, and .keep files
>>> the user created.
>>
>> "garbage in the directory" is not well defined.  "files in the
>> directory that clearly are not related to packing" is probably what
>> you meant, but the definition of "related to packing" is still
>> fuzzy.  Please clarify.
>
> This is probably a good point.  Perhaps a better way to think about it
> would be by rewording the paragraph to something like this:
>
> Explicitly test for and ensure files that may either be desired by the user
> or are possibly not garbage are not deleted during a gc operation.
> These include .pack files missing a corresponding .idx file (possibly due
> to it being in-flight), .keep files created by the user, and other
> unknown garbage in the pack directory.  These files will still be identified
> by "git count-objects -v", but should not be removed automatically by
> gc.  Only files we are absolutely sure are unnecessary will be removed
> as a part of the gc process.

That, especially "other _unknown_ garbage", looks much better than
the original.

> I'm not saying there's nothing to be said for the difference in the base
> filename without extension.  Currently, the logic to remove pack garbage
> doesn't look at that, though: it only considers the extension, and what
> related files are found in the directory.  Whether this is good or bad, I'm
> not sure.  It certainly does what I need at fairly low risk, though.
>
> Does this help clarify the situation more?

I was shooting for making you _think_ exactly about what you wrote
in the above paragraph, i.e. what the current logic does and if it
is sensible, is overly pessimistic for some files, and/or is risky
for some other files and if so in what way, as that would help you
record the thinking behind the different treatment for files based
on various file extentions clearly in the log message.

Thanks.

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2016-01-19 18:36 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-14  0:10 [PATCH 0/3] Add cleanup for garbage .bitmap files Doug Kelly
2015-11-14  0:10 ` [PATCH 1/3] prepare_packed_git(): find more garbage Doug Kelly
2015-11-14  0:43   ` Stefan Beller
2015-11-14  0:46     ` Doug Kelly
2015-11-14  0:46       ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
2015-11-14  0:46       ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
2015-12-15 23:23         ` Jeff King
2015-11-25 18:43       ` [PATCH 1/3] prepare_packed_git(): find more garbage Stefan Beller
2015-11-26  6:15         ` Doug Kelly
2015-12-15 23:09           ` Jeff King
2015-12-15 23:25             ` Jeff King
2015-12-19  0:06               ` [PATCH v3 0/3] " Doug Kelly
2015-12-19  0:06                 ` [PATCH 1/3] " Doug Kelly
2015-12-19  0:06                 ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
2015-12-19  0:06                 ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly
2015-12-19  2:01                   ` Jeff King
2015-12-19  2:02                 ` [PATCH v3 0/3] prepare_packed_git(): find more garbage Jeff King
2015-12-19  2:03                   ` Jeff King
2016-01-11 16:35                     ` Stefan Beller
2016-01-13 17:07                 ` [PATCH v4 0/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
2016-01-13 17:07                   ` [PATCH 1/4] prepare_packed_git(): find more garbage Doug Kelly
2016-01-13 17:07                   ` [PATCH 2/4] t5304: Add test for .bitmap garbage files Doug Kelly
2016-01-13 20:42                     ` Junio C Hamano
2016-01-13 17:07                   ` [PATCH 3/4] t5304: Ensure wanted files are not deleted Doug Kelly
2016-01-13 20:55                     ` Junio C Hamano
2016-01-18 16:54                       ` Doug Kelly
2016-01-19 18:36                         ` Junio C Hamano
2016-01-13 17:07                   ` [PATCH 4/4] gc: Clean garbage .bitmap files from pack dir Doug Kelly
2015-11-26  6:18         ` [PATCH 1/3] prepare_packed_git(): find more garbage Doug Kelly
2015-11-14  0:47     ` Doug Kelly
2015-11-14  0:10 ` [PATCH 2/3] t5304: Add test for .bitmap garbage files Doug Kelly
2015-11-14  0:47   ` Stefan Beller
2015-11-14  0:10 ` [PATCH 3/3] gc: Clean garbage .bitmap files from pack dir Doug Kelly

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.