* [PATCH 0/16] make prune mtime-checking more careful
@ 2014-10-03 20:20 Jeff King
  2014-10-03 20:21 ` [PATCH 01/16] foreach_alt_odb: propagate return value from callback Jeff King
                   ` (17 more replies)
  0 siblings, 18 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:20 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

At GitHub we've occasionally run across repos getting corrupted by trees
and blobs near the tip going missing. We do a lot of "test merges"
between branches and HEAD (this is what feeds the "OK to merge" button
on the web interface), and the objects are almost always related to
these merges. The objects are removed by prune, which doesn't realize
that they are part of an ongoing operation. Prune uses the filesystem
mtime to decide whether an object might still be in use, but we are not
very thorough in making sure that mtime is kept up to date.

This series tries to fix that with two techniques:

  1. When we try to write an object to disk, we optimize out the write
     if we already have the object. In that case, we should still
     "freshen" the existing object by updating its mtime (a rough
     sketch of the idea appears after this list).

  2. Treat objects reachable from "recent" objects as recent themselves.

     When we check that we have an object, we do not check whether we
     have all of the objects it can reach. If you have a new object
     that refers to an old one (e.g., on day 1 you create and then
     abandon a tree along with its blob, and on day 2 you create a new
     tree referring to that blob), then prune may delete the old object
     but not the new one (in this case, it deletes the blob but keeps
     the new tree).

     Any subsequent use of the new object will check that we have it
     (e.g., commit-tree makes sure we have the tree we feed it), but not
     the other objects it can reach. This can leave refs pointing at a
     half-formed part of the graph.
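
To make technique 1 concrete, the idea is roughly the sketch below. This
is not the code from patch 16; it leans on the existing sha1_file_name()
and utime() helpers and ignores error handling and packed objects:

  /*
   * Sketch only: if the loose object already exists, bump its mtime so
   * that prune considers it recent, instead of silently skipping the
   * write and leaving a stale timestamp in place.
   */
  static int freshen_loose_object(const unsigned char *sha1)
  {
  	const char *path = sha1_file_name(sha1);

  	if (access(path, F_OK))
  		return 0;	/* not loose; the caller still writes it */
  	utime(path, NULL);	/* NULL means "set mtime to now" */
  	return 1;		/* object exists and is now fresh */
  }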

Note that this does not make prune race-free. For example, you could
check for and update the mtime of an object just as prune is deleting
it, and believe it has been freshened when it has actually been removed.
Fixing that would require some atomic mechanism for prune to check the
mtime and delete in one step. But I do think this series cuts us down to
"real" race conditions, with millisecond-ish timing. The problems we're
fixing here are much worse than that. The time between an object being
written and being referenced may be on a human timescale (e.g., however
long it takes to write a commit message). And the gap between an old
object and a new one that refers to it may be days or weeks; a prune
where one falls inside the "recent" boundary and the other does not can
be disastrous.

There's quite a lot of patches here, but most of them are preparatory
cleanups. The meat is in patches 13, 15, and 16.

  [01/16]: foreach_alt_odb: propagate return value from callback
  [02/16]: isxdigit: cast input to unsigned char
  [03/16]: object_array: factor out slopbuf-freeing logic
  [04/16]: object_array: add a "clear" function
  [05/16]: clean up name allocation in prepare_revision_walk
  [06/16]: reachable: clear pending array after walking it
  [07/16]: t5304: use test_path_is_* instead of "test -f"
  [08/16]: t5304: use helper to report failure of "test foo = bar"
  [09/16]: prune: factor out loose-object directory traversal
  [10/16]: count-objects: do not use xsize_t when counting object size
  [11/16]: count-objects: use for_each_loose_file_in_objdir
  [12/16]: sha1_file: add for_each iterators for loose and packed objects
  [13/16]: prune: keep objects reachable from recent objects
  [14/16]: pack-objects: refactor unpack-unreachable expiration check
  [15/16]: pack-objects: match prune logic for discarding objects
  [16/16]: write_sha1_file: freshen existing objects

Note that these aren't yet running on GitHub servers. I know that they
fix real potential problems (see the new t6501 for examples), but I
don't know for sure if they will catch the problems we have seen. These
issues occur relatively rarely, so even once the series is deployed, we
won't know for sure until a few weeks or months have passed.

-Peff


* [PATCH 01/16] foreach_alt_odb: propagate return value from callback
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
@ 2014-10-03 20:21 ` Jeff King
  2014-10-03 22:55   ` René Scharfe
  2014-10-03 20:22 ` [PATCH 02/16] isxdigit: cast input to unsigned char Jeff King
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:21 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

We check the return value of the callback and stop iterating
if it is non-zero. However, we do not make the non-zero
return value available to the caller, who then has no way of
knowing whether the operation succeeded (technically callers can
keep their own error flag in the callback data, but that is
unlike our other for_each functions).

Signed-off-by: Jeff King <peff@peff.net>
---
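For illustration only (not part of this patch), a caller can now notice
a failing callback directly; the helper below is hypothetical:

  /* hypothetical callback: complain about an alternate whose dir is gone */
  static int check_alt(struct alternate_object_database *alt, void *data)
  {
  	struct stat st;

  	if (stat(alt->base, &st) || !S_ISDIR(st.st_mode))
  		return error("broken alternate: %s", alt->base);
  	return 0;
  }

  static void check_alternates(void)
  {
  	if (foreach_alt_odb(check_alt, NULL))
  		die("one or more alternate object databases are broken");
  }
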
 cache.h     |  2 +-
 sha1_file.c | 12 ++++++++----
 2 files changed, 9 insertions(+), 5 deletions(-)

diff --git a/cache.h b/cache.h
index 8206039..cd16e25 100644
--- a/cache.h
+++ b/cache.h
@@ -1143,7 +1143,7 @@ extern void prepare_alt_odb(void);
 extern void read_info_alternates(const char * relative_base, int depth);
 extern void add_to_alternates_file(const char *reference);
 typedef int alt_odb_fn(struct alternate_object_database *, void *);
-extern void foreach_alt_odb(alt_odb_fn, void*);
+extern int foreach_alt_odb(alt_odb_fn, void*);
 
 struct pack_window {
 	struct pack_window *next;
diff --git a/sha1_file.c b/sha1_file.c
index c08c0cb..bae1c15 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -412,14 +412,18 @@ void add_to_alternates_file(const char *reference)
 		link_alt_odb_entries(alt, strlen(alt), '\n', NULL, 0);
 }
 
-void foreach_alt_odb(alt_odb_fn fn, void *cb)
+int foreach_alt_odb(alt_odb_fn fn, void *cb)
 {
 	struct alternate_object_database *ent;
+	int r = 0;
 
 	prepare_alt_odb();
-	for (ent = alt_odb_list; ent; ent = ent->next)
-		if (fn(ent, cb))
-			return;
+	for (ent = alt_odb_list; ent; ent = ent->next) {
+		r = fn(ent, cb);
+		if (r)
+			break;
+	}
+	return r;
 }
 
 void prepare_alt_odb(void)
-- 
2.1.1.566.gdb1f904


* [PATCH 02/16] isxdigit: cast input to unsigned char
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
  2014-10-03 20:21 ` [PATCH 01/16] foreach_alt_odb: propagate return value from callback Jeff King
@ 2014-10-03 20:22 ` Jeff King
  2014-10-03 20:22 ` [PATCH 03/16] object_array: factor out slopbuf-freeing logic Jeff King
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:22 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

Otherwise, callers must do so or risk triggering
-Wchar-subscripts (and rightfully so; a signed char might
cause us to use a bogus negative index into the
hexval_table).

While we are dropping the now-unnecessary casts from the
caller in urlmatch.c, we can get rid of similar casts in
actually parsing the hex by using the hexval() helper, which
implicitly casts to unsigned (but note that we cannot
implement isxdigit in terms of hexval(), as it also casts
its return value to unsigned).

Signed-off-by: Jeff King <peff@peff.net>
---
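For illustration only (not part of the patch), the failure mode being
guarded against looks like this on platforms where plain char is signed
(stand-alone demo, not git code):

  #include <stdio.h>

  static const signed char table[256] = { 0 };	/* stand-in for hexval_table */

  int main(void)
  {
  	char c = (char)0xe9;	/* e.g. a byte from a latin-1 URL */

  	/* without the cast, this would be table[-23]: an out-of-bounds read */
  	printf("%d\n", table[(unsigned char)c]);
  	return 0;
  }
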
 git-compat-util.h | 2 +-
 urlmatch.c        | 8 ++++----
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/git-compat-util.h b/git-compat-util.h
index 0c4e663..3d5a8fb 100644
--- a/git-compat-util.h
+++ b/git-compat-util.h
@@ -677,7 +677,7 @@ extern const unsigned char sane_ctype[256];
 #define iscntrl(x) (sane_istest(x,GIT_CNTRL))
 #define ispunct(x) sane_istest(x, GIT_PUNCT | GIT_REGEX_SPECIAL | \
 		GIT_GLOB_SPECIAL | GIT_PATHSPEC_MAGIC)
-#define isxdigit(x) (hexval_table[x] != -1)
+#define isxdigit(x) (hexval_table[(unsigned char)(x)] != -1)
 #define tolower(x) sane_case((unsigned char)(x), 0x20)
 #define toupper(x) sane_case((unsigned char)(x), 0)
 #define is_pathspec_magic(x) sane_istest(x,GIT_PATHSPEC_MAGIC)
diff --git a/urlmatch.c b/urlmatch.c
index 3d4c54b..618d216 100644
--- a/urlmatch.c
+++ b/urlmatch.c
@@ -43,11 +43,11 @@ static int append_normalized_escapes(struct strbuf *buf,
 		from_len--;
 		if (ch == '%') {
 			if (from_len < 2 ||
-			    !isxdigit((unsigned char)from[0]) ||
-			    !isxdigit((unsigned char)from[1]))
+			    !isxdigit(from[0]) ||
+			    !isxdigit(from[1]))
 				return 0;
-			ch = hexval_table[(unsigned char)*from++] << 4;
-			ch |= hexval_table[(unsigned char)*from++];
+			ch = hexval(*from++) << 4;
+			ch |= hexval(*from++);
 			from_len -= 2;
 			was_esc = 1;
 		}
-- 
2.1.1.566.gdb1f904


* [PATCH 03/16] object_array: factor out slopbuf-freeing logic
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
  2014-10-03 20:21 ` [PATCH 01/16] foreach_alt_odb: propagate return value from callback Jeff King
  2014-10-03 20:22 ` [PATCH 02/16] isxdigit: cast input to unsigned char Jeff King
@ 2014-10-03 20:22 ` Jeff King
  2014-10-07 11:25   ` Michael Haggerty
  2014-10-03 20:22 ` [PATCH 04/16] object_array: add a "clear" function Jeff King
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:22 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

This is not a lot of code, but it's a logical construct that
should not need to be repeated (and we are about to add a
third repetition).

Signed-off-by: Jeff King <peff@peff.net>
---
 object.c | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/object.c b/object.c
index ca9d790..14238dc 100644
--- a/object.c
+++ b/object.c
@@ -355,6 +355,12 @@ void add_object_array_with_context(struct object *obj, const char *name, struct
 		add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context);
 }
 
+static void object_array_release_entry(struct object_array_entry *ent)
+{
+	if (ent->name != object_array_slopbuf)
+		free(ent->name);
+}
+
 void object_array_filter(struct object_array *array,
 			 object_array_each_func_t want, void *cb_data)
 {
@@ -367,8 +373,7 @@ void object_array_filter(struct object_array *array,
 				objects[dst] = objects[src];
 			dst++;
 		} else {
-			if (objects[src].name != object_array_slopbuf)
-				free(objects[src].name);
+			object_array_release_entry(&objects[src]);
 		}
 	}
 	array->nr = dst;
@@ -400,8 +405,7 @@ void object_array_remove_duplicates(struct object_array *array)
 				objects[array->nr] = objects[src];
 			array->nr++;
 		} else {
-			if (objects[src].name != object_array_slopbuf)
-				free(objects[src].name);
+			object_array_release_entry(&objects[src]);
 		}
 	}
 }
-- 
2.1.1.566.gdb1f904


* [PATCH 04/16] object_array: add a "clear" function
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (2 preceding siblings ...)
  2014-10-03 20:22 ` [PATCH 03/16] object_array: factor out slopbuf-freeing logic Jeff King
@ 2014-10-03 20:22 ` Jeff King
  2014-10-03 20:23 ` [PATCH 05/16] clean up name allocation in prepare_revision_walk Jeff King
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:22 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

There's currently no easy way to free the memory associated
with an object_array (and in most cases, we simply leak the
memory in a rev_info's pending array). Let's provide a
helper to make this easier to handle.

We can make use of it in list-objects.c, which does the same
thing by hand (but fails to free the "name" field of each
entry, potentially leaking memory).

Signed-off-by: Jeff King <peff@peff.net>
---
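For illustration only, typical usage of the new helper might look like
this (the surrounding function is hypothetical):

  /* hypothetical caller: collect objects temporarily, then release them */
  static void use_and_release(struct object *obj)
  {
  	struct object_array scratch = OBJECT_ARRAY_INIT;

  	add_object_array(obj, "example", &scratch);
  	/* ... work with scratch.objects[0 .. scratch.nr - 1] ... */
  	object_array_clear(&scratch);	/* frees entries and the array itself */
  }
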
 list-objects.c |  7 +------
 object.c       | 10 ++++++++++
 object.h       |  6 ++++++
 3 files changed, 17 insertions(+), 6 deletions(-)

diff --git a/list-objects.c b/list-objects.c
index 3595ee7..fad6808 100644
--- a/list-objects.c
+++ b/list-objects.c
@@ -228,11 +228,6 @@ void traverse_commit_list(struct rev_info *revs,
 		die("unknown pending object %s (%s)",
 		    sha1_to_hex(obj->sha1), name);
 	}
-	if (revs->pending.nr) {
-		free(revs->pending.objects);
-		revs->pending.nr = 0;
-		revs->pending.alloc = 0;
-		revs->pending.objects = NULL;
-	}
+	object_array_clear(&revs->pending);
 	strbuf_release(&base);
 }
diff --git a/object.c b/object.c
index 14238dc..e36e33d 100644
--- a/object.c
+++ b/object.c
@@ -379,6 +379,16 @@ void object_array_filter(struct object_array *array,
 	array->nr = dst;
 }
 
+void object_array_clear(struct object_array *array)
+{
+	int i;
+	for (i = 0; i < array->nr; i++)
+		object_array_release_entry(&array->objects[i]);
+	free(array->objects);
+	array->objects = NULL;
+	array->nr = array->alloc = 0;
+}
+
 /*
  * Return true iff array already contains an entry with name.
  */
diff --git a/object.h b/object.h
index e028ced..2a755a2 100644
--- a/object.h
+++ b/object.h
@@ -133,6 +133,12 @@ void object_array_filter(struct object_array *array,
  */
 void object_array_remove_duplicates(struct object_array *array);
 
+/*
+ * Remove any objects from the array, freeing all used memory; afterwards
+ * the array is ready to store more objects with add_object_array().
+ */
+void object_array_clear(struct object_array *array);
+
 void clear_object_flags(unsigned flags);
 
 #endif /* OBJECT_H */
-- 
2.1.1.566.gdb1f904


* [PATCH 05/16] clean up name allocation in prepare_revision_walk
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (3 preceding siblings ...)
  2014-10-03 20:22 ` [PATCH 04/16] object_array: add a "clear" function Jeff King
@ 2014-10-03 20:23 ` Jeff King
  2014-10-03 20:24 ` [PATCH 06/16] reachable: clear pending array after walking it Jeff King
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:23 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

When we enter prepare_revision_walk, we have zero or more
entries in our "pending" array. We disconnect that array
from the rev_info, and then process each entry:

  1. If the entry is a commit and the --source option is in
     effect, we keep a pointer to the object name.

  2. Otherwise, we re-add the item to the pending list with
     a blank name.

We then throw away the old array by freeing the array
itself, but do not touch the "name" field of each entry. For
any items of type (2), we leak the memory associated with
the name. This commit fixes that by calling object_array_clear,
which handles the cleanup for us.

That breaks (1), though, because it depends on the memory
pointed to by the name lasting forever. We can solve that by
making a copy of the name. This is slightly less efficient,
but it shouldn't matter in practice, as we do it only for
the tip commits of the traversal.

Signed-off-by: Jeff King <peff@peff.net>
---
 revision.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/revision.c b/revision.c
index e498b7c..01cc276 100644
--- a/revision.c
+++ b/revision.c
@@ -300,7 +300,7 @@ static struct commit *handle_commit(struct rev_info *revs,
 			revs->limited = 1;
 		}
 		if (revs->show_source && !commit->util)
-			commit->util = (void *) name;
+			commit->util = xstrdup(name);
 		return commit;
 	}
 
@@ -2656,15 +2656,16 @@ void reset_revision_walk(void)
 
 int prepare_revision_walk(struct rev_info *revs)
 {
-	int nr = revs->pending.nr;
-	struct object_array_entry *e, *list;
+	int i;
+	struct object_array old_pending;
 	struct commit_list **next = &revs->commits;
 
-	e = list = revs->pending.objects;
+	memcpy(&old_pending, &revs->pending, sizeof(old_pending));
 	revs->pending.nr = 0;
 	revs->pending.alloc = 0;
 	revs->pending.objects = NULL;
-	while (--nr >= 0) {
+	for (i = 0; i < old_pending.nr; i++) {
+		struct object_array_entry *e = old_pending.objects + i;
 		struct commit *commit = handle_commit(revs, e->item, e->name);
 		if (commit) {
 			if (!(commit->object.flags & SEEN)) {
@@ -2672,10 +2673,9 @@ int prepare_revision_walk(struct rev_info *revs)
 				next = commit_list_append(commit, next);
 			}
 		}
-		e++;
 	}
 	if (!revs->leak_pending)
-		free(list);
+		object_array_clear(&old_pending);
 
 	/* Signal whether we need per-parent treesame decoration */
 	if (revs->simplify_merges ||
-- 
2.1.1.566.gdb1f904


* [PATCH 06/16] reachable: clear pending array after walking it
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (4 preceding siblings ...)
  2014-10-03 20:23 ` [PATCH 05/16] clean up name allocation in prepare_revision_walk Jeff King
@ 2014-10-03 20:24 ` Jeff King
  2014-10-03 20:25 ` [PATCH 07/16] t5304: use test_path_is_* instead of "test -f" Jeff King
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:24 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

We add a number of objects to our "pending" array, and then
process it with a combination of get_revision and walking
the pending array ourselves (to catch any non-commits). The
commits in the pending array are cleaned up automatically by
prepare_revision_walk, but we essentially leak any other
objects (they are technically still reachable from rev_info,
but no callers ever look at them or bother to clean them
up).

This is not a huge deal in practice, as the number of
non-commits tends to be small. However, a future patch will
broaden this considerably. Let's call object_array_clear to
free the memory.

Signed-off-by: Jeff King <peff@peff.net>
---
 reachable.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/reachable.c b/reachable.c
index 6f6835b..d99bd31 100644
--- a/reachable.c
+++ b/reachable.c
@@ -131,6 +131,8 @@ static void walk_commit_list(struct rev_info *revs,
 		}
 		die("unknown pending object %s (%s)", sha1_to_hex(obj->sha1), name);
 	}
+
+	object_array_clear(&revs->pending);
 }
 
 static int add_one_reflog_ent(unsigned char *osha1, unsigned char *nsha1,
-- 
2.1.1.566.gdb1f904


* [PATCH 07/16] t5304: use test_path_is_* instead of "test -f"
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (5 preceding siblings ...)
  2014-10-03 20:24 ` [PATCH 06/16] reachable: clear pending array after walking it Jeff King
@ 2014-10-03 20:25 ` Jeff King
  2014-10-03 22:12   ` Junio C Hamano
  2014-10-03 20:27 ` [PATCH 08/16] t5304: use helper to report failure of "test foo = bar" Jeff King
                   ` (10 subsequent siblings)
  17 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:25 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

This is slightly more robust (checking "! test -f" would not
notice a directory of the same name, though that is not
likely to happen here). It also makes debugging easier, as
the test script will output a message on failure.

Signed-off-by: Jeff King <peff@peff.net>
---
This patch is totally optional. I did it while debugging t5304 (strange
how badly prune works when you accidentally invert the mtime check!)
and figured it might be worth keeping as a cleanup.

 t/t5304-prune.sh | 46 +++++++++++++++++++++++-----------------------
 1 file changed, 23 insertions(+), 23 deletions(-)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index 01c6a3f..b0ffb05 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -14,7 +14,7 @@ add_blob() {
 	BLOB=$(echo aleph_0 | git hash-object -w --stdin) &&
 	BLOB_FILE=.git/objects/$(echo $BLOB | sed "s/^../&\//") &&
 	test $((1 + $before)) = $(git count-objects | sed "s/ .*//") &&
-	test -f $BLOB_FILE &&
+	test_path_is_file $BLOB_FILE &&
 	test-chmtime =+0 $BLOB_FILE
 }
 
@@ -35,9 +35,9 @@ test_expect_success 'prune stale packs' '
 	: > .git/objects/tmp_2.pack &&
 	test-chmtime =-86501 .git/objects/tmp_1.pack &&
 	git prune --expire 1.day &&
-	test -f $orig_pack &&
-	test -f .git/objects/tmp_2.pack &&
-	! test -f .git/objects/tmp_1.pack
+	test_path_is_file $orig_pack &&
+	test_path_is_file .git/objects/tmp_2.pack &&
+	test_path_is_missing .git/objects/tmp_1.pack
 
 '
 
@@ -46,11 +46,11 @@ test_expect_success 'prune --expire' '
 	add_blob &&
 	git prune --expire=1.hour.ago &&
 	test $((1 + $before)) = $(git count-objects | sed "s/ .*//") &&
-	test -f $BLOB_FILE &&
+	test_path_is_file $BLOB_FILE &&
 	test-chmtime =-86500 $BLOB_FILE &&
 	git prune --expire 1.day &&
 	test $before = $(git count-objects | sed "s/ .*//") &&
-	! test -f $BLOB_FILE
+	test_path_is_missing $BLOB_FILE
 
 '
 
@@ -60,11 +60,11 @@ test_expect_success 'gc: implicit prune --expire' '
 	test-chmtime =-$((2*$week-30)) $BLOB_FILE &&
 	git gc &&
 	test $((1 + $before)) = $(git count-objects | sed "s/ .*//") &&
-	test -f $BLOB_FILE &&
+	test_path_is_file $BLOB_FILE &&
 	test-chmtime =-$((2*$week+1)) $BLOB_FILE &&
 	git gc &&
 	test $before = $(git count-objects | sed "s/ .*//") &&
-	! test -f $BLOB_FILE
+	test_path_is_missing $BLOB_FILE
 
 '
 
@@ -110,7 +110,7 @@ test_expect_success 'prune: do not prune detached HEAD with no reflog' '
 	git commit --allow-empty -m "detached commit" &&
 	# verify that there is no reflogs
 	# (should be removed and disabled by previous test)
-	test ! -e .git/logs &&
+	test_path_is_missing .git/logs &&
 	git prune -n >prune_actual &&
 	: >prune_expected &&
 	test_cmp prune_actual prune_expected
@@ -145,7 +145,7 @@ test_expect_success 'gc --no-prune' '
 	git config gc.pruneExpire 2.days.ago &&
 	git gc --no-prune &&
 	test 1 = $(git count-objects | sed "s/ .*//") &&
-	test -f $BLOB_FILE
+	test_path_is_file $BLOB_FILE
 
 '
 
@@ -153,10 +153,10 @@ test_expect_success 'gc respects gc.pruneExpire' '
 
 	git config gc.pruneExpire 5002.days.ago &&
 	git gc &&
-	test -f $BLOB_FILE &&
+	test_path_is_file $BLOB_FILE &&
 	git config gc.pruneExpire 5000.days.ago &&
 	git gc &&
-	test ! -f $BLOB_FILE
+	test_path_is_missing $BLOB_FILE
 
 '
 
@@ -165,9 +165,9 @@ test_expect_success 'gc --prune=<date>' '
 	add_blob &&
 	test-chmtime =-$((5001*$day)) $BLOB_FILE &&
 	git gc --prune=5002.days.ago &&
-	test -f $BLOB_FILE &&
+	test_path_is_file $BLOB_FILE &&
 	git gc --prune=5000.days.ago &&
-	test ! -f $BLOB_FILE
+	test_path_is_missing $BLOB_FILE
 
 '
 
@@ -175,9 +175,9 @@ test_expect_success 'gc --prune=never' '
 
 	add_blob &&
 	git gc --prune=never &&
-	test -f $BLOB_FILE &&
+	test_path_is_file $BLOB_FILE &&
 	git gc --prune=now &&
-	test ! -f $BLOB_FILE
+	test_path_is_missing $BLOB_FILE
 
 '
 
@@ -186,10 +186,10 @@ test_expect_success 'gc respects gc.pruneExpire=never' '
 	git config gc.pruneExpire never &&
 	add_blob &&
 	git gc &&
-	test -f $BLOB_FILE &&
+	test_path_is_file $BLOB_FILE &&
 	git config gc.pruneExpire now &&
 	git gc &&
-	test ! -f $BLOB_FILE
+	test_path_is_missing $BLOB_FILE
 
 '
 
@@ -197,9 +197,9 @@ test_expect_success 'prune --expire=never' '
 
 	add_blob &&
 	git prune --expire=never &&
-	test -f $BLOB_FILE &&
+	test_path_is_file $BLOB_FILE &&
 	git prune &&
-	test ! -f $BLOB_FILE
+	test_path_is_missing $BLOB_FILE
 
 '
 
@@ -210,10 +210,10 @@ test_expect_success 'gc: prune old objects after local clone' '
 	(
 		cd aclone &&
 		test 1 = $(git count-objects | sed "s/ .*//") &&
-		test -f $BLOB_FILE &&
+		test_path_is_file $BLOB_FILE &&
 		git gc --prune &&
 		test 0 = $(git count-objects | sed "s/ .*//") &&
-		! test -f $BLOB_FILE
+		test_path_is_missing $BLOB_FILE
 	)
 '
 
@@ -250,7 +250,7 @@ test_expect_success 'prune .git/shallow' '
 	grep $SHA1 .git/shallow &&
 	grep $SHA1 out &&
 	git prune &&
-	! test -f .git/shallow
+	test_path_is_missing .git/shallow
 '
 
 test_done
-- 
2.1.1.566.gdb1f904


* [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (6 preceding siblings ...)
  2014-10-03 20:25 ` [PATCH 07/16] t5304: use test_path_is_* instead of "test -f" Jeff King
@ 2014-10-03 20:27 ` Jeff King
  2014-10-03 22:17   ` Junio C Hamano
  2014-10-07 13:21   ` Michael Haggerty
  2014-10-03 20:29 ` [PATCH 09/16] prune: factor out loose-object directory traversal Jeff King
                   ` (9 subsequent siblings)
  17 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:27 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

For small outputs, we sometimes use:

  test "$(some_cmd)" = "something we expect"

instead of a full test_cmp. The downside of this is that
when it fails, there is no output at all from the script.
Let's introduce a small helper to make tests easier to
debug.

Signed-off-by: Jeff King <peff@peff.net>
---
This is in the same boat as the last commit; we can drop it without
hurting the rest of the series.

Is test_eq too cutesy or obfuscated? I have often wanted it when
debugging other tests, too. Our usual technique is to do:

  echo whatever >expect &&
  do_something >actual &&
  test_cmp expect actual

That's a bit verbose. We could hide it behind something like test_eq,
too, but it introduces several extra new processes. And I know people on
some fork-challenged platforms are very sensitive to the number of
spawned processes in the test suite.

 t/t5304-prune.sh        | 16 ++++++++--------
 t/test-lib-functions.sh | 11 +++++++++++
 2 files changed, 19 insertions(+), 8 deletions(-)

diff --git a/t/t5304-prune.sh b/t/t5304-prune.sh
index b0ffb05..502860e 100755
--- a/t/t5304-prune.sh
+++ b/t/t5304-prune.sh
@@ -13,7 +13,7 @@ add_blob() {
 	before=$(git count-objects | sed "s/ .*//") &&
 	BLOB=$(echo aleph_0 | git hash-object -w --stdin) &&
 	BLOB_FILE=.git/objects/$(echo $BLOB | sed "s/^../&\//") &&
-	test $((1 + $before)) = $(git count-objects | sed "s/ .*//") &&
+	test_eq $((1 + $before)) $(git count-objects | sed "s/ .*//") &&
 	test_path_is_file $BLOB_FILE &&
 	test-chmtime =+0 $BLOB_FILE
 }
@@ -45,11 +45,11 @@ test_expect_success 'prune --expire' '
 
 	add_blob &&
 	git prune --expire=1.hour.ago &&
-	test $((1 + $before)) = $(git count-objects | sed "s/ .*//") &&
+	test_eq $((1 + $before)) $(git count-objects | sed "s/ .*//") &&
 	test_path_is_file $BLOB_FILE &&
 	test-chmtime =-86500 $BLOB_FILE &&
 	git prune --expire 1.day &&
-	test $before = $(git count-objects | sed "s/ .*//") &&
+	test_eq $before $(git count-objects | sed "s/ .*//") &&
 	test_path_is_missing $BLOB_FILE
 
 '
@@ -59,11 +59,11 @@ test_expect_success 'gc: implicit prune --expire' '
 	add_blob &&
 	test-chmtime =-$((2*$week-30)) $BLOB_FILE &&
 	git gc &&
-	test $((1 + $before)) = $(git count-objects | sed "s/ .*//") &&
+	test_eq $((1 + $before)) $(git count-objects | sed "s/ .*//") &&
 	test_path_is_file $BLOB_FILE &&
 	test-chmtime =-$((2*$week+1)) $BLOB_FILE &&
 	git gc &&
-	test $before = $(git count-objects | sed "s/ .*//") &&
+	test_eq $before $(git count-objects | sed "s/ .*//") &&
 	test_path_is_missing $BLOB_FILE
 
 '
@@ -144,7 +144,7 @@ test_expect_success 'gc --no-prune' '
 	test-chmtime =-$((5001*$day)) $BLOB_FILE &&
 	git config gc.pruneExpire 2.days.ago &&
 	git gc --no-prune &&
-	test 1 = $(git count-objects | sed "s/ .*//") &&
+	test_eq 1 $(git count-objects | sed "s/ .*//") &&
 	test_path_is_file $BLOB_FILE
 
 '
@@ -209,10 +209,10 @@ test_expect_success 'gc: prune old objects after local clone' '
 	git clone --no-hardlinks . aclone &&
 	(
 		cd aclone &&
-		test 1 = $(git count-objects | sed "s/ .*//") &&
+		test_eq 1 $(git count-objects | sed "s/ .*//") &&
 		test_path_is_file $BLOB_FILE &&
 		git gc --prune &&
-		test 0 = $(git count-objects | sed "s/ .*//") &&
+		test_eq 0 $(git count-objects | sed "s/ .*//") &&
 		test_path_is_missing $BLOB_FILE
 	)
 '
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index dafd6ad..0a17614 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -634,6 +634,17 @@ test_cmp_bin() {
 	cmp "$@"
 }
 
+# This is the same as 'test "$1" $3 "$2"' except that it
+# will output a useful message to stderr on failure. If
+# $3 is omitted, defaults to "=".
+test_eq () {
+	if ! test "$1" "${3:-=}" "$2"
+	then
+		echo >&2 "test_eq failed: $1 ${3:-=} $2"
+		false
+	fi
+}
+
 # Check if the file expected to be empty is indeed empty, and barfs
 # otherwise.
 
-- 
2.1.1.566.gdb1f904


* [PATCH 09/16] prune: factor out loose-object directory traversal
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (7 preceding siblings ...)
  2014-10-03 20:27 ` [PATCH 08/16] t5304: use helper to report failure of "test foo = bar" Jeff King
@ 2014-10-03 20:29 ` Jeff King
  2014-10-03 22:19   ` Junio C Hamano
  2014-10-07 14:07   ` Michael Haggerty
  2014-10-03 20:31 ` [PATCH 10/16] count-objects: do not use xsize_t when counting object size Jeff King
                   ` (8 subsequent siblings)
  17 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:29 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

Prune has to walk $GIT_DIR/objects/?? in order to find the
set of loose objects to prune. Other parts of the code
(e.g., count-objects) want to do the same. Let's factor it
out into a reusable for_each-style function.

Note that this is not quite a straight code movement. There
are two differences:

  1. The original code iterated from 0 to 256, trying to
     opendir("$GIT_DIR/%02x"). The new code just does a
     readdir() on the object directory, and descends into
     any matching directories. This is faster on
     already-pruned repositories, and should not ever be
     slower (nobody ever creates other files in the object
     directory).

  2. The original code had strange behavior when it found a
     file of the form "[0-9a-f]{2}/.{38}" that did _not_
     contain all hex digits. It executed a "break" from the
     loop, meaning that we stopped pruning in that directory
     (but still pruned other directories!). This was
     probably a bug; we do not want to process the file as
     an object, but we should keep going otherwise.

Signed-off-by: Jeff King <peff@peff.net>
---
I admit the speedup in (1) almost certainly doesn't matter. It is real,
and I found out about it while writing a different program that was
basically "count-objects" across a large number of repositories. However
for a single repo it's probably not big enough to matter (calling
count-objects in a loop will get dominated by the startup costs). The
end result is a little more obvious IMHO, but that's subjective.
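
For reference, a minimal user of the new iterator might look like this
(hypothetical example, not part of the patch):

  /* hypothetical: print the sha1 and path of every loose object */
  static int show_loose(const unsigned char *sha1, const char *path, void *data)
  {
  	printf("%s %s\n", sha1_to_hex(sha1), path);
  	return 0;
  }

  static void show_all_loose(void)
  {
  	for_each_loose_file_in_objdir(get_object_directory(),
  				      show_loose, NULL, NULL, NULL);
  }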

 builtin/prune.c | 87 ++++++++++++++++------------------------------------
 cache.h         | 31 +++++++++++++++++++
 sha1_file.c     | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 152 insertions(+), 61 deletions(-)

diff --git a/builtin/prune.c b/builtin/prune.c
index 144a3bd..8286680 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -31,11 +31,23 @@ static int prune_tmp_file(const char *fullpath)
 	return 0;
 }
 
-static int prune_object(const char *fullpath, const unsigned char *sha1)
+static int prune_object(const unsigned char *sha1, const char *fullpath,
+			void *data)
 {
 	struct stat st;
-	if (lstat(fullpath, &st))
-		return error("Could not stat '%s'", fullpath);
+
+	/*
+	 * Do we know about this object?
+	 * It must have been reachable
+	 */
+	if (lookup_object(sha1))
+		return 0;
+
+	if (lstat(fullpath, &st)) {
+		/* report errors, but do not stop pruning */
+		error("Could not stat '%s'", fullpath);
+		return 0;
+	}
 	if (st.st_mtime > expire)
 		return 0;
 	if (show_only || verbose) {
@@ -48,68 +60,20 @@ static int prune_object(const char *fullpath, const unsigned char *sha1)
 	return 0;
 }
 
-static int prune_dir(int i, struct strbuf *path)
+static int prune_cruft(const char *basename, const char *path, void *data)
 {
-	size_t baselen = path->len;
-	DIR *dir = opendir(path->buf);
-	struct dirent *de;
-
-	if (!dir)
-		return 0;
-
-	while ((de = readdir(dir)) != NULL) {
-		char name[100];
-		unsigned char sha1[20];
-
-		if (is_dot_or_dotdot(de->d_name))
-			continue;
-		if (strlen(de->d_name) == 38) {
-			sprintf(name, "%02x", i);
-			memcpy(name+2, de->d_name, 39);
-			if (get_sha1_hex(name, sha1) < 0)
-				break;
-
-			/*
-			 * Do we know about this object?
-			 * It must have been reachable
-			 */
-			if (lookup_object(sha1))
-				continue;
-
-			strbuf_addf(path, "/%s", de->d_name);
-			prune_object(path->buf, sha1);
-			strbuf_setlen(path, baselen);
-			continue;
-		}
-		if (starts_with(de->d_name, "tmp_obj_")) {
-			strbuf_addf(path, "/%s", de->d_name);
-			prune_tmp_file(path->buf);
-			strbuf_setlen(path, baselen);
-			continue;
-		}
-		fprintf(stderr, "bad sha1 file: %s/%s\n", path->buf, de->d_name);
-	}
-	closedir(dir);
-	if (!show_only)
-		rmdir(path->buf);
+	if (starts_with(basename, "tmp_obj_"))
+		prune_tmp_file(path);
+	else
+		fprintf(stderr, "bad sha1 file: %s\n", path);
 	return 0;
 }
 
-static void prune_object_dir(const char *path)
+static int prune_subdir(const char *basename, const char *path, void *data)
 {
-	struct strbuf buf = STRBUF_INIT;
-	size_t baselen;
-	int i;
-
-	strbuf_addstr(&buf, path);
-	strbuf_addch(&buf, '/');
-	baselen = buf.len;
-
-	for (i = 0; i < 256; i++) {
-		strbuf_addf(&buf, "%02x", i);
-		prune_dir(i, &buf);
-		strbuf_setlen(&buf, baselen);
-	}
+	if (!show_only)
+		rmdir(path);
+	return 0;
 }
 
 /*
@@ -173,7 +137,8 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 
 	mark_reachable_objects(&revs, 1, progress);
 	stop_progress(&progress);
-	prune_object_dir(get_object_directory());
+	for_each_loose_file_in_objdir(get_object_directory(), prune_object,
+				      prune_cruft, prune_subdir, NULL);
 
 	prune_packed_objects(show_only ? PRUNE_PACKED_DRY_RUN : 0);
 	remove_temporary_files(get_object_directory());
diff --git a/cache.h b/cache.h
index cd16e25..7abe7f6 100644
--- a/cache.h
+++ b/cache.h
@@ -1239,6 +1239,37 @@ extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsig
 extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
 extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
 
+/*
+ * Iterate over the files in the loose-object parts of the object
+ * directory "path", triggering the following callbacks:
+ *
+ *  - loose_object is called for each loose object we find.
+ *
+ *  - loose_cruft is called for any files that do not appear to be
+ *    loose objects.
+ *
+ *  - loose_subdir is called for each top-level hashed subdirectory
+ *    of the object directory (e.g., "$OBJDIR/f0"). It is called
+ *    after the objects in the directory are processed.
+ *
+ * Any callback that is NULL will be ignored. Callbacks returning non-zero
+ * will end the iteration.
+ */
+typedef int each_loose_object_fn(const unsigned char *sha1,
+				 const char *path,
+				 void *data);
+typedef int each_loose_cruft_fn(const char *basename,
+				const char *path,
+				void *data);
+typedef int each_loose_subdir_fn(const char *basename,
+				 const char *path,
+				 void *data);
+int for_each_loose_file_in_objdir(const char *path,
+				  each_loose_object_fn obj_cb,
+				  each_loose_cruft_fn cruft_cb,
+				  each_loose_subdir_fn subdir_cb,
+				  void *data);
+
 struct object_info {
 	/* Request */
 	enum object_type *typep;
diff --git a/sha1_file.c b/sha1_file.c
index bae1c15..9fdad47 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3218,3 +3218,98 @@ void assert_sha1_type(const unsigned char *sha1, enum object_type expect)
 		die("%s is not a valid '%s' object", sha1_to_hex(sha1),
 		    typename(expect));
 }
+
+static int opendir_error(const char *path)
+{
+	if (errno == ENOENT)
+		return 0;
+	return error("unable to open %s: %s", path, strerror(errno));
+}
+
+static int for_each_file_in_obj_subdir(struct strbuf *path,
+				       const char *prefix,
+				       each_loose_object_fn obj_cb,
+				       each_loose_cruft_fn cruft_cb,
+				       each_loose_subdir_fn subdir_cb,
+				       void *data)
+{
+	size_t baselen = path->len;
+	DIR *dir = opendir(path->buf);
+	struct dirent *de;
+	int r = 0;
+
+	if (!dir)
+		return opendir_error(path->buf);
+
+	while ((de = readdir(dir))) {
+		if (is_dot_or_dotdot(de->d_name))
+			continue;
+
+		strbuf_setlen(path, baselen);
+		strbuf_addf(path, "/%s", de->d_name);
+
+		if (strlen(de->d_name) == 38)  {
+			char hex[41];
+			unsigned char sha1[20];
+
+			memcpy(hex, prefix, 2);
+			memcpy(hex + 2, de->d_name, 38);
+			hex[40] = 0;
+			if (!get_sha1_hex(hex, sha1)) {
+				if (obj_cb) {
+					r = obj_cb(sha1, path->buf, data);
+					if (r)
+						break;
+				}
+				continue;
+			}
+		}
+
+		if (cruft_cb) {
+			r = cruft_cb(de->d_name, path->buf, data);
+			if (r)
+				break;
+		}
+	}
+	strbuf_setlen(path, baselen);
+	if (!r && subdir_cb)
+		r = subdir_cb(prefix, path->buf, data);
+	closedir(dir);
+	return r;
+}
+
+int for_each_loose_file_in_objdir(const char *path,
+			    each_loose_object_fn obj_cb,
+			    each_loose_cruft_fn cruft_cb,
+			    each_loose_subdir_fn subdir_cb,
+			    void *data)
+{
+	struct strbuf buf = STRBUF_INIT;
+	size_t baselen;
+	DIR *dir = opendir(path);
+	struct dirent *de;
+	int r = 0;
+
+	if (!dir)
+		return opendir_error(path);
+
+	strbuf_addstr(&buf, path);
+	baselen = buf.len;
+
+	while ((de = readdir(dir))) {
+		if (!isxdigit(de->d_name[0]) ||
+		    !isxdigit(de->d_name[1]) ||
+		    de->d_name[2])
+			continue;
+
+		strbuf_addf(&buf, "/%s", de->d_name);
+		r = for_each_file_in_obj_subdir(&buf, de->d_name, obj_cb,
+						cruft_cb, subdir_cb, data);
+		strbuf_setlen(&buf, baselen);
+		if (r)
+			break;
+	}
+
+	closedir(dir);
+	strbuf_release(&buf);
+	return r;
+}
-- 
2.1.1.566.gdb1f904


* [PATCH 10/16] count-objects: do not use xsize_t when counting object size
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (8 preceding siblings ...)
  2014-10-03 20:29 ` [PATCH 09/16] prune: factor out loose-object directory traversal Jeff King
@ 2014-10-03 20:31 ` Jeff King
  2014-10-03 20:31 ` [PATCH 11/16] count-objects: use for_each_loose_file_in_objdir Jeff King
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:31 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

The point of xsize_t is to safely cast an off_t into a size_t
(because we are about to mmap). But in count-objects, we are
summing the sizes in an off_t. Using xsize_t means that
count-objects could fail on a 32-bit system with a 4G
object (not likely, as other parts of git would fail, but
we should at least be correct here).

Signed-off-by: Jeff King <peff@peff.net>
---
I think the on_disk_bytes is a little weird here, too. We count actual
disk-usage blocks for loose objects here, which makes sense. But we do
_not_ do so for packfiles, or for "garbage" files. Which seems kind of
inconsistent.

I kind of doubt anybody cares too much either way, though.
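
For context, xsize_t is (roughly) the helper below from git-compat-util.h;
it exists to guard an off_t-to-size_t cast before an mmap, which is why
routing a simple running sum through it can only hurt:

  /* approximately what xsize_t does; see git-compat-util.h for the real one */
  static inline size_t xsize_t(off_t len)
  {
  	if (len > (size_t) len)
  		die("Cannot handle files this big");
  	return (size_t) len;
  }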

 builtin/count-objects.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index a7f70cb..316a805 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -53,7 +53,7 @@ static void count_objects(DIR *d, char *path, int len, int verbose,
 			if (lstat(path, &st) || !S_ISREG(st.st_mode))
 				bad = 1;
 			else
-				(*loose_size) += xsize_t(on_disk_bytes(st));
+				(*loose_size) += on_disk_bytes(st);
 		}
 		if (bad) {
 			if (verbose) {
-- 
2.1.1.566.gdb1f904


* [PATCH 11/16] count-objects: use for_each_loose_file_in_objdir
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (9 preceding siblings ...)
  2014-10-03 20:31 ` [PATCH 10/16] count-objects: do not use xsize_t when counting object size Jeff King
@ 2014-10-03 20:31 ` Jeff King
  2014-10-03 20:32 ` [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects Jeff King
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:31 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

This drops our line count considerably, and should make
things more readable by keeping the counting logic separate
from the traversal.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/count-objects.c | 101 ++++++++++++++----------------------------------
 1 file changed, 30 insertions(+), 71 deletions(-)

diff --git a/builtin/count-objects.c b/builtin/count-objects.c
index 316a805..e47ef0b 100644
--- a/builtin/count-objects.c
+++ b/builtin/count-objects.c
@@ -11,6 +11,9 @@
 
 static unsigned long garbage;
 static off_t size_garbage;
+static int verbose;
+static unsigned long loose, packed, packed_loose;
+static off_t loose_size;
 
 static void real_report_garbage(const char *desc, const char *path)
 {
@@ -21,61 +24,31 @@ static void real_report_garbage(const char *desc, const char *path)
 	garbage++;
 }
 
-static void count_objects(DIR *d, char *path, int len, int verbose,
-			  unsigned long *loose,
-			  off_t *loose_size,
-			  unsigned long *packed_loose)
+static void loose_garbage(const char *path)
 {
-	struct dirent *ent;
-	while ((ent = readdir(d)) != NULL) {
-		char hex[41];
-		unsigned char sha1[20];
-		const char *cp;
-		int bad = 0;
+	if (verbose)
+		report_garbage("garbage found", path);
+}
 
-		if (is_dot_or_dotdot(ent->d_name))
-			continue;
-		for (cp = ent->d_name; *cp; cp++) {
-			int ch = *cp;
-			if (('0' <= ch && ch <= '9') ||
-			    ('a' <= ch && ch <= 'f'))
-				continue;
-			bad = 1;
-			break;
-		}
-		if (cp - ent->d_name != 38)
-			bad = 1;
-		else {
-			struct stat st;
-			memcpy(path + len + 3, ent->d_name, 38);
-			path[len + 2] = '/';
-			path[len + 41] = 0;
-			if (lstat(path, &st) || !S_ISREG(st.st_mode))
-				bad = 1;
-			else
-				(*loose_size) += on_disk_bytes(st);
-		}
-		if (bad) {
-			if (verbose) {
-				struct strbuf sb = STRBUF_INIT;
-				strbuf_addf(&sb, "%.*s/%s",
-					    len + 2, path, ent->d_name);
-				report_garbage("garbage found", sb.buf);
-				strbuf_release(&sb);
-			}
-			continue;
-		}
-		(*loose)++;
-		if (!verbose)
-			continue;
-		memcpy(hex, path+len, 2);
-		memcpy(hex+2, ent->d_name, 38);
-		hex[40] = 0;
-		if (get_sha1_hex(hex, sha1))
-			die("internal error");
-		if (has_sha1_pack(sha1))
-			(*packed_loose)++;
+static int count_loose(const unsigned char *sha1, const char *path, void *data)
+{
+	struct stat st;
+
+	if (lstat(path, &st) || !S_ISREG(st.st_mode))
+		loose_garbage(path);
+	else {
+		loose_size += on_disk_bytes(st);
+		loose++;
+		if (verbose && has_sha1_pack(sha1))
+			packed_loose++;
 	}
+	return 0;
+}
+
+static int count_cruft(const char *basename, const char *path, void *data)
+{
+	loose_garbage(path);
+	return 0;
 }
 
 static char const * const count_objects_usage[] = {
@@ -85,12 +58,7 @@ static char const * const count_objects_usage[] = {
 
 int cmd_count_objects(int argc, const char **argv, const char *prefix)
 {
-	int i, verbose = 0, human_readable = 0;
-	const char *objdir = get_object_directory();
-	int len = strlen(objdir);
-	char *path = xmalloc(len + 50);
-	unsigned long loose = 0, packed = 0, packed_loose = 0;
-	off_t loose_size = 0;
+	int human_readable = 0;
 	struct option opts[] = {
 		OPT__VERBOSE(&verbose, N_("be verbose")),
 		OPT_BOOL('H', "human-readable", &human_readable,
@@ -104,19 +72,10 @@ int cmd_count_objects(int argc, const char **argv, const char *prefix)
 		usage_with_options(count_objects_usage, opts);
 	if (verbose)
 		report_garbage = real_report_garbage;
-	memcpy(path, objdir, len);
-	if (len && objdir[len-1] != '/')
-		path[len++] = '/';
-	for (i = 0; i < 256; i++) {
-		DIR *d;
-		sprintf(path + len, "%02x", i);
-		d = opendir(path);
-		if (!d)
-			continue;
-		count_objects(d, path, len, verbose,
-			      &loose, &loose_size, &packed_loose);
-		closedir(d);
-	}
+
+	for_each_loose_file_in_objdir(get_object_directory(),
+				      count_loose, count_cruft, NULL, NULL);
+
 	if (verbose) {
 		struct packed_git *p;
 		unsigned long num_pack = 0;
-- 
2.1.1.566.gdb1f904


* [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (10 preceding siblings ...)
  2014-10-03 20:31 ` [PATCH 11/16] count-objects: use for_each_loose_file_in_objdir Jeff King
@ 2014-10-03 20:32 ` Jeff King
  2014-10-05  8:15   ` René Scharfe
  2014-10-03 20:39 ` [PATCH 13/16] prune: keep objects reachable from recent objects Jeff King
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:32 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

We typically iterate over the reachable objects in a
repository by starting at the tips and walking the graph.
There's no easy way to iterate over all of the objects,
including unreachable ones. Let's provide a way of doing so.

Signed-off-by: Jeff King <peff@peff.net>
---
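For illustration only (not part of the patch), a caller that wants every
object regardless of reachability could do something like the following;
the counting functions are hypothetical:

  /* hypothetical: count all objects, loose and packed, reachable or not */
  static int count_loose_cb(const unsigned char *sha1, const char *path,
  			  void *data)
  {
  	(*(unsigned long *)data)++;
  	return 0;
  }

  static int count_packed_cb(const unsigned char *sha1, struct packed_git *pack,
  			   uint32_t pos, void *data)
  {
  	(*(unsigned long *)data)++;
  	return 0;
  }

  static unsigned long count_all_objects(void)
  {
  	unsigned long n = 0;

  	for_each_loose_object(count_loose_cb, &n);
  	for_each_packed_object(count_packed_cb, &n);
  	return n;
  }
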
 cache.h     | 11 +++++++++++
 sha1_file.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+)

diff --git a/cache.h b/cache.h
index 7abe7f6..3826b4b 100644
--- a/cache.h
+++ b/cache.h
@@ -1270,6 +1270,17 @@ int for_each_loose_file_in_objdir(const char *path,
 				  each_loose_subdir_fn subdir_cb,
 				  void *data);
 
+/*
+ * Iterate over loose and packed objects in both the local
+ * repository and any alternates repositories.
+ */
+typedef int each_packed_object_fn(const unsigned char *sha1,
+				  struct packed_git *pack,
+				  uint32_t pos,
+				  void *data);
+extern int for_each_loose_object(each_loose_object_fn, void *);
+extern int for_each_packed_object(each_packed_object_fn, void *);
+
 struct object_info {
 	/* Request */
 	enum object_type *typep;
diff --git a/sha1_file.c b/sha1_file.c
index 9fdad47..d017289 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -3313,3 +3313,65 @@ int for_each_loose_file_in_objdir(const char *path,
 	strbuf_release(&buf);
 	return r;
 }
+
+struct loose_alt_odb_data {
+	each_loose_object_fn *cb;
+	void *data;
+};
+
+static int loose_from_alt_odb(struct alternate_object_database *alt,
+			      void *vdata)
+{
+	struct loose_alt_odb_data *data = vdata;
+	return for_each_loose_file_in_objdir(alt->base,
+					     data->cb, NULL, NULL,
+					     data->data);
+}
+
+int for_each_loose_object(each_loose_object_fn cb, void *data)
+{
+	struct loose_alt_odb_data alt;
+	int r;
+
+	r = for_each_loose_file_in_objdir(get_object_directory(),
+					  cb, NULL, NULL, data);
+	if (r)
+		return r;
+
+	alt.cb = cb;
+	alt.data = data;
+	return foreach_alt_odb(loose_from_alt_odb, &alt);
+}
+
+int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data)
+{
+	uint32_t i;
+	int r = 0;
+
+	for (i = 0; i < p->num_objects; i++) {
+		const unsigned char *sha1 = nth_packed_object_sha1(p, i);
+
+		if (!sha1)
+			return error("unable to get sha1 of object %u in %s",
+				     i, p->pack_name);
+
+		r = cb(sha1, p, i, data);
+		if (r)
+			break;
+	}
+	return r;
+}
+
+int for_each_packed_object(each_packed_object_fn cb, void *data)
+{
+	struct packed_git *p;
+	int r = 0;
+
+	prepare_packed_git();
+	for (p = packed_git; p; p = p->next) {
+		r = for_each_object_in_pack(p, cb, data);
+		if (r)
+			break;
+	}
+	return r;
+}
-- 
2.1.1.566.gdb1f904


* [PATCH 13/16] prune: keep objects reachable from recent objects
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (11 preceding siblings ...)
  2014-10-03 20:32 ` [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects Jeff King
@ 2014-10-03 20:39 ` Jeff King
  2014-10-03 21:47   ` Junio C Hamano
  2014-10-07 16:29   ` Michael Haggerty
  2014-10-03 20:39 ` [PATCH 14/16] pack-objects: refactor unpack-unreachable expiration check Jeff King
                   ` (4 subsequent siblings)
  17 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:39 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

Our current strategy with prune is that an object falls into
one of three categories:

  1. Reachable (from ref tips, reflogs, index, etc).

  2. Not reachable, but recent (based on the --expire time
     and the file's mtime).

  3. Not reachable and not recent.

We keep objects from (1) and (2), but prune objects in (3).
The point of (2) is that these objects may be part of an
in-progress operation that has not yet updated any refs.

However, it is not always the case that objects for an
in-progress operation will have a recent mtime. For example,
the object database may have an old copy of a blob (from an
abandoned operation, a branch that was deleted, etc). If we
create a new tree that points to it, a simultaneous prune
will leave our tree, but delete the blob. Referencing that
tree with a commit will then work (we check that the tree is
in the object database, but not that all of the objects it
refers to are), as will mentioning the commit in a ref. But
the resulting repo is corrupt; we are missing the blob
reachable from a ref.

One way to solve this is to be more thorough when
referencing a sha1: make sure that not only do we have that
sha1, but that we have the objects it refers to, and so
forth recursively. The problem is that this is very
expensive.  Creating a parent link would require traversing
the entire object graph down to the roots.

Instead, this patch pushes the extra work onto prune, which
runs less frequently (and has to look at the whole object
graph anyway). It creates a new category of objects: objects
which are not recent, but which are reachable from a recent
object. We do not prune these objects, just like the
reachable and recent ones.

This lets us avoid the recursive check above, because if we
have an object, even if it is unreachable, we should have
its referent:

  - if we are creating new objects, then we cannot create
    the parent object without having the child

  - and if we are pruning objects, we will not prune the child
    if we are keeping the parent

The big exception would be if one were to write the object
in a way that avoided referential integrity (e.g., using
hash-object). But if you are in the habit of doing that, you
deserve what you get.

Naively, the simplest way to implement this would be to add
all recent objects as tips to the reachability traversal.
However, this does not perform well. In a recently-packed
repository, all reachable objects will also be recent, and
therefore we have to consider each object twice (both as a
tip, and when we reach it in the traversal). I tested this,
and it added about 10s to a 30s prune on linux.git. This
patch instead performs the normal reachability traversal
first, then follows up with a second traversal for recent
objects, skipping any that have already been marked.

Signed-off-by: Jeff King <peff@peff.net>
---
I put the mark-recent code into mark_reachable_objects here,
but it does not technically have to be there. It reuses the
same rev_info object (which is convenient), but the SEEN
flags from the first traversal are marked on the global
commit objects themselves. So we could break it out into a
separate function.

However, we'd have to refactor the progress reporting; the
numbers are kept internally to mark_reachable, and we would
want to continue them for the second traversal (though I
suppose you could start a second progress meter with
"Checking recent objects" or something if you wanted).

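Put another way, the per-object decision prune makes after this patch is
roughly the sketch below; the real logic is split between the reachability
walk and prune_object(), and "expire" is prune's --expire cutoff:

  /* sketch only, not literal code from the diff */
  static int should_prune(const unsigned char *sha1, unsigned long mtime)
  {
  	if (lookup_object(sha1))
  		return 0;	/* reachable, or reachable from a recent object */
  	if (mtime > expire)
  		return 0;	/* unreachable but recent */
  	return 1;		/* old, unreachable, nothing recent needs it */
  }
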
 builtin/prune.c            |   2 +-
 builtin/reflog.c           |   2 +-
 reachable.c                | 111 +++++++++++++++++++++++++++++++++++++++++++++
 reachable.h                |   3 +-
 t/t6501-freshen-objects.sh |  88 +++++++++++++++++++++++++++++++++++
 5 files changed, 203 insertions(+), 3 deletions(-)
 create mode 100755 t/t6501-freshen-objects.sh

diff --git a/builtin/prune.c b/builtin/prune.c
index 8286680..a965574 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -135,7 +135,7 @@ int cmd_prune(int argc, const char **argv, const char *prefix)
 	if (show_progress)
 		progress = start_progress_delay(_("Checking connectivity"), 0, 0, 2);
 
-	mark_reachable_objects(&revs, 1, progress);
+	mark_reachable_objects(&revs, 1, expire, progress);
 	stop_progress(&progress);
 	for_each_loose_file_in_objdir(get_object_directory(), prune_object,
 				      prune_cruft, prune_subdir, NULL);
diff --git a/builtin/reflog.c b/builtin/reflog.c
index e8a8fb1..80bddc2 100644
--- a/builtin/reflog.c
+++ b/builtin/reflog.c
@@ -649,7 +649,7 @@ static int cmd_reflog_expire(int argc, const char **argv, const char *prefix)
 		init_revisions(&cb.revs, prefix);
 		if (cb.verbose)
 			printf("Marking reachable objects...");
-		mark_reachable_objects(&cb.revs, 0, NULL);
+		mark_reachable_objects(&cb.revs, 0, 0, NULL);
 		if (cb.verbose)
 			putchar('\n');
 	}
diff --git a/reachable.c b/reachable.c
index d99bd31..f443265 100644
--- a/reachable.c
+++ b/reachable.c
@@ -212,7 +212,109 @@ static void add_cache_refs(struct rev_info *revs)
 		add_cache_tree(active_cache_tree, revs);
 }
 
+struct recent_data {
+	struct rev_info *revs;
+	unsigned long timestamp;
+};
+
+static void add_recent_object(const unsigned char *sha1,
+			      unsigned long mtime,
+			      struct recent_data *data)
+{
+	struct object *obj;
+	enum object_type type;
+
+	if (mtime <= data->timestamp)
+		return;
+
+	/*
+	 * We do not want to call parse_object here, because
+	 * inflating blobs and trees could be very expensive.
+	 * However, we do need to know the correct type for
+	 * later processing, and the revision machinery expects
+	 * commits and tags to have been parsed.
+	 */
+	type = sha1_object_info(sha1, NULL);
+	if (type < 0)
+		die("unable to get object info for %s", sha1_to_hex(sha1));
+
+	switch (type) {
+	case OBJ_TAG:
+	case OBJ_COMMIT:
+		obj = parse_object_or_die(sha1, NULL);
+		break;
+	case OBJ_TREE:
+		obj = (struct object *)lookup_tree(sha1);
+		break;
+	case OBJ_BLOB:
+		obj = (struct object *)lookup_blob(sha1);
+		break;
+	default:
+		die("unknown object type for %s: %s",
+		    sha1_to_hex(sha1), typename(type));
+	}
+
+	if (!obj)
+		die("unable to lookup %s", sha1_to_hex(sha1));
+
+	add_pending_object(data->revs, obj, "");
+}
+
+static int add_recent_loose(const unsigned char *sha1,
+			    const char *path, void *data)
+{
+	struct stat st;
+	struct object *obj = lookup_object(sha1);
+
+	if (obj && obj->flags & SEEN)
+		return 0;
+
+	if (stat(path, &st) < 0) {
+		/*
+		 * It's OK if an object went away during our iteration; this
+		 * could be due to a simultaneous repack. But anything else
+		 * we should abort, since we might then fail to mark objects
+		 * which should not be pruned.
+		 */
+		if (errno == ENOENT)
+			return 0;
+		return error("unable to stat %s: %s",
+			     sha1_to_hex(sha1), strerror(errno));
+	}
+
+	add_recent_object(sha1, st.st_mtime, data);
+	return 0;
+}
+
+static int add_recent_packed(const unsigned char *sha1,
+			     struct packed_git *p, uint32_t pos,
+			     void *data)
+{
+	struct object *obj = lookup_object(sha1);
+
+	if (obj && obj->flags & SEEN)
+		return 0;
+	add_recent_object(sha1, p->mtime, data);
+	return 0;
+}
+
+static int add_unseen_recent_objects_to_traversal(struct rev_info *revs,
+						  unsigned long timestamp)
+{
+	struct recent_data data;
+	int r;
+
+	data.revs = revs;
+	data.timestamp = timestamp;
+
+	r = for_each_loose_object(add_recent_loose, &data);
+	if (r)
+		return r;
+	return for_each_packed_object(add_recent_packed, &data);
+}
+
 void mark_reachable_objects(struct rev_info *revs, int mark_reflog,
+			    unsigned long mark_recent,
 			    struct progress *progress)
 {
 	struct connectivity_progress cp;
@@ -248,5 +350,14 @@ void mark_reachable_objects(struct rev_info *revs, int mark_reflog,
 	if (prepare_revision_walk(revs))
 		die("revision walk setup failed");
 	walk_commit_list(revs, &cp);
+
+	if (mark_recent) {
+		if (add_unseen_recent_objects_to_traversal(revs, mark_recent))
+			die("unable to mark recent objects");
+		if (prepare_revision_walk(revs))
+			die("revision walk setup failed");
+		walk_commit_list(revs, &cp);
+	}
+
 	display_progress(cp.progress, cp.count);
 }
diff --git a/reachable.h b/reachable.h
index 5d082ad..141fe30 100644
--- a/reachable.h
+++ b/reachable.h
@@ -2,6 +2,7 @@
 #define REACHEABLE_H
 
 struct progress;
-extern void mark_reachable_objects(struct rev_info *revs, int mark_reflog, struct progress *);
+extern void mark_reachable_objects(struct rev_info *revs, int mark_reflog,
+				   unsigned long mark_recent, struct progress *);
 
 #endif
diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
new file mode 100755
index 0000000..de941c2
--- /dev/null
+++ b/t/t6501-freshen-objects.sh
@@ -0,0 +1,88 @@
+#!/bin/sh
+#
+# This test covers the handling of objects which might have old
+# mtimes in the filesystem (because they were used previously)
+# and are just now becoming referenced again.
+#
+# We're going to do two things that are a little bit "fake" to
+# help make our simulation easier:
+#
+#   1. We'll turn off reflogs. You can still run into
+#      problems with reflogs on, but your objects
+#      don't get pruned until both the reflog expiration
+#      has passed on their references, _and_ they are out
+#      of prune's expiration period. Dropping reflogs
+#      means we only have to deal with one variable in our tests,
+#      but the results generalize.
+#
+#   2. We'll use a temporary index file to create our
+#      works-in-progress. Most workflows would mention
+#      referenced objects in the index, which prune takes
+#      into account. However, many operations don't. For
+#      example, a partial commit with "git commit foo"
+#      will use a temporary index. Or they may not need
+#      an index at all (e.g., creating a new commit
+#      to refer to an existing tree).
+
+test_description='check pruning of dependent objects'
+. ./test-lib.sh
+
+# We care about reachability, so we do not want to use
+# the normal test_commit, which creates extra tags.
+add () {
+	echo "$1" >"$1" &&
+	git add "$1"
+}
+commit () {
+	test_tick &&
+	add "$1" &&
+	git commit -m "$1"
+}
+
+test_expect_success 'disable reflogs' '
+	git config core.logallrefupdates false &&
+	rm -rf .git/logs
+'
+
+test_expect_success 'setup basic history' '
+	commit base
+'
+
+test_expect_success 'create and abandon some objects' '
+	git checkout -b experiment &&
+	commit abandon &&
+	git checkout master &&
+	git branch -D experiment
+'
+
+test_expect_success 'simulate time passing' '
+	find .git/objects -type f |
+	xargs test-chmtime -v -86400
+'
+
+test_expect_success 'start writing new commit with old blob' '
+	tree=$(
+		GIT_INDEX_FILE=index.tmp &&
+		export GIT_INDEX_FILE &&
+		git read-tree HEAD &&
+		add unrelated &&
+		add abandon &&
+		git write-tree
+	)
+'
+
+test_expect_success 'simultaneous gc' '
+	git gc --prune=12.hours.ago
+'
+
+test_expect_success 'finish writing out commit' '
+	commit=$(echo foo | git commit-tree -p HEAD $tree) &&
+	git update-ref HEAD $commit
+'
+
+# "abandon" blob should have been rescued by reference from new tree
+test_expect_success 'repository passes fsck' '
+	git fsck
+'
+
+test_done
-- 
2.1.1.566.gdb1f904

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 14/16] pack-objects: refactor unpack-unreachable expiration check
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (12 preceding siblings ...)
  2014-10-03 20:39 ` [PATCH 13/16] prune: keep objects reachable from recent objects Jeff King
@ 2014-10-03 20:39 ` Jeff King
  2014-10-03 20:40 ` [PATCH 15/16] pack-objects: match prune logic for discarding objects Jeff King
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:39 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

When we are loosening unreachable packed objects, we do not
bother to process objects that would simply be pruned
immediately anyway. The "would be pruned" check is a simple
comparison, but is about to get more complicated. Let's pull
it out into a separate function.

Note that this is slightly less efficient than the original,
which avoided even opening old packs, since no object in
them could pass the current check, which cares only about
the pack mtime.  But the new rules will depend on the exact
object, so we need to perform the check even for old packs.

Note also that we fix a minor buglet when the pack mtime is
exactly the same as the expiration time. The prune code
considers that worth pruning, whereas our check here
considered it worth keeping. This wasn't a big deal. Besides
being unlikely to happen, the result was simply that the
object was loosened and then pruned, missing the
optimization. Still, we can easily fix it while we are here.

Signed-off-by: Jeff King <peff@peff.net>
---
 builtin/pack-objects.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index d391934..2fe2ab0 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -2407,6 +2407,16 @@ static int has_sha1_pack_kept_or_nonlocal(const unsigned char *sha1)
 	return 0;
 }
 
+static int loosened_object_can_be_discarded(const unsigned char *sha1,
+					    unsigned long mtime)
+{
+	if (!unpack_unreachable_expiration)
+		return 0;
+	if (mtime > unpack_unreachable_expiration)
+		return 0;
+	return 1;
+}
+
 static void loosen_unused_packed_objects(struct rev_info *revs)
 {
 	struct packed_git *p;
@@ -2417,17 +2427,14 @@ static void loosen_unused_packed_objects(struct rev_info *revs)
 		if (!p->pack_local || p->pack_keep)
 			continue;
 
-		if (unpack_unreachable_expiration &&
-		    p->mtime < unpack_unreachable_expiration)
-			continue;
-
 		if (open_pack_index(p))
 			die("cannot open pack index");
 
 		for (i = 0; i < p->num_objects; i++) {
 			sha1 = nth_packed_object_sha1(p, i);
 			if (!packlist_find(&to_pack, sha1, NULL) &&
-				!has_sha1_pack_kept_or_nonlocal(sha1))
+			    !has_sha1_pack_kept_or_nonlocal(sha1) &&
+			    !loosened_object_can_be_discarded(sha1, p->mtime))
 				if (force_object_loose(sha1, p->mtime))
 					die("unable to force loose object");
 		}
-- 
2.1.1.566.gdb1f904

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 15/16] pack-objects: match prune logic for discarding objects
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (13 preceding siblings ...)
  2014-10-03 20:39 ` [PATCH 14/16] pack-objects: refactor unpack-unreachable expiration check Jeff King
@ 2014-10-03 20:40 ` Jeff King
  2014-10-03 20:41 ` [PATCH 16/16] write_sha1_file: freshen existing objects Jeff King
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:40 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

A recent commit taught git-prune to keep non-recent objects
that are reachable from recent ones. However, pack-objects,
when loosening unreachable objects, tries to optimize out
the write in the case that the object will be immediately
pruned. It now gets this wrong, since its rule does not
reflect the new prune code (and this can be seen by running
t6501 with a strategically placed repack).

Let's teach pack-objects similar logic.

Signed-off-by: Jeff King <peff@peff.net>
---
The test changes look big because of the indentation. View with "-w" for
a sane diff.

 builtin/pack-objects.c     | 38 +++++++++++++++++++
 reachable.c                |  4 +-
 reachable.h                |  2 +
 t/t6501-freshen-objects.sh | 93 +++++++++++++++++++++++++++-------------------
 4 files changed, 97 insertions(+), 40 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 2fe2ab0..1db5ea5 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -20,6 +20,8 @@
 #include "streaming.h"
 #include "thread-utils.h"
 #include "pack-bitmap.h"
+#include "reachable.h"
+#include "sha1-array.h"
 
 static const char *pack_usage[] = {
 	N_("git pack-objects --stdout [options...] [< ref-list | < object-list]"),
@@ -2407,6 +2409,15 @@ static int has_sha1_pack_kept_or_nonlocal(const unsigned char *sha1)
 	return 0;
 }
 
+/*
+ * Store a list of sha1s that should not be discarded
+ * because they are either written too recently, or are
+ * reachable from another object that was.
+ *
+ * This is filled by get_object_list.
+ */
+static struct sha1_array recent_objects;
+
 static int loosened_object_can_be_discarded(const unsigned char *sha1,
 					    unsigned long mtime)
 {
@@ -2414,6 +2425,8 @@ static int loosened_object_can_be_discarded(const unsigned char *sha1,
 		return 0;
 	if (mtime > unpack_unreachable_expiration)
 		return 0;
+	if (sha1_array_lookup(&recent_objects, sha1) >= 0)
+		return 0;
 	return 1;
 }
 
@@ -2470,6 +2483,19 @@ static int get_object_list_from_bitmap(struct rev_info *revs)
 	return 0;
 }
 
+static void record_recent_object(struct object *obj,
+				 const struct name_path *path,
+				 const char *last,
+				 void *data)
+{
+	sha1_array_append(&recent_objects, obj->sha1);
+}
+
+static void record_recent_commit(struct commit *commit, void *data)
+{
+	sha1_array_append(&recent_objects, commit->object.sha1);
+}
+
 static void get_object_list(int ac, const char **av)
 {
 	struct rev_info revs;
@@ -2517,10 +2543,22 @@ static void get_object_list(int ac, const char **av)
 	mark_edges_uninteresting(&revs, show_edge);
 	traverse_commit_list(&revs, show_commit, show_object, NULL);
 
+	if (unpack_unreachable_expiration) {
+		if (add_unseen_recent_objects_to_traversal(&revs,
+				unpack_unreachable_expiration))
+			die("unable to add recent objects");
+		if (prepare_revision_walk(&revs))
+			die("revision walk setup failed");
+		traverse_commit_list(&revs, record_recent_commit,
+				     record_recent_object, NULL);
+	}
+
 	if (keep_unreachable)
 		add_objects_in_unpacked_packs(&revs);
 	if (unpack_unreachable)
 		loosen_unused_packed_objects(&revs);
+
+	sha1_array_clear(&recent_objects);
 }
 
 static int option_parse_index_version(const struct option *opt,
diff --git a/reachable.c b/reachable.c
index f443265..5e35f21 100644
--- a/reachable.c
+++ b/reachable.c
@@ -298,8 +298,8 @@ static int add_recent_packed(const unsigned char *sha1,
 	return 0;
 }
 
-static int add_unseen_recent_objects_to_traversal(struct rev_info *revs,
-						  unsigned long timestamp)
+int add_unseen_recent_objects_to_traversal(struct rev_info *revs,
+					   unsigned long timestamp)
 {
 	struct recent_data data;
 	int r;
diff --git a/reachable.h b/reachable.h
index 141fe30..d23efc3 100644
--- a/reachable.h
+++ b/reachable.h
@@ -2,6 +2,8 @@
 #define REACHEABLE_H
 
 struct progress;
+extern int add_unseen_recent_objects_to_traversal(struct rev_info *revs,
+						  unsigned long timestamp);
 extern void mark_reachable_objects(struct rev_info *revs, int mark_reflog,
 				   unsigned long mark_recent, struct progress *);
 
diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
index de941c2..e25c47d 100755
--- a/t/t6501-freshen-objects.sh
+++ b/t/t6501-freshen-objects.sh
@@ -39,50 +39,67 @@ commit () {
 	git commit -m "$1"
 }
 
-test_expect_success 'disable reflogs' '
-	git config core.logallrefupdates false &&
-	rm -rf .git/logs
-'
+maybe_repack () {
+	if test -n "$repack"; then
+		git repack -ad
+	fi
+}
+
+for repack in '' true; do
+	title=${repack:+repack}
+	title=${title:-loose}
+
+	test_expect_success "make repo completely empty ($title)" '
+		rm -rf .git &&
+		git init
+	'
+
+	test_expect_success "disable reflogs ($title)" '
+		git config core.logallrefupdates false &&
+		rm -rf .git/logs
+	'
 
-test_expect_success 'setup basic history' '
-	commit base
-'
+	test_expect_success "setup basic history ($title)" '
+		commit base
+	'
 
-test_expect_success 'create and abandon some objects' '
-	git checkout -b experiment &&
-	commit abandon &&
-	git checkout master &&
-	git branch -D experiment
-'
+	test_expect_success "create and abandon some objects ($title)" '
+		git checkout -b experiment &&
+		commit abandon &&
+		maybe_repack &&
+		git checkout master &&
+		git branch -D experiment
+	'
 
-test_expect_success 'simulate time passing' '
-	find .git/objects -type f |
-	xargs test-chmtime -v -86400
-'
+	test_expect_success "simulate time passing ($title)" '
+		find .git/objects -type f |
+		xargs test-chmtime -v -86400
+	'
 
-test_expect_success 'start writing new commit with old blob' '
-	tree=$(
-		GIT_INDEX_FILE=index.tmp &&
-		export GIT_INDEX_FILE &&
-		git read-tree HEAD &&
-		add unrelated &&
-		add abandon &&
-		git write-tree
-	)
-'
+	test_expect_success "start writing new commit with old blob ($title)" '
+		tree=$(
+			GIT_INDEX_FILE=index.tmp &&
+			export GIT_INDEX_FILE &&
+			git read-tree HEAD &&
+			add unrelated &&
+			add abandon &&
+			git write-tree
+		)
+	'
 
-test_expect_success 'simultaneous gc' '
-	git gc --prune=12.hours.ago
-'
+	test_expect_success "simultaneous gc ($title)" '
+		git gc --prune=12.hours.ago
+	'
 
-test_expect_success 'finish writing out commit' '
-	commit=$(echo foo | git commit-tree -p HEAD $tree) &&
-	git update-ref HEAD $commit
-'
+	test_expect_success "finish writing out commit ($title)" '
+		commit=$(echo foo | git commit-tree -p HEAD $tree) &&
+		git update-ref HEAD $commit
+	'
 
-# "abandon" blob should have been rescued by reference from new tree
-test_expect_success 'repository passes fsck' '
-	git fsck
-'
+	# "abandon" blob should have been rescued by reference from new tree
+	test_expect_success "repository passes fsck ($title)" '
+		git fsck
+	'
+done
 
 test_done
-- 
2.1.1.566.gdb1f904

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* [PATCH 16/16] write_sha1_file: freshen existing objects
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (14 preceding siblings ...)
  2014-10-03 20:40 ` [PATCH 15/16] pack-objects: match prune logic for discarding objects Jeff King
@ 2014-10-03 20:41 ` Jeff King
  2014-10-03 21:29   ` Junio C Hamano
  2014-10-05  9:12   ` René Scharfe
  2014-10-03 22:20 ` [PATCH 0/16] make prune mtime-checking more careful Junio C Hamano
  2014-10-04 22:22 ` Junio C Hamano
  17 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2014-10-03 20:41 UTC (permalink / raw)
  To: git; +Cc: Michael Haggerty

When we try to write a loose object file, we first check
whether that object already exists. If so, we skip the
write as an optimization. However, this can interfere with
prune's strategy of using mtimes to mark files in progress.

For example, if a branch contains a particular tree object
and is deleted, that tree object may become unreachable, and
have an old mtime. If a new operation then tries to write
the same tree, this ends up as a noop; we notice we
already have the object and do nothing. A prune running
simultaneously with this operation will see the object as
old, and may delete it.

We can solve this by "freshening" objects that we avoid
writing by updating their mtime. The algorithm for doing so
is essentially the same as that of has_sha1_file. Therefore
we provide a new (static) interface "check_and_freshen",
which finds and optionally freshens the object. It's trivial
to implement freshening and simple checking by tweaking a
single parameter.

Signed-off-by: Jeff King <peff@peff.net>
---
 sha1_file.c                | 51 +++++++++++++++++++++++++++++++++++++++-------
 t/t6501-freshen-objects.sh | 27 ++++++++++++++++++++++++
 2 files changed, 71 insertions(+), 7 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index d017289..d263b38 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -442,27 +442,53 @@ void prepare_alt_odb(void)
 	read_info_alternates(get_object_directory(), 0);
 }
 
-static int has_loose_object_local(const unsigned char *sha1)
+static int freshen_file(const char *fn)
 {
-	return !access(sha1_file_name(sha1), F_OK);
+	struct utimbuf t;
+	t.actime = t.modtime = time(NULL);
+	return utime(fn, &t);
 }
 
-int has_loose_object_nonlocal(const unsigned char *sha1)
+static int check_and_freshen_file(const char *fn, int freshen)
+{
+	if (access(fn, F_OK))
+		return 0;
+	if (freshen && freshen_file(fn))
+		return 0;
+	return 1;
+}
+
+static int check_and_freshen_local(const unsigned char *sha1, int freshen)
+{
+	return check_and_freshen_file(sha1_file_name(sha1), freshen);
+}
+
+static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
 {
 	struct alternate_object_database *alt;
 	prepare_alt_odb();
 	for (alt = alt_odb_list; alt; alt = alt->next) {
 		fill_sha1_path(alt->name, sha1);
-		if (!access(alt->base, F_OK))
+		if (check_and_freshen_file(alt->base, freshen))
 			return 1;
 	}
 	return 0;
 }
 
+static int check_and_freshen(const unsigned char *sha1, int freshen)
+{
+	return check_and_freshen_local(sha1, freshen) ||
+	       check_and_freshen_nonlocal(sha1, freshen);
+}
+
+int has_loose_object_nonlocal(const unsigned char *sha1)
+{
+	return check_and_freshen_nonlocal(sha1, 0);
+}
+
 static int has_loose_object(const unsigned char *sha1)
 {
-	return has_loose_object_local(sha1) ||
-	       has_loose_object_nonlocal(sha1);
+	return check_and_freshen(sha1, 0);
 }
 
 static unsigned int pack_used_ctr;
@@ -2949,6 +2975,17 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
 	return move_temp_to_file(tmp_file, filename);
 }
 
+static int freshen_loose_object(const unsigned char *sha1)
+{
+	return check_and_freshen(sha1, 1);
+}
+
+static int freshen_packed_object(const unsigned char *sha1)
+{
+	struct pack_entry e;
+	return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name);
+}
+
 int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1)
 {
 	unsigned char sha1[20];
@@ -2961,7 +2998,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign
 	write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen);
 	if (returnsha1)
 		hashcpy(returnsha1, sha1);
-	if (has_sha1_file(sha1))
+	if (freshen_loose_object(sha1) || freshen_packed_object(sha1))
 		return 0;
 	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
 }
diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
index e25c47d..157f3f9 100755
--- a/t/t6501-freshen-objects.sh
+++ b/t/t6501-freshen-objects.sh
@@ -100,6 +100,33 @@ for repack in '' true; do
 	test_expect_success "repository passes fsck ($title)" '
 		git fsck
 	'
+
+	test_expect_success "abandon objects again ($title)" '
+		git reset --hard HEAD^ &&
+		find .git/objects -type f |
+		xargs test-chmtime -v -86400
+	'
+
+	test_expect_success "start writing new commit with same tree ($title)" '
+		tree=$(
+			GIT_INDEX_FILE=index.tmp &&
+			export GIT_INDEX_FILE &&
+			git read-tree HEAD &&
+			add abandon &&
+			add unrelated &&
+			git write-tree
+		)
+	'
+
+	test_expect_success "simultaneous gc ($title)" '
+		git gc --prune=12.hours.ago
+	'
+
+	# tree should have been refreshed by write-tree
+	test_expect_success "finish writing out commit ($title)" '
+		commit=$(echo foo | git commit-tree -p HEAD $tree) &&
+		git update-ref HEAD $commit
+	'
 done
 
 test_done
-- 
2.1.1.566.gdb1f904

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 16/16] write_sha1_file: freshen existing objects
  2014-10-03 20:41 ` [PATCH 16/16] write_sha1_file: freshen existing objects Jeff King
@ 2014-10-03 21:29   ` Junio C Hamano
  2014-10-04  0:01     ` Jeff King
  2014-10-05  9:12   ` René Scharfe
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-10-03 21:29 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> When we try to write a loose object file, we first check
> whether that object already exists. If so, we skip the
> write as an optimization. However, this can interfere with
> prune's strategy of using mtimes to mark files in progress.
>
> For example, if a branch contains a particular tree object
> and is deleted, that tree object may become unreachable, and
> have an old mtime. If a new operation then tries to write
> the same tree, this ends up as a noop; we notice we
> already have the object and do nothing. A prune running
> simultaneously with this operation will see the object as
> old, and may delete it.
>
> We can solve this by "freshening" objects that we avoid
> writing by updating their mtime. The algorithm for doing so
> is essentially the same as that of has_sha1_file. Therefore
> we provide a new (static) interface "check_and_freshen",
> which finds and optionally freshens the object. It's trivial
> to implement freshening and simple checking by tweaking a
> single parameter.

An old referent of a recent unreachable object may be in a pack.  Is it
expected that the same pack will have many similar old objects (in
other words, is it worth trying to optimize check-and-freshen by
bypassing access() and utime(), perhaps by keeping a "freshened in
this process already" flag in struct packed_git)?

Could check-and-freshen-nonlocal() ever be called with freshen set
to true?  Should it be?  In other words, should we be mucking with
objects in other people's repositories with utime()?

Other than these two minor questions, I am happy with this patch.

Thanks.

> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  sha1_file.c                | 51 +++++++++++++++++++++++++++++++++++++++-------
>  t/t6501-freshen-objects.sh | 27 ++++++++++++++++++++++++
>  2 files changed, 71 insertions(+), 7 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index d017289..d263b38 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -442,27 +442,53 @@ void prepare_alt_odb(void)
>  	read_info_alternates(get_object_directory(), 0);
>  }
>  
> -static int has_loose_object_local(const unsigned char *sha1)
> +static int freshen_file(const char *fn)
>  {
> -	return !access(sha1_file_name(sha1), F_OK);
> +	struct utimbuf t;
> +	t.actime = t.modtime = time(NULL);
> +	return utime(fn, &t);
>  }
>  
> -int has_loose_object_nonlocal(const unsigned char *sha1)
> +static int check_and_freshen_file(const char *fn, int freshen)
> +{
> +	if (access(fn, F_OK))
> +		return 0;
> +	if (freshen && freshen_file(fn))
> +		return 0;
> +	return 1;
> +}
> +
> +static int check_and_freshen_local(const unsigned char *sha1, int freshen)
> +{
> +	return check_and_freshen_file(sha1_file_name(sha1), freshen);
> +}
> +
> +static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
>  {
>  	struct alternate_object_database *alt;
>  	prepare_alt_odb();
>  	for (alt = alt_odb_list; alt; alt = alt->next) {
>  		fill_sha1_path(alt->name, sha1);
> -		if (!access(alt->base, F_OK))
> +		if (check_and_freshen_file(alt->base, freshen))
>  			return 1;
>  	}
>  	return 0;
>  }
>  
> +static int check_and_freshen(const unsigned char *sha1, int freshen)
> +{
> +	return check_and_freshen_local(sha1, freshen) ||
> +	       check_and_freshen_nonlocal(sha1, freshen);
> +}
> +
> +int has_loose_object_nonlocal(const unsigned char *sha1)
> +{
> +	return check_and_freshen_nonlocal(sha1, 0);
> +}
> +
>  static int has_loose_object(const unsigned char *sha1)
>  {
> -	return has_loose_object_local(sha1) ||
> -	       has_loose_object_nonlocal(sha1);
> +	return check_and_freshen(sha1, 0);
>  }
>  
>  static unsigned int pack_used_ctr;
> @@ -2949,6 +2975,17 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
>  	return move_temp_to_file(tmp_file, filename);
>  }
>  
> +static int freshen_loose_object(const unsigned char *sha1)
> +{
> +	return check_and_freshen(sha1, 1);
> +}
> +
> +static int freshen_packed_object(const unsigned char *sha1)
> +{
> +	struct pack_entry e;
> +	return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name);
> +}
> +
>  int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1)
>  {
>  	unsigned char sha1[20];
> @@ -2961,7 +2998,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign
>  	write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen);
>  	if (returnsha1)
>  		hashcpy(returnsha1, sha1);
> -	if (has_sha1_file(sha1))
> +	if (freshen_loose_object(sha1) || freshen_packed_object(sha1))
>  		return 0;
>  	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
>  }
> diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
> index e25c47d..157f3f9 100755
> --- a/t/t6501-freshen-objects.sh
> +++ b/t/t6501-freshen-objects.sh
> @@ -100,6 +100,33 @@ for repack in '' true; do
>  	test_expect_success "repository passes fsck ($title)" '
>  		git fsck
>  	'
> +
> +	test_expect_success "abandon objects again ($title)" '
> +		git reset --hard HEAD^ &&
> +		find .git/objects -type f |
> +		xargs test-chmtime -v -86400
> +	'
> +
> +	test_expect_success "start writing new commit with same tree ($title)" '
> +		tree=$(
> +			GIT_INDEX_FILE=index.tmp &&
> +			export GIT_INDEX_FILE &&
> +			git read-tree HEAD &&
> +			add abandon &&
> +			add unrelated &&
> +			git write-tree
> +		)
> +	'
> +
> +	test_expect_success "simultaneous gc ($title)" '
> +		git gc --prune=12.hours.ago
> +	'
> +
> +	# tree should have been refreshed by write-tree
> +	test_expect_success "finish writing out commit ($title)" '
> +		commit=$(echo foo | git commit-tree -p HEAD $tree) &&
> +		git update-ref HEAD $commit
> +	'
>  done
>  
>  test_done

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/16] prune: keep objects reachable from recent objects
  2014-10-03 20:39 ` [PATCH 13/16] prune: keep objects reachable from recent objects Jeff King
@ 2014-10-03 21:47   ` Junio C Hamano
  2014-10-04  0:09     ` Jeff King
  2014-10-04  0:30     ` Jeff King
  2014-10-07 16:29   ` Michael Haggerty
  1 sibling, 2 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-10-03 21:47 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> Instead, this patch pushes the extra work onto prune, which
> runs less frequently (and has to look at the whole object
> graph anyway). It creates a new category of objects: objects
> which are not recent, but which are reachable from a recent
> object. We do not prune these objects, just like the
> reachable and recent ones.
>
> This lets us avoid the recursive check above, because if we
> have an object, even if it is unreachable, we should have
> its referent:
>
>   - if we are creating new objects, then we cannot create
>     the parent object without having the child
>
>   - and if we are pruning objects, will not prune the child
>     if we are keeping the parent

Sorry but this part is beyond a simple panda brain.

I can understand this

	If we have an object, even if it is unreachable, we
	should have its referent.
        
as a description of the desired behaviour.  If we have a tree that
is unreachable, we must make sure that we do not discard blobs that
are reachable from that tree, or we would end up corrupting our
repository if we ever allow that tree to become reachable from our
refs later.

But how does that connect to these two bullet points?

>   - if we are creating new objects, then we cannot create
>     the parent object without having the child

We cannot create the parent (e.g. "tree") without having the child
(e.g. "blob that is referred to by the tree we are creating").
So this bullet point is repeating the same thing?

>   - and if we are pruning objects, will not prune the child
>     if we are keeping the parent

We will not prune "blob" that are reachable from a "tree" that we
are not yet ready to prune.  So this again is repeating the same
thing?

But these are "this is how we want our system to behave".  And if we
assume our system behaves like so, then prune would be safe.

But it is unclear how that behaviour is realized.  Puzzled...

... goes and thinks ...

With this patch applied, the system will not prune unreachable old
objects that are reachable from a recent object (the recent object
itself may or may not be reachable but that does not make any
difference).  And that is sufficient to ensure the integrity of the
repository even if you allow new objects to be created reusing any
of these unreachable objects that are left behind by prune, because
the reachability check done during prune (with this patch applied)
makes sure any object left in the repository can safely be used as a
starting point of connectivity traversal.

Ok, I think I got it now, but then do we still need to utime(2) the
loose object files for unreachable objects that are referenced by
a recent object (which is done in a later patch), or is that purely
an optimization for the next round of gc where you would have more
recent objects (i.e. you do not have to traverse to find out an old
one is reachable from a new one, as there will be fewer old ones)?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 07/16] t5304: use test_path_is_* instead of "test -f"
  2014-10-03 20:25 ` [PATCH 07/16] t5304: use test_path_is_* instead of "test -f" Jeff King
@ 2014-10-03 22:12   ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-10-03 22:12 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> This is slightly more robust (checking "! test -f" would not
> notice a directory of the same name, though that is not
> likely to happen here). It also makes debugging easier, as
> the test script will output a message on failure.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> This patch is totally optional. I did it while debugging t5304 (strange
> how badly prune works when you accidentally invert the mtime check!)
> and figured it might be worth keeping as a cleanup.

Looks sensible; thanks.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-03 20:27 ` [PATCH 08/16] t5304: use helper to report failure of "test foo = bar" Jeff King
@ 2014-10-03 22:17   ` Junio C Hamano
  2014-10-04  0:13     ` Jeff King
  2014-10-07 13:21   ` Michael Haggerty
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-10-03 22:17 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> For small outputs, we sometimes use:
>
>   test "$(some_cmd)" = "something we expect"
>
> instead of a full test_cmp. The downside of this is that
> when it fails, there is no output at all from the script.
> Let's introduce a small helper to make tests easier to
> debug.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> This is in the same boat as the last commit; we can drop it without
> hurting the rest of the series.
>
> Is test_eq too cutesy or obfuscated? 

Not really.  As long as its two strings are not tooooo long, the
output may still be readable.

> I have often wanted it when
> debugging other tests, too. Our usual technique is to do:
>
>   echo whatever >expect &&
>   do_something >actual &&
>   test_cmp expect actual
>
> That's a bit verbose. We could hide it behind something like test_eq,
> too, but it introduces several extra new processes.

What do you mean by "extra new processes"?  Whether open coded in a
verbose way, or wrapped inside a helper, e.g.

	test_eql () {
		echo "$1" >expect &&
                shift &&
                "$@" >actual &&
                test_cmp expect actual
	}
        ...
	test_eql whatever do_something

the number of processes would be the same, no?

Or do you mean test_cmp is an extra process compared with

	test_eq whatever "$(do_something)"

Hopefully, do_something does more work than it takes test_cmp to run,
so I wouldn't be worried too much about it.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 09/16] prune: factor out loose-object directory traversal
  2014-10-03 20:29 ` [PATCH 09/16] prune: factor out loose-object directory traversal Jeff King
@ 2014-10-03 22:19   ` Junio C Hamano
  2014-10-04  0:24     ` Jeff King
  2014-10-07 14:07   ` Michael Haggerty
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-10-03 22:19 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> Prune has to walk $GIT_DIR/objects/?? in order to find the
> set of loose objects to prune. Other parts of the code
> (e.g., count-objects) want to do the same. Let's factor it
> out into a reusable for_each-style function.

Doesn't fsck also look at these as well?  I recall that we also have
a "sort by inum" trick employed there.  Would it also be applicable
to these two callers?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/16] make prune mtime-checking more careful
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (15 preceding siblings ...)
  2014-10-03 20:41 ` [PATCH 16/16] write_sha1_file: freshen existing objects Jeff King
@ 2014-10-03 22:20 ` Junio C Hamano
  2014-10-04 22:22 ` Junio C Hamano
  17 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-10-03 22:20 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> ... The objects are removed by prune, which doesn't realize
> that they are part of an ongoing operation. Prune uses the filesystem
> mtime to determine this, but we are not very thorough in making sure
> that is kept up to date.

The whole series looked quite sensible.  Thanks.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/16] foreach_alt_odb: propagate return value from callback
  2014-10-03 20:21 ` [PATCH 01/16] foreach_alt_odb: propagate return value from callback Jeff King
@ 2014-10-03 22:55   ` René Scharfe
  2014-10-04  0:31     ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: René Scharfe @ 2014-10-03 22:55 UTC (permalink / raw)
  To: Jeff King, git; +Cc: Michael Haggerty

Am 03.10.2014 um 22:21 schrieb Jeff King:
> We check the return value of the callback and stop iterating
> if it is non-zero. However, we do not make the non-zero
> return value available to the caller, so they have no way of
> knowing whether the operation succeeded or not (technically
> they can keep their own error flag in the callback data, but
> that is unlike our other for_each functions).
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>   cache.h     |  2 +-
>   sha1_file.c | 12 ++++++++----
>   2 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/cache.h b/cache.h
> index 8206039..cd16e25 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1143,7 +1143,7 @@ extern void prepare_alt_odb(void);
>   extern void read_info_alternates(const char * relative_base, int depth);
>   extern void add_to_alternates_file(const char *reference);
>   typedef int alt_odb_fn(struct alternate_object_database *, void *);
> -extern void foreach_alt_odb(alt_odb_fn, void*);
> +extern int foreach_alt_odb(alt_odb_fn, void*);
>
>   struct pack_window {
>   	struct pack_window *next;
> diff --git a/sha1_file.c b/sha1_file.c
> index c08c0cb..bae1c15 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -412,14 +412,18 @@ void add_to_alternates_file(const char *reference)
>   		link_alt_odb_entries(alt, strlen(alt), '\n', NULL, 0);
>   }
>
> -void foreach_alt_odb(alt_odb_fn fn, void *cb)
> +int foreach_alt_odb(alt_odb_fn fn, void *cb)
>   {
>   	struct alternate_object_database *ent;
> +	int r = 0;
>
>   	prepare_alt_odb();
> -	for (ent = alt_odb_list; ent; ent = ent->next)
> -		if (fn(ent, cb))
> -			return;
> +	for (ent = alt_odb_list; ent; ent = ent->next) {
> +		int r = fn(ent, cb);
> +		if (r)
> +			break;
> +	}
> +	return r;

This will always return zero.  You probably shadowed r unintentionally 
inside the loop, right?

>   }
>
>   void prepare_alt_odb(void)
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 16/16] write_sha1_file: freshen existing objects
  2014-10-03 21:29   ` Junio C Hamano
@ 2014-10-04  0:01     ` Jeff King
  0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-04  0:01 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

On Fri, Oct 03, 2014 at 02:29:58PM -0700, Junio C Hamano wrote:

> > We can solve this by "freshening" objects that we avoid
> > writing by updating their mtime. The algorithm for doing so
> > is essentially the same as that of has_sha1_file. Therefore
> > we provide a new (static) interface "check_and_freshen",
> > which finds and optionally freshens the object. It's trivial
> > to implement freshening and simple checking by tweaking a
> > single parameter.
> 
> An old referent by a recent unreachable may be in pack.  Is it
> expected that the same pack will have many similar old objects (in
> other words, is it worth trying to optimize check-and-freshen by
> bypassing access() and utime(), perhaps by keeping a "freshened in
> this process already" flag in struct packed_git)?

Thanks for reminding me. I considered something like that early on and
then completely forgot to revisit it. I do not have numbers either way
on whether it is an optimization worth doing. On the one hand, it is
very easy to do.  On the other, it probably does not make a big
difference; we are literally skipping the write of an entire object, and
have just run a complete sha1 over the contents. A single utime() call
probably is not a big deal.
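
If we did want it, it would be pretty trivial. A rough and totally
untested sketch (the "freshened" flag in struct packed_git is made up
here, and would need to be added and zero-initialized):

  static int freshen_packed_object(const unsigned char *sha1)
  {
  	struct pack_entry e;
  	if (!find_pack_entry(sha1, &e))
  		return 0;
  	if (!e.p->freshened) {
  		/* hypothetical flag: pay the utime() at most once per pack */
  		if (freshen_file(e.p->pack_name))
  			return 0;
  		e.p->freshened = 1;
  	}
  	return 1;
  }

But as above, I doubt skipping the extra utime() is measurable next to
hashing the object in the first place.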

> Could check-and-freshen-nonlocal() ever be called with freshen set
> to true?  Should it be?  In other words, should we be mucking with
> objects in other people's repositories with utime()?

Yes, it can, and I think the answer to "should" is "yes" for safety,
though I agree it feels a little hacky. I did explicitly write it so
that we fail-safe when freshening doesn't work. That is, if we try to
freshen an object that is in an alternate and we cannot (e.g., because
we don't have write access), we'll fallback to writing out a new loose
object locally.

That's very much the safest thing to do, but obviously it performs less
well. Again, this is the code path where we _would have_ written out the
object anyway, so it might not be that bad. But I don't know to what
degree the current code relies on that optimization for reasonable
performance. E.g., if you clone from a read-only alternate and then try
to `git write-tree` immediately on the index, will we literally make a
full copy of each tree object?

Hmm, that should be easy to test...

  $ su - nobody
  $ git clone -s ~peff/compile/linux /tmp/foo
  $ cd /tmp/foo

  $ git count-objects
  0 objects, 0 kilobytes
  $ git write-tree
  $ git count-objects
  0 objects, 0 kilobytes

So far so good. Let's blow away the cache-tree to make sure...

  $ rm .git/index
  $ git read-tree HEAD
  $ git write-tree
  $ git count-objects
  0 objects, 0 kilobytes

So that's promising. But it's far from a proof that there isn't some
other code path that will be negatively impacted.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/16] prune: keep objects reachable from recent objects
  2014-10-03 21:47   ` Junio C Hamano
@ 2014-10-04  0:09     ` Jeff King
  2014-10-04  0:30     ` Jeff King
  1 sibling, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-04  0:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

On Fri, Oct 03, 2014 at 02:47:57PM -0700, Junio C Hamano wrote:

> Sorry but this part is beyond a simple panda brain.

I probably didn't explain it very well. I found it rather confusing to
reason about. Let's see if we can go through it together.

> I can understand this
> 
> 	If we have an object, even if it is unreachable, we
> 	should have its referent.
>         
> as a description of the desired behaviour.  If we have a tree that
> is unreachable, we must make sure that we do not discard blobs that
> are reachable from that tree, or we would end up corrupting our
> repository if we ever allow that tree to become reachable from our
> refs later.

Yes, exactly.

> >   - if we are creating new objects, then we cannot create
> >     the parent object without having the child
> 
> We cannot create the parent (e.g. "tree") without having the child
> (e.g. "blob that is referred to by the tree we are creating").
> So this bullet point is repeating the same thing?

Sort of. The first statement "if we have an object, we should have its
referent" is a desired state. The bullet points are trying to reason
about the _changes_ in state, and argue that there are no state changes
that take us from a valid state to an invalid one.

There are two actions that affect the state of the object database:
adding objects and removing objects.

We cannot go from a valid state to an invalid state by adding objects,
because we cannot create the parent without having the child. That is
already the case before this patch (though as noted, you can "cheat"
if you know the sha1 from another repository and bypass git's safety
checks, but I do not think that is a case worth worrying about).

> >   - and if we are pruning objects, will not prune the child
> >     if we are keeping the parent
> 
> We will not prune "blob" that are reachable from a "tree" that we
> are not yet ready to prune.  So this again is repeating the same
> thing?

This is the other action. When removing objects, we will keep the child
along with the parent, and therefore we cannot move to an invalid state.
That's the part that this patch does.

> With this patch applied, the system will not prune unreachable old
> objects that are reachable from a recent object (the recent object
> itself may or may not be reachable but that does not make any
> difference).  And that is sufficient to ensure the integrity of the
> repository even if you allow new objects to be created reusing any
> of these unreachable objects that are left behind by prune, because
> the reachability check done during prune (with this patch applied)
> makes sure any object left in the repository can safely be used as a
> starting point of connectivity traversal.

Yes, exactly.

> Ok, I think I got it now, but then do we still need to utime(2) the
> loose object files for unreachable objects that are referenced by
> a recent object (which is done in a later patch), or is that purely
> an optimization for the next round of gc where you would have more
> recent objects (i.e. you do not have to traverse to find out an old
> one is reachable from a new one, as there will be fewer old ones)?

No, we don't need to utime() the loose objects. As long as there is a
recently-written object that refers to them, they are considered worth
keeping.

The later patch that calls utime() on objects is really about defeating
the write_sha1_file optimization. That is, you might _think_ you have
written a recent object that refers to other objects, but the sha1-file
code silently turned it into a noop.

Does that make more sense?

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-03 22:17   ` Junio C Hamano
@ 2014-10-04  0:13     ` Jeff King
  0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-04  0:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

On Fri, Oct 03, 2014 at 03:17:18PM -0700, Junio C Hamano wrote:

> > That's a bit verbose. We could hide it behind something like test_eq,
> > too, but it introduces several extra new processes.
> 
> What do you mean by "extra new processes"?  Whether open coded in a
> verbose way, or wrapped inside a helper, e.g.
> 
> 	test_eql () {
> 		echo "$1" >expect &&
>                 shift &&
>                 "$@" >actual &&
>                 test_cmp expect actual
> 	}
>         ...
> 	test_eql whatever do_something
> 
> the number of processes would be the same, no?
> 
> Or do you mean test_cmp is an extra process compared with
> 
> 	test_eq whatever "$(do_something)"

Sorry, yeah, I meant new processes versus "test $foo = $bar".

> Hopefully, do_something does something more than what takes test_cmp
> to run, so I wouldn't be worried too much about it.

Yeah, I may just be overly paranoid here. If we are not worried about a
few extra processes, then the test_eql you showed above may be
preferable, because its output is uniform with other test_cmp tests
(although maybe it also introduces problems, because it does not handle
stray whitespace in the same way, and it puts extra files in the working
tree).

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 09/16] prune: factor out loose-object directory traversal
  2014-10-03 22:19   ` Junio C Hamano
@ 2014-10-04  0:24     ` Jeff King
  0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-04  0:24 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

On Fri, Oct 03, 2014 at 03:19:29PM -0700, Junio C Hamano wrote:

> Jeff King <peff@peff.net> writes:
> 
> > Prune has to walk $GIT_DIR/objects/?? in order to find the
> > set of loose objects to prune. Other parts of the code
> > (e.g., count-objects) want to do the same. Let's factor it
> > out into a reusable for_each-style function.
> 
> Doesn't fsck also look at these as well?  I recall that we also have
> a "sort by inum" trick employed there.  Would it also be applicable
> to these two callers?

I didn't think to check fsck. It does have a similar function and could
be converted.

Fsck does store the inode and later visits the files in inode order. I
don't know if it matters as much here, because we are not opening most
of the files, just stat()ing them.

I'd also be kind of surprised if this optimization really matters in
practice these days. The inode-sorting comes from 7e8c174 (fsck-cache:
sort entries by inode number, 2005-05-02). That predates packfiles by
several months, I believe, and you are not likely to have a very large
number of loose objects anymore (and if you do, you will get _way_
better performance simply by packing).

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/16] prune: keep objects reachable from recent objects
  2014-10-03 21:47   ` Junio C Hamano
  2014-10-04  0:09     ` Jeff King
@ 2014-10-04  0:30     ` Jeff King
  2014-10-04  3:04       ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-04  0:30 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Michael Haggerty

On Fri, Oct 03, 2014 at 02:47:57PM -0700, Junio C Hamano wrote:

> With this patch applied, the system will not prune unreachable old
> objects that are reachable from a recent object (the recent object
> itself may or may not be reachable but that does not make any
> difference).  And that is sufficient to ensure the integrity of the
> repository even if you allow new objects to be created reusing any
> of these unreachable objects that are left behind by prune, because
> the reachability check done during prune (with this patch applied)
> makes sure any object left in the repository can safely be used as a
> starting point of connectivity traversal.

Your use of "safely" in the last sentence here made me think.

In a repository that has had this patch from the beginning, it should be
safe to traverse the unreachable but unpruned objects, because the
property of the repository we are trying to guarantee means that we will
have all of the referents.

But in a repository created by current git (or one where you have been
fooling around with hash-object), we might not. Our secondary traversal
for recent objects may fail because we are already missing some of the
referents, and prepare_revision_walk (or traverse_commit_list) will
barf. That's a bad thing.

I think we need to tell that second traversal that it's OK to skip
missing objects that are reachable from recent objects. It's not ideal,
but there's nothing better we can do.  Eventually those objects will
fall off the back of the expiration window, but we should not be holding
up prunes (which, after all, are the thing that expires them :) ) in the
meantime.

I think it would be enough to just set revs->ignore_missing_links for
the secondary traversal.
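
E.g., roughly (untested), on top of the hunk that 13/16 adds to
mark_reachable_objects():

  	if (mark_recent) {
  		revs->ignore_missing_links = 1;
  		if (add_unseen_recent_objects_to_traversal(revs, mark_recent))
  			die("unable to mark recent objects");
  		if (prepare_revision_walk(revs))
  			die("revision walk setup failed");
  		walk_commit_list(revs, &cp);
  	}

I don't think anything looks at revs after that point, so we probably
would not even need to unset it afterwards.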

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 01/16] foreach_alt_odb: propagate return value from callback
  2014-10-03 22:55   ` René Scharfe
@ 2014-10-04  0:31     ` Jeff King
  0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-04  0:31 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, Michael Haggerty

On Sat, Oct 04, 2014 at 12:55:13AM +0200, René Scharfe wrote:

> >-void foreach_alt_odb(alt_odb_fn fn, void *cb)
> >+int foreach_alt_odb(alt_odb_fn fn, void *cb)
> >  {
> >  	struct alternate_object_database *ent;
> >+	int r = 0;
> >
> >  	prepare_alt_odb();
> >-	for (ent = alt_odb_list; ent; ent = ent->next)
> >-		if (fn(ent, cb))
> >-			return;
> >+	for (ent = alt_odb_list; ent; ent = ent->next) {
> >+		int r = fn(ent, cb);
> >+		if (r)
> >+			break;
> >+	}
> >+	return r;
> 
> This will always return zero.  You probably shadowed r unintentionally
> inside the loop, right?

Eek, yes. Thanks for catching. I'll fix it in the re-roll.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/16] prune: keep objects reachable from recent objects
  2014-10-04  0:30     ` Jeff King
@ 2014-10-04  3:04       ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-10-04  3:04 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> On Fri, Oct 03, 2014 at 02:47:57PM -0700, Junio C Hamano wrote:
>
>> With this patch applied, the system will not prune unreachable old
>> objects that are reachable from a recent object (the recent object
>> itself may or may not be reachable but that does not make any
>> difference).  And that is sufficient to ensure the integrity of the
>> repository even if you allow new objects to be created reusing any
>> of these unreachable objects that are left behind by prune, because
>> the reachability check done during prune (with this patch applied)
>> makes sure any object left in the repository can safely be used as a
>> starting point of connectivity traversal.
>
> Your use of "safely" in the last sentence here made me think.
>
> In a repository that has had this patch from the beginning, it should be
> safe to traverse the unreachable but unpruned objects, because the
> property of the repository we are trying to guarantee means that we will
> have all of the referents.
> ...

Another case that may be of interest is to start a dumb HTTP commit
walker to fetch new history and kill it before it manages to
complete the graph and update the refs.  It has the same property as
running hash-object to create an unconnected cruft object after you
got your repository into a "safe" state, but it would be a lot less
easy to blame the user for the resulting "breakage".

Perhaps the dumb commit walkers have outlived their usefulness and we
should start talking about their deprecation and eventual removal in
a later version of Git?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/16] make prune mtime-checking more careful
  2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
                   ` (16 preceding siblings ...)
  2014-10-03 22:20 ` [PATCH 0/16] make prune mtime-checking more careful Junio C Hamano
@ 2014-10-04 22:22 ` Junio C Hamano
  2014-10-05  9:19   ` René Scharfe
  2014-10-06  1:42   ` Jeff King
  17 siblings, 2 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-10-04 22:22 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> There's quite a lot of patches here, but most of them are preparatory
> cleanups. The meat is in patches 13, 15, and 16.
>
>   [01/16]: foreach_alt_odb: propagate return value from callback
>   [02/16]: isxdigit: cast input to unsigned char
>   [03/16]: object_array: factor out slopbuf-freeing logic
>   [04/16]: object_array: add a "clear" function
>   [05/16]: clean up name allocation in prepare_revision_walk
>   [06/16]: reachable: clear pending array after walking it
>   [07/16]: t5304: use test_path_is_* instead of "test -f"
>   [08/16]: t5304: use helper to report failure of "test foo = bar"
>   [09/16]: prune: factor out loose-object directory traversal
>   [10/16]: count-objects: do not use xsize_t when counting object size
>   [11/16]: count-objects: use for_each_loose_file_in_objdir
>   [12/16]: sha1_file: add for_each iterators for loose and packed objects
>   [13/16]: prune: keep objects reachable from recent objects
>   [14/16]: pack-objects: refactor unpack-unreachable expiration check
>   [15/16]: pack-objects: match prune logic for discarding objects
>   [16/16]: write_sha1_file: freshen existing objects

Somebody please help me out here.

This applied on top of 'maint' (which does have c40fdd01) makes the
test #9 (prune: do not prune detached HEAD with no reflog) fail.

If we merge 'dt/cache-tree-repair' (which in turn needs
'jc/reopen-lock-file') to 'maint' and then apply these on top, the
said test passes.  But I do not see an apparent reason why X-<.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects
  2014-10-03 20:32 ` [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects Jeff King
@ 2014-10-05  8:15   ` René Scharfe
  2014-10-05 10:47     ` Ramsay Jones
  0 siblings, 1 reply; 58+ messages in thread
From: René Scharfe @ 2014-10-05  8:15 UTC (permalink / raw)
  To: Jeff King, git; +Cc: Michael Haggerty

Am 03.10.2014 um 22:32 schrieb Jeff King:
> We typically iterate over the reachable objects in a
> repository by starting at the tips and walking the graph.
> There's no easy way to iterate over all of the objects,
> including unreachable ones. Let's provide a way of doing so.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>   cache.h     | 11 +++++++++++
>   sha1_file.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   2 files changed, 73 insertions(+)
>
> diff --git a/cache.h b/cache.h
> index 7abe7f6..3826b4b 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1270,6 +1270,17 @@ int for_each_loose_file_in_objdir(const char *path,
>   				  each_loose_subdir_fn subdir_cb,
>   				  void *data);
>
> +/*
> + * Iterate over loose and packed objects in both the local
> + * repository and any alternates repositories.
> + */
> +typedef int each_packed_object_fn(const unsigned char *sha1,
> +				  struct packed_git *pack,
> +				  uint32_t pos,
> +				  void *data);
> +extern int for_each_loose_object(each_loose_object_fn, void *);
> +extern int for_each_packed_object(each_packed_object_fn, void *);
> +
>   struct object_info {
>   	/* Request */
>   	enum object_type *typep;
> diff --git a/sha1_file.c b/sha1_file.c
> index 9fdad47..d017289 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -3313,3 +3313,65 @@ int for_each_loose_file_in_objdir(const char *path,
>   	strbuf_release(&buf);
>   	return r;
>   }
> +
> +struct loose_alt_odb_data {
> +	each_loose_object_fn *cb;
> +	void *data;
> +};
> +
> +static int loose_from_alt_odb(struct alternate_object_database *alt,
> +			      void *vdata)
> +{
> +	struct loose_alt_odb_data *data = vdata;
> +	return for_each_loose_file_in_objdir(alt->base,
> +					     data->cb, NULL, NULL,
> +					     data->data);
> +}
> +
> +int for_each_loose_object(each_loose_object_fn cb, void *data)
> +{
> +	struct loose_alt_odb_data alt;
> +	int r;
> +
> +	r = for_each_loose_file_in_objdir(get_object_directory(),
> +					  cb, NULL, NULL, data);
> +	if (r)
> +		return r;
> +
> +	alt.cb = cb;
> +	alt.data = data;
> +	return foreach_alt_odb(loose_from_alt_odb, &alt);
> +}
> +
> +int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data)

Should this one be declared static?  It seems to be used only in 
sha1_file.c.

> +{
> +	uint32_t i;
> +	int r = 0;
> +
> +	for (i = 0; i < p->num_objects; i++) {
> +		const unsigned char *sha1 = nth_packed_object_sha1(p, i);
> +
> +		if (!sha1)
> +			return error("unable to get sha1 of object %u in %s",
> +				     i, p->pack_name);
> +
> +		r = cb(sha1, p, i, data);
> +		if (r)
> +			break;
> +	}
> +	return r;
> +}
> +
> +int for_each_packed_object(each_packed_object_fn cb, void *data)
> +{
> +	struct packed_git *p;
> +	int r = 0;
> +
> +	prepare_packed_git();
> +	for (p = packed_git; p; p = p->next) {
> +		r = for_each_object_in_pack(p, cb, data);
> +		if (r)
> +			break;
> +	}
> +	return 0;
> +}

Perhaps return r instead here?
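
A minimal sketch of what I mean (only the final return changes, so a
non-zero return from the callback is actually propagated to the caller):

	int for_each_packed_object(each_packed_object_fn cb, void *data)
	{
		struct packed_git *p;
		int r = 0;

		prepare_packed_git();
		for (p = packed_git; p; p = p->next) {
			r = for_each_object_in_pack(p, cb, data);
			if (r)
				break;
		}
		return r;	/* instead of "return 0" */
	}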

René

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 16/16] write_sha1_file: freshen existing objects
  2014-10-03 20:41 ` [PATCH 16/16] write_sha1_file: freshen existing objects Jeff King
  2014-10-03 21:29   ` Junio C Hamano
@ 2014-10-05  9:12   ` René Scharfe
  1 sibling, 0 replies; 58+ messages in thread
From: René Scharfe @ 2014-10-05  9:12 UTC (permalink / raw)
  To: Jeff King, git; +Cc: Michael Haggerty

On 03.10.2014 at 22:41, Jeff King wrote:
> When we try to write a loose object file, we first check
> whether that object already exists. If so, we skip the
> write as an optimization. However, this can interfere with
> prune's strategy of using mtimes to mark files in progress.
>
> For example, if a branch contains a particular tree object
> and is deleted, that tree object may become unreachable, and
> have an old mtime. If a new operation then tries to write
> the same tree, this ends up as a noop; we notice we
> already have the object and do nothing. A prune running
> simultaneously with this operation will see the object as
> old, and may delete it.
>
> We can solve this by "freshening" objects that we avoid
> writing by updating their mtime. The algorithm for doing so
> is essentially the same as that of has_sha1_file. Therefore
> we provide a new (static) interface "check_and_freshen",
> which finds and optionally freshens the object. It's trivial
> to implement freshening and simple checking by tweaking a
> single parameter.
>
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>   sha1_file.c                | 51 +++++++++++++++++++++++++++++++++++++++-------
>   t/t6501-freshen-objects.sh | 27 ++++++++++++++++++++++++
>   2 files changed, 71 insertions(+), 7 deletions(-)
>
> diff --git a/sha1_file.c b/sha1_file.c
> index d017289..d263b38 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -442,27 +442,53 @@ void prepare_alt_odb(void)
>   	read_info_alternates(get_object_directory(), 0);
>   }
>
> -static int has_loose_object_local(const unsigned char *sha1)
> +static int freshen_file(const char *fn)
>   {
> -	return !access(sha1_file_name(sha1), F_OK);
> +	struct utimbuf t;
> +	t.actime = t.modtime = time(NULL);
> +	return utime(fn, &t);
>   }

Mental note: freshen_file() returns 0 on success.

>
> -int has_loose_object_nonlocal(const unsigned char *sha1)
> +static int check_and_freshen_file(const char *fn, int freshen)
> +{
> +	if (access(fn, F_OK))
> +		return 0;
> +	if (freshen && freshen_file(fn))
> +		return 0;
> +	return 1;
> +}

Returns 1 on success.

> +
> +static int check_and_freshen_local(const unsigned char *sha1, int freshen)
> +{
> +	return check_and_freshen_file(sha1_file_name(sha1), freshen);
> +}

Returns 1 on success.

> +
> +static int check_and_freshen_nonlocal(const unsigned char *sha1, int freshen)
>   {
>   	struct alternate_object_database *alt;
>   	prepare_alt_odb();
>   	for (alt = alt_odb_list; alt; alt = alt->next) {
>   		fill_sha1_path(alt->name, sha1);
> -		if (!access(alt->base, F_OK))
> +		if (check_and_freshen_file(alt->base, freshen))
>   			return 1;
>   	}
>   	return 0;
>   }

Returns 1 on success.

>
> +static int check_and_freshen(const unsigned char *sha1, int freshen)
> +{
> +	return check_and_freshen_local(sha1, freshen) ||
> +	       check_and_freshen_nonlocal(sha1, freshen);
> +}

Returns 1 on success.

> +
> +int has_loose_object_nonlocal(const unsigned char *sha1)
> +{
> +	return check_and_freshen_nonlocal(sha1, 0);
> +}
> +
>   static int has_loose_object(const unsigned char *sha1)
>   {
> -	return has_loose_object_local(sha1) ||
> -	       has_loose_object_nonlocal(sha1);
> +	return check_and_freshen(sha1, 0);
>   }
>
>   static unsigned int pack_used_ctr;
> @@ -2949,6 +2975,17 @@ static int write_loose_object(const unsigned char *sha1, char *hdr, int hdrlen,
>   	return move_temp_to_file(tmp_file, filename);
>   }
>
> +static int freshen_loose_object(const unsigned char *sha1)
> +{
> +	return check_and_freshen(sha1, 1);
> +}

Returns 1 on success.

> +
> +static int freshen_packed_object(const unsigned char *sha1)
> +{
> +	struct pack_entry e;
> +	return find_pack_entry(sha1, &e) && freshen_file(e.p->pack_name);
> +}

Returns 1 if a pack entry is found and freshen_file() fails, and 0 if no 
entry is found or freshen_file() succeeds.

It should be "&& !freshen(...)" instead, no?

Or better, let freshen_file() return 1 on success as the other functions 
here.
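
That is, a sketch of the first option:

	static int freshen_packed_object(const unsigned char *sha1)
	{
		struct pack_entry e;
		/* freshen_file() returns 0 on success, hence the negation */
		return find_pack_entry(sha1, &e) && !freshen_file(e.p->pack_name);
	}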

> +
>   int write_sha1_file(const void *buf, unsigned long len, const char *type, unsigned char *returnsha1)
>   {
>   	unsigned char sha1[20];
> @@ -2961,7 +2998,7 @@ int write_sha1_file(const void *buf, unsigned long len, const char *type, unsign
>   	write_sha1_file_prepare(buf, len, type, sha1, hdr, &hdrlen);
>   	if (returnsha1)
>   		hashcpy(returnsha1, sha1);
> -	if (has_sha1_file(sha1))
> +	if (freshen_loose_object(sha1) || freshen_packed_object(sha1))
>   		return 0;
>   	return write_loose_object(sha1, hdr, hdrlen, buf, len, 0);
>   }
> diff --git a/t/t6501-freshen-objects.sh b/t/t6501-freshen-objects.sh
> index e25c47d..157f3f9 100755
> --- a/t/t6501-freshen-objects.sh
> +++ b/t/t6501-freshen-objects.sh
> @@ -100,6 +100,33 @@ for repack in '' true; do
>   	test_expect_success "repository passes fsck ($title)" '
>   		git fsck
>   	'
> +
> +	test_expect_success "abandon objects again ($title)" '
> +		git reset --hard HEAD^ &&
> +		find .git/objects -type f |
> +		xargs test-chmtime -v -86400
> +	'
> +
> +	test_expect_success "start writing new commit with same tree ($title)" '
> +		tree=$(
> +			GIT_INDEX_FILE=index.tmp &&
> +			export GIT_INDEX_FILE &&
> +			git read-tree HEAD &&
> +			add abandon &&
> +			add unrelated &&
> +			git write-tree
> +		)
> +	'
> +
> +	test_expect_success "simultaneous gc ($title)" '
> +		git gc --prune=12.hours.ago
> +	'
> +
> +	# tree should have been refreshed by write-tree
> +	test_expect_success "finish writing out commit ($title)" '
> +		commit=$(echo foo | git commit-tree -p HEAD $tree) &&
> +		git update-ref HEAD $commit
> +	'
>   done
>
>   test_done
>

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/16] make prune mtime-checking more careful
  2014-10-04 22:22 ` Junio C Hamano
@ 2014-10-05  9:19   ` René Scharfe
  2014-10-06  1:42   ` Jeff King
  1 sibling, 0 replies; 58+ messages in thread
From: René Scharfe @ 2014-10-05  9:19 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: git, Michael Haggerty

On 05.10.2014 at 00:22, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
>> There's quite a lot of patches here, but most of them are preparatory
>> cleanups. The meat is in patches 13, 15, and 16.
>>
>>    [01/16]: foreach_alt_odb: propagate return value from callback
>>    [02/16]: isxdigit: cast input to unsigned char
>>    [03/16]: object_array: factor out slopbuf-freeing logic
>>    [04/16]: object_array: add a "clear" function
>>    [05/16]: clean up name allocation in prepare_revision_walk
>>    [06/16]: reachable: clear pending array after walking it
>>    [07/16]: t5304: use test_path_is_* instead of "test -f"
>>    [08/16]: t5304: use helper to report failure of "test foo = bar"
>>    [09/16]: prune: factor out loose-object directory traversal
>>    [10/16]: count-objects: do not use xsize_t when counting object size
>>    [11/16]: count-objects: use for_each_loose_file_in_objdir
>>    [12/16]: sha1_file: add for_each iterators for loose and packed objects
>>    [13/16]: prune: keep objects reachable from recent objects
>>    [14/16]: pack-objects: refactor unpack-unreachable expiration check
>>    [15/16]: pack-objects: match prune logic for discarding objects
>>    [16/16]: write_sha1_file: freshen existing objects
>
> Somebody please help me out here.
>
> This applied on top of 'maint' (which does have c40fdd01) makes the
> test #9 (prune: do not prune detached HEAD with no reflog) fail.

The test passes if the return value of freshen_file() is negated in 
freshen_packed_object() (see my reply to patch 16).

> If we merge 'dt/cache-tree-repair' (which in turn needs
> 'jc/reopen-lock-file') to 'maint' and then apply these on top, the
> said test passes.  But I do not see an apparent reason why X-<.

Didn't check this interaction, but it sounds strange.

René

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects
  2014-10-05  8:15   ` René Scharfe
@ 2014-10-05 10:47     ` Ramsay Jones
  0 siblings, 0 replies; 58+ messages in thread
From: Ramsay Jones @ 2014-10-05 10:47 UTC (permalink / raw)
  To: René Scharfe, Jeff King, git; +Cc: Michael Haggerty

On 05/10/14 09:15, René Scharfe wrote:
> On 03.10.2014 at 22:32, Jeff King wrote:
>> We typically iterate over the reachable objects in a
>> repository by starting at the tips and walking the graph.
>> There's no easy way to iterate over all of the objects,
>> including unreachable ones. Let's provide a way of doing so.
>>
>> Signed-off-by: Jeff King <peff@peff.net>
>> ---
>>   cache.h     | 11 +++++++++++
>>   sha1_file.c | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>   2 files changed, 73 insertions(+)
>>
>> diff --git a/cache.h b/cache.h
>> index 7abe7f6..3826b4b 100644
>> --- a/cache.h
>> +++ b/cache.h
>> @@ -1270,6 +1270,17 @@ int for_each_loose_file_in_objdir(const char *path,
>>                     each_loose_subdir_fn subdir_cb,
>>                     void *data);
>>
>> +/*
>> + * Iterate over loose and packed objects in both the local
>> + * repository and any alternates repositories.
>> + */
>> +typedef int each_packed_object_fn(const unsigned char *sha1,
>> +                  struct packed_git *pack,
>> +                  uint32_t pos,
>> +                  void *data);
>> +extern int for_each_loose_object(each_loose_object_fn, void *);
>> +extern int for_each_packed_object(each_packed_object_fn, void *);
>> +
>>   struct object_info {
>>       /* Request */
>>       enum object_type *typep;
>> diff --git a/sha1_file.c b/sha1_file.c
>> index 9fdad47..d017289 100644
>> --- a/sha1_file.c
>> +++ b/sha1_file.c
>> @@ -3313,3 +3313,65 @@ int for_each_loose_file_in_objdir(const char *path,
>>       strbuf_release(&buf);
>>       return r;
>>   }
>> +
>> +struct loose_alt_odb_data {
>> +    each_loose_object_fn *cb;
>> +    void *data;
>> +};
>> +
>> +static int loose_from_alt_odb(struct alternate_object_database *alt,
>> +                  void *vdata)
>> +{
>> +    struct loose_alt_odb_data *data = vdata;
>> +    return for_each_loose_file_in_objdir(alt->base,
>> +                         data->cb, NULL, NULL,
>> +                         data->data);
>> +}
>> +
>> +int for_each_loose_object(each_loose_object_fn cb, void *data)
>> +{
>> +    struct loose_alt_odb_data alt;
>> +    int r;
>> +
>> +    r = for_each_loose_file_in_objdir(get_object_directory(),
>> +                      cb, NULL, NULL, data);
>> +    if (r)
>> +        return r;
>> +
>> +    alt.cb = cb;
>> +    alt.data = data;
>> +    return foreach_alt_odb(loose_from_alt_odb, &alt);
>> +}
>> +
>> +int for_each_object_in_pack(struct packed_git *p, each_packed_object_fn cb, void *data)
> 
> Should this one be declared static?  It seems to be used only in sha1_file.c.

Heh, I was just about to make the same observation myself (with included patch).

I could imagine this function being useful elsewhere, but until it gains some
external callers I think it should remain static (so it doesn't cause a sparse
warning), rather than adding an extern declaration to cache.h (which would also
silence the sparse warning).

ATB,
Ramsay Jones

> 
>> +{
>> +    uint32_t i;
>> +    int r = 0;
>> +
>> +    for (i = 0; i < p->num_objects; i++) {
>> +        const unsigned char *sha1 = nth_packed_object_sha1(p, i);
>> +
>> +        if (!sha1)
>> +            return error("unable to get sha1 of object %u in %s",
>> +                     i, p->pack_name);
>> +
>> +        r = cb(sha1, p, i, data);
>> +        if (r)
>> +            break;
>> +    }
>> +    return r;
>> +}
>> +
>> +int for_each_packed_object(each_packed_object_fn cb, void *data)
>> +{
>> +    struct packed_git *p;
>> +    int r = 0;
>> +
>> +    prepare_packed_git();
>> +    for (p = packed_git; p; p = p->next) {
>> +        r = for_each_object_in_pack(p, cb, data);
>> +        if (r)
>> +            break;
>> +    }
>> +    return 0;
>> +}
> 
> Perhaps return r instead here?
> 
> René
> 
> 

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/16] make prune mtime-checking more careful
  2014-10-04 22:22 ` Junio C Hamano
  2014-10-05  9:19   ` René Scharfe
@ 2014-10-06  1:42   ` Jeff King
  2014-10-08  8:31     ` Jeff King
  1 sibling, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-06  1:42 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, git, Michael Haggerty

On Sat, Oct 04, 2014 at 03:22:10PM -0700, Junio C Hamano wrote:

> This applied on top of 'maint' (which does have c40fdd01) makes the
> test #9 (prune: do not prune detached HEAD with no reflog) fail.

I'll fix the bone-headed error returns that René noticed and double
check that they were the complete culprit in the test failure you saw
(and not just masking some other problem).

> If we merge 'dt/cache-tree-repair' (which in turn needs
> 'jc/reopen-lock-file') to 'maint' and then apply these on top, the
> said test passes.  But I do not see an apparent reason why X-<.

I suspect it's an unintended interaction that just happens to let my
bogus code do the right thing, but I'll double check on that, too.

Thanks both of you for spotting the problems.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 03/16] object_array: factor out slopbuf-freeing logic
  2014-10-03 20:22 ` [PATCH 03/16] object_array: factor out slopbuf-freeing logic Jeff King
@ 2014-10-07 11:25   ` Michael Haggerty
  2014-10-08  7:36     ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Michael Haggerty @ 2014-10-07 11:25 UTC (permalink / raw)
  To: Jeff King, git

On 10/03/2014 10:22 PM, Jeff King wrote:
> This is not a lot of code, but it's a logical construct that
> should not need to be repeated (and we are about to add a
> third repetition).
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
>  object.c | 12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
> 
> diff --git a/object.c b/object.c
> index ca9d790..14238dc 100644
> --- a/object.c
> +++ b/object.c
> @@ -355,6 +355,12 @@ void add_object_array_with_context(struct object *obj, const char *name, struct
>  		add_object_array_with_mode_context(obj, name, array, S_IFINVALID, context);
>  }
>  
> +static void object_array_release_entry(struct object_array_entry *ent)
> +{
> +	if (ent->name != object_array_slopbuf)
> +		free(ent->name);
> +}
> +

Would it be a little safer to set ent->name to NULL or to
object_array_slopbuf after freeing the memory, to prevent accidents?
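
For illustration, something like this (resetting the pointer is the only
addition; pointing it at object_array_slopbuf instead of NULL would work
just as well):

	static void object_array_release_entry(struct object_array_entry *ent)
	{
		if (ent->name != object_array_slopbuf)
			free(ent->name);
		ent->name = NULL;	/* guard against accidental reuse */
	}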

> [...]

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-03 20:27 ` [PATCH 08/16] t5304: use helper to report failure of "test foo = bar" Jeff King
  2014-10-03 22:17   ` Junio C Hamano
@ 2014-10-07 13:21   ` Michael Haggerty
  2014-10-07 17:29     ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Michael Haggerty @ 2014-10-07 13:21 UTC (permalink / raw)
  To: Jeff King, git, Junio C Hamano

On 10/03/2014 10:27 PM, Jeff King wrote:
> For small outputs, we sometimes use:
> 
>   test "$(some_cmd)" = "something we expect"
> 
> instead of a full test_cmp. The downside of this is that
> when it fails, there is no output at all from the script.
> Let's introduce a small helper to make tests easier to
> debug.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> This is in the same boat as the last commit; we can drop it without
> hurting the rest of the series.
> 
> Is test_eq too cutesy or obfuscated? I have often wanted it when
> debugging other tests, too. Our usual technique is to do:
> 
>   echo whatever >expect &&
>   do_something >actual &&
>   test_cmp expect actual
> 
> That's a bit verbose. We could hide it behind something like test_eq,
> too, but it introduces several extra new processes. And I know people on
> some fork-challenged platforms are very sensitive to the number of
> spawned processes in the test suite.

I don't like the three-argument version of test_eq. Wouldn't using a
comparison operator other than "=" be very confusing, given that
"eq" is in the name of the function? It also doesn't look like you use
this feature.

If you want to write a helper that allows arbitrary comparator
operators, then I think it would be more readable to put the comparison
operator in the middle, like

    test_test foo = bar

And in fact once you've done that, couldn't we just make this a generic
wrapper for any `test` command?

test_test () {
	if ! test "$@"
	then
		echo >&2 "test failed: $*"
		false
	fi
}

Feel free to bikeshed the function name.

An alternative direction to go would be to specialize the function for
equality testing and delegate to test_cmp to get better output for
failures, but optimized to avoid excess process creation in the happy path:

test_eq () {
	if test "$1" != "$2"
	then
		printf "%s" "$1" >expect &&
		printf "%s" "$2" >actual &&
		test_cmp expect actual
	fi
}

(but using properly-created temporary file names).

Finally, if we want a function intended mainly for checking program
output (like the test_eql function suggested by Junio), it might be
simpler to use if it accepts the function output on its stdin:

	test_output () {
		echo "$1" >expect &&
		cat >actual &&
		test_cmp expect actual
	}
	...
	do_something | test_output whatever

This would make it easier to generate the input using an arbitrary shell
pipeline.

> [...]

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 09/16] prune: factor out loose-object directory traversal
  2014-10-03 20:29 ` [PATCH 09/16] prune: factor out loose-object directory traversal Jeff King
  2014-10-03 22:19   ` Junio C Hamano
@ 2014-10-07 14:07   ` Michael Haggerty
  2014-10-08  7:33     ` Jeff King
  1 sibling, 1 reply; 58+ messages in thread
From: Michael Haggerty @ 2014-10-07 14:07 UTC (permalink / raw)
  To: Jeff King, git

On 10/03/2014 10:29 PM, Jeff King wrote:
> Prune has to walk $GIT_DIR/objects/?? in order to find the
> set of loose objects to prune. Other parts of the code
> (e.g., count-objects) want to do the same. Let's factor it
> out into a reusable for_each-style function.
> 
> Note that this is not quite a straight code movement. There
> are two differences:
> 
>   1. The original code iterated from 0 to 256, trying to
>      opendir("$GIT_DIR/%02x"). The new code just does a
>      readdir() on the object directory, and descends into
>      any matching directories. This is faster on
>      already-pruned repositories, and should not ever be
>      slower (nobody ever creates other files in the object
>      directory).

This would change the order in which the objects are processed. I doubt that
matters to anybody, but it's probably worth mentioning in the commit
message.

>   2. The original code had strange behavior when it found a
>      file of the form "[0-9a-f]{2}/.{38}" that did _not_
>      contain all hex digits. It executed a "break" from the
>      loop, meaning that we stopped pruning in that directory
>      (but still pruned other directories!). This was
>      probably a bug; we do not want to process the file as
>      an object, but we should keep going otherwise.
> 
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> I admit the speedup in (1) almost certainly doesn't matter. It is real,
> and I found out about it while writing a different program that was
> basically "count-objects" across a large number of repositories. However
> for a single repo it's probably not big enough to matter (calling
> count-objects in a loop will get dominated by the startup costs). The
> end result is a little more obvious IMHO, but that's subjective.
> 
>  builtin/prune.c | 87 ++++++++++++++++------------------------------------
>  cache.h         | 31 +++++++++++++++++++
>  sha1_file.c     | 95 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 152 insertions(+), 61 deletions(-)
> 
> [...]
> diff --git a/cache.h b/cache.h
> index cd16e25..7abe7f6 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1239,6 +1239,37 @@ extern unsigned long unpack_object_header_buffer(const unsigned char *buf, unsig
>  extern unsigned long get_size_from_delta(struct packed_git *, struct pack_window **, off_t);
>  extern int unpack_object_header(struct packed_git *, struct pack_window **, off_t *, unsigned long *);
>  
> +/*
> + * Iterate over the files in the loose-object parts of the object
> + * directory "path", triggering the following callbacks:
> + *
> + *  - loose_object is called for each loose object we find.
> + *
> + *  - loose_cruft is called for any files that do not appear to be
> + *    loose objects.
> + *
> + *  - loose_subdir is called for each top-level hashed subdirectory
> + *    of the object directory (e.g., "$OBJDIR/f0"). It is called
> + *    after the objects in the directory are processed.
> + *
> + * Any callback that is NULL will be ignored. Callbacks returning non-zero
> + * will end the iteration.
> + */
> +typedef int each_loose_object_fn(const unsigned char *sha1,
> +				 const char *path,
> +				 void *data);
> +typedef int each_loose_cruft_fn(const char *basename,
> +				const char *path,
> +				void *data);
> +typedef int each_loose_subdir_fn(const char *basename,
> +				 const char *path,
> +				 void *data);
> +int for_each_loose_file_in_objdir(const char *path,
> +				  each_loose_object_fn obj_cb,
> +				  each_loose_cruft_fn cruft_cb,
> +				  each_loose_subdir_fn subdir_cb,
> +				  void *data);
> +
>  struct object_info {
>  	/* Request */
>  	enum object_type *typep;
> diff --git a/sha1_file.c b/sha1_file.c
> index bae1c15..9fdad47 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -3218,3 +3218,98 @@ void assert_sha1_type(const unsigned char *sha1, enum object_type expect)
>  		die("%s is not a valid '%s' object", sha1_to_hex(sha1),
>  		    typename(expect));
>  }
> +
> +static int opendir_error(const char *path)
> +{
> +	if (errno == ENOENT)
> +		return 0;
> +	return error("unable to open %s: %s", path, strerror(errno));
> +}
> +
> +static int for_each_file_in_obj_subdir(struct strbuf *path,
> +				       const char *prefix,
> +				       each_loose_object_fn obj_cb,
> +				       each_loose_cruft_fn cruft_cb,
> +				       each_loose_subdir_fn subdir_cb,
> +				       void *data)
> +{
> +	size_t baselen = path->len;
> +	DIR *dir = opendir(path->buf);
> +	struct dirent *de;
> +	int r = 0;
> +
> +	if (!dir)
> +		return opendir_error(path->buf);

OK, so if there is a non-directory named $GIT_DIR/objects/33, then we
emit an "unable to open" error rather than treating it as cruft. I think
this is reasonable.

> +
> +	while ((de = readdir(dir))) {
> +		if (is_dot_or_dotdot(de->d_name))
> +			continue;
> +
> +		strbuf_setlen(path, baselen);
> +		strbuf_addf(path, "/%s", de->d_name);
> +
> +		if (strlen(de->d_name) == 38)  {
> +			char hex[41];
> +			unsigned char sha1[20];
> +
> +			memcpy(hex, prefix, 2);
> +			memcpy(hex + 2, de->d_name, 38);
> +			hex[40] = 0;
> +			if (!get_sha1_hex(hex, sha1)) {
> +				if (obj_cb) {
> +					r = obj_cb(sha1, path->buf, data);
> +					if (r)
> +						break;
> +				}
> +				continue;
> +			}
> +		}
> +
> +		if (cruft_cb) {
> +			r = cruft_cb(de->d_name, path->buf, data);

So, files *and* directories at the $GIT_DIR/objects/XX/ level are
reported as cruft (as opposed to, say, descending into the directories
and reporting any files found deeper in the hierarchy). This seems fine,
too.

> +			if (r)
> +				break;
> +		}
> +	}
> +	if (!r && subdir_cb)
> +		r = subdir_cb(de->d_name, path->buf, data);

By my reading, path->buf still contains the name of the last file in the
directory at this point. I assume you want to pass it the original
"baselen"-length path here.

> +	closedir(dir);
> +	return r;

...and anyway, it would be more polite to restore the path strbuf to its
original length before returning.
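
Roughly like this, I mean (note that de is NULL once the loop ends
normally, so de->d_name cannot be right anyway; whether "prefix" is the
right basename to pass is just my guess):

	strbuf_setlen(path, baselen);
	if (!r && subdir_cb)
		r = subdir_cb(prefix, path->buf, data);

	closedir(dir);
	return r;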

> +}
> +
> +int for_each_loose_file_in_objdir(const char *path,
> +			    each_loose_object_fn obj_cb,
> +			    each_loose_cruft_fn cruft_cb,
> +			    each_loose_subdir_fn subdir_cb,
> +			    void *data)
> +{
> +	struct strbuf buf = STRBUF_INIT;
> +	size_t baselen;
> +	DIR *dir = opendir(path);
> +	struct dirent *de;
> +	int r = 0;
> +
> +	if (!dir)
> +		return opendir_error(path);
> +
> +	strbuf_addstr(&buf, path);
> +	baselen = buf.len;
> +
> +	while ((de = readdir(dir))) {
> +		if (!isxdigit(de->d_name[0]) ||
> +		    !isxdigit(de->d_name[1]) ||
> +		    de->d_name[2])
> +			continue;

So other files or directories at the $GIT_DIR/objects/ level are just
ignored; they are not considered cruft. This is worth clarifying in the
docstring.

> +
> +		strbuf_addf(&buf, "/%s", de->d_name);
> +		r = for_each_file_in_obj_subdir(&buf, de->d_name, obj_cb,
> +						cruft_cb, subdir_cb, data);
> +		strbuf_setlen(&buf, baselen);
> +		if (r)
> +			break;
> +	}
> +
> +	closedir(dir);
> +	strbuf_release(&buf);
> +	return r;
> +}
> 

Other than my comments above, it looks good to me.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/16] prune: keep objects reachable from recent objects
  2014-10-03 20:39 ` [PATCH 13/16] prune: keep objects reachable from recent objects Jeff King
  2014-10-03 21:47   ` Junio C Hamano
@ 2014-10-07 16:29   ` Michael Haggerty
  2014-10-08  7:19     ` Jeff King
  1 sibling, 1 reply; 58+ messages in thread
From: Michael Haggerty @ 2014-10-07 16:29 UTC (permalink / raw)
  To: Jeff King, git

On 10/03/2014 10:39 PM, Jeff King wrote:
> [...]
> Instead, this patch pushes the extra work onto prune, which
> runs less frequently (and has to look at the whole object
> graph anyway). It creates a new category of objects: objects
> which are not recent, but which are reachable from a recent
> object. We do not prune these objects, just like the
> reachable and recent ones.
> 
> This lets us avoid the recursive check above, because if we
> have an object, even if it is unreachable, we should have
> its referent:
> 
>   - if we are creating new objects, then we cannot create
>     the parent object without having the child
> 
>   - and if we are pruning objects, will not prune the child
>     if we are keeping the parent
> 
> The big exception would be if one were to write the object
> in a way that avoided referential integrity (e.g., using
> hash-object). But if you are in the habit of doing that, you
> deserve what you get.
> 
> Naively, the simplest way to implement this would be to add
> all recent objects as tips to the reachability traversal.
> However, this does not perform well. In a recently-packed
> repository, all reachable objects will also be recent, and
> therefore we have to consider each object twice (both as a
> tip, and when we reach it in the traversal). I tested this,
> and it added about 10s to a 30s prune on linux.git. This
> patch instead performs the normal reachability traversal
> first, then follows up with a second traversal for recent
> objects, skipping any that have already been marked.

I haven't read all of the old code, but if I understand correctly this
is your new algorithm:

1. Walk from all references etc., marking reachable objects.

2. Iterate over *all* objects, in no particular order, skipping the
   objects that are already known to be reachable. Use any unreachable
   object that has a recent mtime as a tip for a second traversal that
   marks all of its references as "to-keep".

3. Iterate over any objects that are not marked "to-keep". (I assume
   that this iteration is in no particular order.) For each object:

   * [Presumably] verify that its mtime is still "old"
   * If so, prune the object

I see some problems with this.

* The one that you mentioned in your cover letter, namely that prune's
  final mtime check is not atomic with the object deletion. I agree
  that this race is so much shorter than the others that we can accept
  a solution that doesn't eliminate it, so let's forget about this one.

* If the final mtime check fails, then the object is recognized as new
  and not pruned. But does that prevent its referents from being pruned?

  * When this situation is encountered, you would have to start another
    object traversal starting at the "renewed" object to mark its
    referents "to-keep". I don't see that you do this. Another, much
    less attractive alternative would be to abort the prune operation
    if this situation arises. But even if you do one of these...

  * ...assuming the iteration in step 3 is in no defined order, a
    referent might *already* have been pruned before you notice the
    "renewed" object.

So although your changes are a big improvement, it seems to me that they
still leave a race with a window approximately as long as the time it
takes to scan and prune the unreachable objects.

I think that the only way to get rid of that race is to delete objects
in parent-to-child order; in other words, *only prune an object after
all objects that refer to it have been pruned*. This could be done by
maintaining reference counts of the to-be-pruned objects and only
deleting an object once its reference count is zero.
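
As a rough illustration of the idea (a generic sketch, not tied to the
actual prune code; the struct and delete_loose_object() are made up for
the example, and a real implementation would probably use an iterative
worklist rather than recursion):

	struct prune_candidate {
		struct object *obj;
		int refcount;		/* to-be-pruned objects that refer to us */
		struct prune_candidate **referents;	/* candidates we refer to */
		int nr_referents;
	};

	static void prune_when_unreferenced(struct prune_candidate *c)
	{
		int i;

		if (c->refcount > 0)
			return;	/* a referrer remains; it will recurse into us later */
		delete_loose_object(c->obj);
		for (i = 0; i < c->nr_referents; i++) {
			if (!--c->referents[i]->refcount)
				prune_when_unreferenced(c->referents[i]);
		}
	}

Seeding would walk each candidate once to count incoming references, and
then call this on every candidate whose count starts out at zero.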


The next point that I'm confused by is what happens when a new object or
reference is created while prune is running, and the new object or
reference refers to old objects. I think when we discussed this
privately I claimed that the following freshenings would not be
necessary, but now I think that they are (sorry about that!).


Let's take the simpler case first. Suppose I run the following command
between steps 1 and 3:

    git update-ref refs/heads/newbranch $COMMIT

, where $COMMIT is a previously-unreachable object. This doesn't affect
the mtime of $COMMIT, does it? So how does prune know that it shouldn't
delete $COMMIT?

-> So ISTM that updating a reference (or any other traversal starting
point, like the index) must freshen the mtime of any object newly
referred to.


A more complicated case: suppose I create a new $COMMIT referring to an
old $TREE during step 2, *after* prune has scanned the directory that
now contains $COMMIT. (I.e., the scan in step 2 never notices $COMMIT.)
Then I create a new reference pointing at $COMMIT. (I.e., the scan in
step 1 never noticed that the reference exists.) None of this affects
the mtime of $TREE, does it? So how does prune know that it mustn't
prune $TREE?

-> It seems to me that the creation of $COMMIT has to freshen the mtime
of $TREE, so that the final mtime check in step 3 realizes that it
shouldn't prune $TREE. Or to generalize, whenever a new object is
created which refers to existing objects, the direct referents of the
new object have to have their mtimes freshened. However, when an attempt
to write a new object accidentally coincides with an object that already
exists, *that* object needs to be freshened but *its* referents do *not*.


I hope I understood that all correctly...

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-07 13:21   ` Michael Haggerty
@ 2014-10-07 17:29     ` Junio C Hamano
  2014-10-07 20:18       ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-10-07 17:29 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Jeff King, git

Michael Haggerty <mhagger@alum.mit.edu> writes:

> I don't like the three-argument version of test_eq. Wouldn't using a
> comparison operator other than "=" would be very confusing, given that
> "eq" is in the name of the function? It also doesn't look like you use
> this feature.

> An alternative direction to go would be to specialize the function for
> equality testing and delegate to test_cmp to get better output for
> failures, but optimized to avoid excess process creation in the happy path:
>
> test_eq () {
> 	if test "$1" != "$2"
> 	then
> 		printf "%s" "$1" >expect &&
> 		printf "%s" "$2" >actual &&
> 		test_cmp expect actual
> 	fi
> }
>
> (but using properly-created temporary file names).

I agree that it would be a good idea to use a randomly generated
temporary filename that does not collide, as long as we make sure
not to leave cruft in the working tree of the test and we leave the
file there when the test script is run under "-i" or "-d" option.

The above superficially looks nice; "! test_eq a b" would give
useless output under "-v", and "test_ne a b" needs to be added if
you go that route, though.

Anyway, with the version posted, you cannot do "! test_eq a b",
either, but with "test_eq a b !=", you do not have to.

	Side note. Yes, now I looked at it again, I agree that the
	three-arg form is awkward in at least two ways.  The
	operator, if we are to take one, should be infix, and the
	name "eq" no longer matches what it does when it is given an
	operator.

The function is similar to test_cmp which takes two files but takes
two strings, so "test_cmp_str" or something perhaps (we already have
test_cmp_rev to compare two revisions, and the suggested name
follows that pattern)?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-07 17:29     ` Junio C Hamano
@ 2014-10-07 20:18       ` Jeff King
  2014-10-07 20:35         ` Junio C Hamano
  2014-10-07 21:16         ` Junio C Hamano
  0 siblings, 2 replies; 58+ messages in thread
From: Jeff King @ 2014-10-07 20:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git

On Tue, Oct 07, 2014 at 10:29:59AM -0700, Junio C Hamano wrote:

> > test_eq () {
> > 	if test "$1" != "$2"
> > 	then
> > 		printf "%s" "$1" >expect &&
> > 		printf "%s" "$2" >actual &&
> > 		test_cmp expect actual
> > 	fi
> > }
> [...]
> 
> The above superficially looks nice; "! test_eq a b" would give
> useless output under "-v", and "test_ne a b" needs to be added if
> you go that route, though.

Yeah, that is why I ended up with the operator as a parameter. I modeled
after test_line_count, which faces the same problem (negation must
happen in the operator, not the full command).

> Anyway, with the version posted, you cannot do "! test_eq a b",
> either but with "test_eq a b !=", you do not have to.
> 
> 	Side note. Yes, now I looked at it again, I agree that the
> 	three-arg form is awkward in at least two ways.  The
> 	operator, if we are to take one, should be infix, and the
> 	name "eq" no longer matches what it does when it is given an
> 	operator.

I made it postfix because it's optional, and my inclination is to handle
arguments left-to-right, since that extends to multiple optional
arguments. But of course we have just one optional argument and can
simply treat 2-arg and 3-arg forms differently. However, Michael noted
that this is really just 'test "$@"', and I think we should allow any
"test" parameters.

> The function is similar to test_cmp which takes two files but takes
> two strings, so "test_cmp_str" or something perhaps (we already have
> test_cmp_rev to compare two revisions, and the suggested name
> follows that pattern)?

Based on your responses, I'm leaning towards:

  test_cmp_str() {
	test "$@" && return 0
	echo >&2 "command failed: test $*"
	return 1
  }

since the point is really just to print _something_ when the test fails
(any quoting or whitespace may be wrong, of course, but that's OK; it's
for human consumption, and is just a hint).

Maybe "str" is not right here. Michael suggested "test_test" which is
semantically what we want, but just looks silly[1]. Maybe
"test_pred" or something? "test_cond"? Or I guess going the other way,
"sane_test" or "verbose_test" or something.

I think test_cond is my favorite of those.

-Peff

[1] Of course, we could always do "test_[". :)

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-07 20:18       ` Jeff King
@ 2014-10-07 20:35         ` Junio C Hamano
  2014-10-07 21:29           ` Jeff King
  2014-10-07 21:16         ` Junio C Hamano
  1 sibling, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-10-07 20:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git

Jeff King <peff@peff.net> writes:

> On Tue, Oct 07, 2014 at 10:29:59AM -0700, Junio C Hamano wrote:
> ...
>> The function is similar to test_cmp which takes two files but takes
>> two strings, so "test_cmp_str" or something perhaps (we already have
>> test_cmp_rev to compare two revisions, and the suggested name
>> follows that pattern)?
>
> Based on your responses, I'm leaning towards:
>
>   test_cmp_str() {
> 	test "$@" && return 0
> 	echo >&2 "command failed: test $*"
> 	return 1
>   }
>
> since the point is really just to print _something_ when the test fails
> (any quoting or whitespace may be wrong, of course, but that's OK; it's
> for human consumption, and is just a hint).

Yeah, if we are going to reduce it down to the above implementation,
interesting things like "test -f $frotz" will become possible and
"cmp-str" stops making sense.  It really is about "We run test and
expect it to yield true.  Report the failure a bit more prominently
under the '-v' option to help us debug".

So among the ones you listed, test_verbose may be the least silly, I
would think.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-07 20:18       ` Jeff King
  2014-10-07 20:35         ` Junio C Hamano
@ 2014-10-07 21:16         ` Junio C Hamano
  1 sibling, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-10-07 21:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git

Jeff King <peff@peff.net> writes:

> Based on your responses, I'm leaning towards:
>
>   test_cmp_str() {
> 	test "$@" && return 0
> 	echo >&2 "command failed: test $*"
> 	return 1
>   }
>
> since the point is really just to print _something_ when the test fails
> (any quoting or whitespace may be wrong, of course, but that's OK; it's
> for human consumption, and is just a hint).

If we really cared, we could do

	echo >&2 "command failed: test $(git rev-parse --sq-quote "$@")"

perhaps?

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-07 20:35         ` Junio C Hamano
@ 2014-10-07 21:29           ` Jeff King
  2014-10-07 21:53             ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-07 21:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Michael Haggerty, git

On Tue, Oct 07, 2014 at 01:35:15PM -0700, Junio C Hamano wrote:

> Yeah, if we are going to reduce it down to the above implementation,
> interesting things like "test -f $frotz" will become possible and
> "cmp-str" stops making sense.  It really is about "We run test and
> expect it to yield true.  Report the failure a bit more prominently
> under the '-v' option to help us debug".

We already have test_path_is_file to do the same thing just for "-f". We
could in theory switch all of those to this new, more generic wrapper. I
don't know if it is worth doing a mass-conversion, but we could
discourage test_path_is_file in new tests. We could also implement
test_path_is_{dir,file} on top of this.

There is also test_path_is_missing, which would need the negated form.
We'd either need a "test_not_cond", or to allow:

  test_cond ! -e path

That is specified by POSIX. I seem to recall that we ran into
problems using it with some shells, but I note there is currently some
use of it in t5304. So perhaps it is fine.

> So among the ones you listed, test_verbose may be the least silly, I
> would think.

Somehow test_verbose seems to me like checking the "verbose" option of
the test suite. I prefer "test_cond", but I do not feel too strongly, if
you want to override me.

> > (any quoting or whitespace may be wrong, of course, but that's OK; it's
> > for human consumption, and is just a hint).
> 
> If we really cared, we could do
> 
> 	echo >&2 "command failed: test $(git rev-parse --sq-quote "$@")"
> 
> perhaps?

Yeah, that would work. I am always a little hesitant sticking git
commands into our test infrastructure, since we may end up masking
errors due to our own bug. But we can probably rely on --sq-quote
working sanely (and anyway, we're not even affecting the test outcome
here).

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-07 21:29           ` Jeff King
@ 2014-10-07 21:53             ` Junio C Hamano
  2014-10-07 22:17               ` Michael Haggerty
  0 siblings, 1 reply; 58+ messages in thread
From: Junio C Hamano @ 2014-10-07 21:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git

Jeff King <peff@peff.net> writes:

> On Tue, Oct 07, 2014 at 01:35:15PM -0700, Junio C Hamano wrote:
>
>> Yeah, if we are going to reduce it down to the above implementation,
>> interesting things like "test -f $frotz" will become possible and
>> "cmp-str" stops making sense.  It really is about "We run test and
>> expect it to yield true.  Report the failure a bit more prominently
>> under the '-v' option to help us debug".
>
> We already have test_path_is_file to do the same thing just for "-f". We
> could in theory switch all of those to this new, more generic wrapper. I
> don't know if it is worth doing a mass-conversion, but we could
> discourage test_path_is_file in new tests. We could also implement
> test_path_is_{dir,file} on top of this.

Oh, I wasn't going in that direction when I mentioned "-f"; I just
wanted to say that 'test "$@"' is clearly about 'test' (/bin/test or
shell built-in) and less about 'compare string'.  I do not think it
is necessarily a good direction to go to replace test_path_is_file
with the test_cond thing; after all, type-specific tests have a chance
to report breakage of expectation in type-specific ways, e.g.

	test_path_is_file () {
		test -f "$1" && return 0
		echo >&2 "expected '$1' to be a file"
		if test -e "$1"
		then
			echo >&2 "but it is a non-file"
			ls >&2 -ld "$1"
		else
			echo >&2 "but it is missing"
		fi
		return 1
	}

But that is also just in theory ;-).

>> So among the ones you listed, test_verbose may be the least silly, I
>> would think.
>
> Somehow test_verbose seems to me like checking the "verbose" option of
> the test suite. I prefer "test_cond", but I do not feel too strongly, if
> you want to override me.

Hmph, your 'test' in that name is a generic verb "we check that...",
which I think aligns better with the other test_foo functions.  When
I suggested 'test_verbose', 'test' in that name was specifically
meant to refer to the 'test' command.

Still "test_cond" feels somewhat funny, as "we check that..." always
validates some condition, but I don't think of anything better X-<.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-07 21:53             ` Junio C Hamano
@ 2014-10-07 22:17               ` Michael Haggerty
  2014-10-08  1:13                 ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Michael Haggerty @ 2014-10-07 22:17 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: git

On 10/07/2014 11:53 PM, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
>> On Tue, Oct 07, 2014 at 01:35:15PM -0700, Junio C Hamano wrote:
>>
>>> Yeah, if we are going to reduce it down to the above implementation,
>>> interesting things like "test -f $frotz" will become possible and
>>> "cmp-str" stops making sense.  It really is about "We run test and
>>> expect it to yield true.  Report the failure a bit more prominently
>>> under the '-v' option to help us debug".
>>
>> We already have test_path_is_file to do the same thing just for "-f". We
>> could in theory switch all of those to this new, more generic wrapper. I
>> don't know if it is worth doing a mass-conversion, but we could
>> discourage test_path_is_file in new tests. We could also implement
>> test_path_is_{dir,file} on top of this.
> 
> Oh, I wasn't going in that direction when I mentioned "-f"; I just
> wanted to say that 'test "$@"' is clearly about 'test' (/bin/test or
> shell built-in) and less about 'compare string'.  I do not think it
> is necessarily a good direction to go to replace test_path_is_file
> with the test_cond thing; after all, type-specific tests have a chance
> to report breakage of expectation in type-specific ways, e.g.
> 
> 	test_path_is_file () {
> 		test -f "$1" && return 0
> 		echo >&2 "expected '$1' to be a file"
> 		if test -e "$1"
> 		then
> 			echo >&2 "but it is a non-file"
> 			ls >&2 -ld "$1"
> 		else
> 			echo >&2 "but it is missing"
> 		fi
> 		return 1
> 	}
> 
> But that is also just in theory ;-).
> 
>>> So among the ones you listed, test_verbose may be the least silly, I
>>> would think.
>>
>> Somehow test_verbose seems to me like checking the "verbose" option of
>> the test suite. I prefer "test_cond", but I do not feel too strongly, if
>> you want to override me.
> 
> Hmph, your 'test' in that name is a generic verb "we check that...",
> which I think aligns better with the other test_foo functions.  When
> I suggested 'test_verbose', 'test' in that name was specifically
> meant to refer to the 'test' command.
> 
> Still "test_cond" feels somewhat funny, as "we check that..." always
> validates some condition, but I don't think of anything better X-<.

I like "verbose_test $foo = $bar" because it puts the word "test" next
to the condition, where the built-in command "test" would otherwise be.

We could even define a command

	verbose () {
		"$@" && return 0
		echo >&2 "command failed: $*"
		return 1
	}

and use it like

	verbose test $foo = $bar

Somehow I feel like I'm reinventing something that must already exist...

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-07 22:17               ` Michael Haggerty
@ 2014-10-08  1:13                 ` Jeff King
  2014-10-08 16:58                   ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-08  1:13 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: Junio C Hamano, git

On Wed, Oct 08, 2014 at 12:17:07AM +0200, Michael Haggerty wrote:

> On 10/07/2014 11:53 PM, Junio C Hamano wrote:
> > Hmph, your 'test' in that name is a generic verb "we check that...",
> > which I think aligns better with the other test_foo functions.  When
> > I suggested 'test_verbose', 'test' in that name was specifically
> > meant to refer to the 'test' command.

I actually meant "test" as a namespace to indicate it is part of the
test suite (just like "test_seq" is not testing anything). I think that
is why the names are so silly. We are using the "test" command in our
"test" suite to "test" some conditions.

> I like "verbose_test $foo = $bar" because it puts the word "test" next
> to the condition, where the built-in command "test" would otherwise be.
> 
> We could even define a command
> 
> 	verbose () {
> 		"$@" && return 0
> 		echo >&2 "command failed: $*"
> 		return 1
> 	}
> 
> and use it like
> 
> 	verbose test $foo = $bar

I kind of like this. It is easy to see which shell command is being
invoked, and it would extend naturally to other silent commands.

> Somehow I feel like I'm reinventing something that must already exist...

Yes, we're basically reinventing "set -x" here, with the caveat that we
only _really_ care about showing failed commands. The problem with "set
-x" is that it also wants to apply itself to the test harness itself, so
you end up with a lot of cruft.

Below is my best attempt at keeping the cruft to a minimum. Here's
sample output using "-v" (the commands are all supplied by dummy
aliases):

    expecting success: 
            do_some_thing &&
            test "$(inspect_some_thing)" = "expected" &&
            do_some_other_thing
    
    + do_some_thing
    + echo doing some thing...
    doing some thing...
    + inspect_some_thing
    + echo foo
    + test foo = expected
    + eval_ret=1
    + set +x
    not ok 1 - experiment with set -x

It's not _too_ bad, because we turn on "set -x" for just the test eval
(mostly). But the rough edges are:

  1. We are stuck with the "eval_ret = 1; set +x" cleanup at the end. I
     don't think there's any way around that without a subshell, and
     many tests will not work in a subshell (they set environment
     variables they expect to persist).

  2. There's nothing highlighting the failed code. You just have to know
     that it was the last thing before the eval_ret call. We can set PS4
     to show $?, or even do something complicated like:

       PS4='+ $(test $? = 0 || say_color error "^^^ failure; ")'

     though note that running commands via PS4 works in bash, but
     causes dash to go into an infinite loop. :)

     But even with that, the output is still not great.

  3. The "-x" continues into any shell functions, which are hard to
     read. For example, notice above that we walked into the
     "do_some_thing" function and showed its implementation. Now imagine
     doing that for complicated test_* helpers.

So it's not great. The upside is that it Just Works everywhere without
even having to modify the tests themselves. I admit I have sometimes
used "sh -x" to debug a test script, but it is usually a giant pain due
to the verbosity of the harness code. The patch below cuts out _most_ of
that because at least we just use it during the eval, but I'm still not
sure it's a good default for "-v" (we could add it as "-vv" or
something, though, if others find it useful).

---
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 82095e3..af51868 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -517,10 +517,30 @@ maybe_setup_valgrind () {
 	fi
 }
 
+# This is a separate function because some tests use
+# "return" to end a test_expect_success block early
+# (and we want to make sure we do our "set +x" cleanup).
+test_eval_inner_2 () {
+	# We do the "set -x" inside the eval because we
+	# want to keep it as close to the actual test
+	# commands as possible to avoid harness cruft.
+	eval "set -x; $*"
+}
+
+# All of the "set +x" cleanup has to happen inside
+# here, because the output is redirected (otherwise
+# we leak "set -x" lines to stderr in non-verbose mode).
+test_eval_inner_1 () {
+	test_eval_inner_2 "$@"
+	eval_ret=$?
+	set +x
+	return $eval_ret
+}
+
+# This wrapper exists just to keep the I/O redirect
+# factored out into a single place.
 test_eval_ () {
-	# This is a separate function because some tests use
-	# "return" to end a test_expect_success block early.
-	eval </dev/null >&3 2>&4 "$*"
+	test_eval_inner_1 "$@" </dev/null >&3 2>&4
 }
 
 test_run_ () {
@@ -531,7 +551,10 @@ test_run_ () {
 	eval_ret=$?
 	teardown_malloc_check
 
-	if test -z "$immediate" || test $eval_ret = 0 || test -n "$expecting_failure"
+	# We avoid running a straight ":" because it is a noop, and it
+	# pollutes our "set -x" output.
+	if test -z "$immediate" || test $eval_ret = 0 ||
+	   test -n "$expecting_failure" && test "$test_cleanup" != ":"
 	then
 		setup_malloc_check
 		test_eval_ "$test_cleanup"

^ permalink raw reply related	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/16] prune: keep objects reachable from recent objects
  2014-10-07 16:29   ` Michael Haggerty
@ 2014-10-08  7:19     ` Jeff King
  2014-10-08 10:37       ` Michael Haggerty
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-08  7:19 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

On Tue, Oct 07, 2014 at 06:29:00PM +0200, Michael Haggerty wrote:

> I haven't read all of the old code, but if I understand correctly this
> is your new algorithm:
> 
> 1. Walk from all references etc., marking reachable objects.
> 
> 2. Iterate over *all* objects, in no particular order, skipping the
>    objects that are already known to be reachable. Use any unreachable
>    object that has a recent mtime as a tip for a second traversal that
>    marks all of its references as "to-keep".
> 
> 3. Iterate over any objects that are not marked "to-keep". (I assume
>    that this iteration is in no particular order.) For each object:
> 
>    * [Presumably] verify that its mtime is still "old"
>    * If so, prune the object

Yes, that's more or less accurate. The iteration is in readdir() order
on the filesystem.

We do verify that the mtime is still "old" in the final iteration, but
that is mostly because the existing check was left in. In theory any
recent objects would have been caught in step 2 and marked as "to-keep"
already. Anything we find in step 3 would have to have been racily
created or freshened.

> I see some problems with this.
> 
> * The one that you mentioned in your cover letter, namely that prune's
>   final mtime check is not atomic with the object deletion. I agree
>   that this race is so much shorter than the others that we can accept
>   a solution that doesn't eliminate it, so let's forget about this one.

Right, I don't see an easy way around this.

> * If the final mtime check fails, then the object is recognized as new
>   and not pruned. But does that prevent its referents from being pruned?
> 
>   * When this situation is encountered, you would have to start another
>     object traversal starting at the "renewed" object to mark its
>     referents "to-keep". I don't see that you do this. Another, much
>     less attractive alternative would be to abort the prune operation
>     if this situation arises. But even if you do one of these...
> 
>   * ...assuming the iteration in step 3 is in no defined order, a
>     referent might *already* have been pruned before you notice the
>     "renewed" object.
> 
> So although your changes are a big improvement, it seems to me that they
> still leave a race with a window approximately as long as the time it
> takes to scan and prune the unreachable objects.

Correct. There is a delay between marking objects and deleting them.
This goes for both the existing reachability checks and the new "recent
reachability" check.

As noted above, if we see a fresh object in the final check, then we
know that it was newly freshened (or created). We could then do an
additional traversal with it as the tip, to get a slightly more accurate
view of the world.

The obvious problem as you note is that we may already have deleted its
referents. But let's leave that aside for a moment.

But even if we fix that problem, I don't think traversing again can
eliminate the race. Another process may be freshening or creating
objects after we have processed them (or their containing directories).
So you can catch _some_ cases by re-traversing, but there will always be
cases where we delete an object that has just now become recent (or
reachable, for that matter, if somebody updates the refs).

The obvious solution is to have some atomic view of the object and ref
namespace (i.e., a global write lock that says "I'm pruning, nobody
write"). But that sucks for obvious reasons. I feel like there must be
some more clever solution, but it eludes me. Surely this is something
database people solved 30 years ago. :)

> I think that the only way to get rid of that race is to delete objects
> in parent-to-child order; in other words, *only prune an object after
> all objects that refer to it have been pruned*. This could be done by
> maintaining reference counts of the to-be-pruned objects and only
> deleting an object once its reference count is zero.

Yes, if you are going to traverse again, you would want to delete in
parent-child order. I'm not convinced that traversing again is worth it;
it's trying to shorten the window, but it can't eliminate it. And my
goal here was never to eliminate the race. It was to keep races to
"simultaneous reference and prune is a problem", and not "ongoing
unbounded operations and simultaneous prune are a problem". And I do not
claim to eliminate the possibility of referents going missing; only to
try to close some obvious and easy holes where it happens.

> Let's take the simpler case first. Suppose I run the following command
> between steps 1 and 3:
> 
>     git update-ref refs/heads/newbranch $COMMIT
> 
> , where $COMMIT is a previously-unreachable object. This doesn't affect
> the mtime of $COMMIT, does it? So how does prune know that it shouldn't
> delete $COMMIT?
> 
> -> So ISTM that updating a reference (or any other traversal starting
> point, like the index) must freshen the mtime of any object newly
> referred to.

_If_ the deletion of the object and the checking of its mtime were
atomic, that would be useful to do. But it's not. Before my patch, you
have one way of "saving" the object (and its referents): making it
reachable from a ref. After my patch, you have the additional option of
updating its mtime.

But why bother with the mtime? You can just make it reachable by
updating the ref. Both are racy, but we cannot help that, so one is as
good as the other.

> A more complicated case: suppose I create a new $COMMIT referring to an
> old $TREE during step 2, *after* prune has scanned the directory that
> now contains $COMMIT. (I.e., the scan in step 2 never notices $COMMIT.)
> Then I create a new reference pointing at $COMMIT. (I.e., the scan in
> step 1 never noticed that the reference exists.) None of this affects
> the mtime of $TREE, does it? So how does prune know that it mustn't
> prune $TREE?

It doesn't, and you are screwed. :)

You could freshen the referents here, but you are still racy. Just as
you might miss the creation of $COMMIT, you might miss the freshening of
$TREE and delete it.

Making the mtimes race-free requires an atomic check-timestamp-and-delete.
And without that, I'm not sure that shortening the race from 50 system
calls to 3 system calls is worth the additional complexity. If we had
such an atomic operation, even on only a subset of systems, it might
be worth it. But I do not know of any filesystem or system call that can
do so.

> I hope I understood that all correctly...

I think your analysis is all correct. The open question is whether it's
worth trying to shrink the last bits of raciness or not (or even whether
there is a clever way of eliminating them that I haven't considered).

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 09/16] prune: factor out loose-object directory traversal
  2014-10-07 14:07   ` Michael Haggerty
@ 2014-10-08  7:33     ` Jeff King
  0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-08  7:33 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

On Tue, Oct 07, 2014 at 04:07:52PM +0200, Michael Haggerty wrote:

> On 10/03/2014 10:29 PM, Jeff King wrote:
> > Prune has to walk $GIT_DIR/objects/?? in order to find the
> > set of loose objects to prune. Other parts of the code
> > (e.g., count-objects) want to do the same. Let's factor it
> > out into a reusable for_each-style function.
> > 
> > Note that this is not quite a straight code movement. There
> > are two differences:
> > 
> >   1. The original code iterated from 0 to 256, trying to
> >      opendir("$GIT_DIR/%02x"). The new code just does a
> >      readdir() on the object directory, and descends into
> >      any matching directories. This is faster on
> >      already-pruned repositories, and should not ever be
> >      slower (nobody ever creates other files in the object
> >      directory).
> 
> This would change the order that the objects are processed. I doubt that
> matters to anybody, but it's probably worth mentioning in the commit
> message.

Yeah, I don't think it matters, but I'll mention it.

> > +	if (!dir)
> > +		return opendir_error(path->buf);
> 
> OK, so if there is a non-directory named $GIT_DIR/objects/33, then we
> emit an "unable to open" error rather than treating it as cruft. I think
> this is reasonable.

Correct. The original "prune" silently ignored this case, but I think
it makes sense to complain about oddities. We must treat ENOENT
as a noop, even though the name came from readdir() and the directory
therefore ought to exist. A simultaneous prune might have deleted it (hopefully
nobody is insane enough to run two prunes at once, but ignoring
directories that went away seems like the only sane behavior to me).
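
In other words, the opendir step boils down to something like this
(illustrative only, not the literal patch text):

	dir = opendir(path->buf);
	if (!dir) {
		if (errno == ENOENT)
			return 0; /* a racing prune removed it; nothing to do */
		return error("unable to open %s: %s",
			     path->buf, strerror(errno));
	}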

> > +		if (cruft_cb) {
> > +			r = cruft_cb(de->d_name, path->buf, data);
> 
> So, files *and* directories at the $GIT_DIR/objects/XX/ level are
> reported as cruft (as opposed to, say, descending into the directories
> and reporting any files found deeper in the hierarchy). This seems fine,
> too.

Yes, this matches the original prune behavior (and anyway, this is the
object database; anything that is not a loose object is cruft).

> > +			if (r)
> > +				break;
> > +		}
> > +	}
> > +	if (!r && subdir_cb)
> > +		r = subdir_cb(de->d_name, path->buf, data);
> 
> By my reading, path->buf still contains the name of the last file in the
> directory at this point. I assume you want to pass it the original
> "baselen"-length path here.

Ack, good catch. This was originally in the outer for_each_loose_file
loop, but I thought it was clearer to handle it in the subdir
function. Not only is path->buf wrong at that point, but de->d_name is
totally bogus, too.

Will fix in the re-roll.

> > +int for_each_loose_file_in_objdir(const char *path,
> [...]
> So other files or directories at the $GIT_DIR/objects/ level are just
> ignored; they are not considered cruft. This is worth clarifying in the
> docstring.

I tried to clarify that by indicating that we iterated only over the
"loose-object parts of the object directory". I guess that needs a more
clear definition.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 03/16] object_array: factor out slopbuf-freeing logic
  2014-10-07 11:25   ` Michael Haggerty
@ 2014-10-08  7:36     ` Jeff King
  2014-10-08  8:40       ` Michael Haggerty
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-08  7:36 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

On Tue, Oct 07, 2014 at 01:25:54PM +0200, Michael Haggerty wrote:

> > +static void object_array_release_entry(struct object_array_entry *ent)
> > +{
> > +	if (ent->name != object_array_slopbuf)
> > +		free(ent->name);
> > +}
> > +
> 
> Would it be a little safer to set ent->name to NULL or to
> object_array_slopbuf after freeing the memory, to prevent accidents?

I considered that, but what about the other parts of object_array_entry?
Should we NULL the object context pointers, too?

The intent of this function is freeing memory, not clearing it for sane
reuse.  I think I'd be more in favor of a comment clarifying that. It is
a static function used only internally by the object-array code.
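
Something along the lines of:

	/*
	 * Free the memory associated with the entry, but do not
	 * reinitialize it for further use; callers are expected to
	 * discard the entry (or the whole array) afterwards.
	 */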

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/16] make prune mtime-checking more careful
  2014-10-06  1:42   ` Jeff King
@ 2014-10-08  8:31     ` Jeff King
  2014-10-08 17:03       ` Junio C Hamano
  0 siblings, 1 reply; 58+ messages in thread
From: Jeff King @ 2014-10-08  8:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: René Scharfe, git, Michael Haggerty

On Sun, Oct 05, 2014 at 09:42:49PM -0400, Jeff King wrote:

> On Sat, Oct 04, 2014 at 03:22:10PM -0700, Junio C Hamano wrote:
> 
> > This applied on top of 'maint' (which does have c40fdd01) makes the
> > test #9 (prune: do not prune detached HEAD with no reflog) fail.
> 
> I'll fix the bone-headed error returns that René noticed and double
> check that they were the complete culprit in the test failure you saw
> (and not just masking some other problem).

OK, I figured out what is going on. The short of it is that yes, it's
the return-value bug René noticed. Read on only if you are really
interested.  :)

The test runs "git prune" and intends to check that the detached HEAD
commit is not in the list. It does so by checking the whole output of
"git prune -n", which it expects to be empty (and it generally is,
because we've gc'd in previous tests and all objects were either packed
or pruned).

However, when the test fails, there is a pruned object. But it is not
the HEAD commit that we just wrote, but rather the tree that it points
to. When we run "git commit", it tries to write out the tree with
write_sha1_file. This in turn calls freshen_packed_object to check
whether we have the object packed. We do, but the return-value bug makes
us think we do not. As a result, we write out an extra copy of the tree
as a loose object, and that is what gets pruned (and not by prune's
normal is-it-old-and-unreachable code, but by its call to prune_packed).

So that explains that bug (as a side note, you might think that if we
are flipping return values, lots of things would fail when they ask "do
we have this packed object" and it erroneously says "yes". But that does
not happen. The wrong return value comes from freshening the file, so we
only flip "yes" to "no", and never the opposite).

> > If we merge 'dt/cache-tree-repair' (which in turn needs
> > 'jc/reopen-lock-file') to 'maint' and then apply these on top, the
> > said test passes.  But I do not see an apparent reason why X-<.

When dt/cache-tree-repair is merged, we have a valid cache tree when we
run "git commit", and we realize that we do not need to write out the
tree object at all. Thus we never hit the buggy code, the object isn't
created, and the subsequent prune reports that there is nothing to
prune.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 03/16] object_array: factor out slopbuf-freeing logic
  2014-10-08  7:36     ` Jeff King
@ 2014-10-08  8:40       ` Michael Haggerty
  2014-10-08  8:55         ` Jeff King
  0 siblings, 1 reply; 58+ messages in thread
From: Michael Haggerty @ 2014-10-08  8:40 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On 10/08/2014 09:36 AM, Jeff King wrote:
> On Tue, Oct 07, 2014 at 01:25:54PM +0200, Michael Haggerty wrote:
> 
>>> +static void object_array_release_entry(struct object_array_entry *ent)
>>> +{
>>> +	if (ent->name != object_array_slopbuf)
>>> +		free(ent->name);
>>> +}
>>> +
>>
>> Would it be a little safer to set ent->name to NULL or to
>> object_array_slopbuf after freeing the memory, to prevent accidents?
> 
> I considered that, but what about the other parts of object_array_entry?
> Should we NULL the object context pointers, too?
> 
> The intent of this function is freeing memory, not clearing it for sane
> reuse.  I think I'd be more in favor of a comment clarifying that. It is
> a static function used only internally by the object-array code.

I guess the name reminded me of strbuf_release(), which returns the
strbuf to its newly-initialized state (contrary to what api-strbuf.txt
says, I just noticed). You're right that your function does no such
thing, so it is self-consistent for it not to set ent->name to NULL.

But maybe its name could be chosen better? Let's see if there is a
consensus naming policy for functions that free resources. I grepped for
short functions calling free() and visually inspected a bunch of them.

Functions *_release():

* strbuf_release(), range_set_release(), and diff_ranges_release()
completely reinitialize their arguments

* window_release() appears not to


Functions *_clear() and clear_*():

* All *_clear() functions that I looked at (e.g., argv_array_clear(),
clear_image(), credential_clear(), clear_exclude_list(),
signature_check_clear(), and clear_prio_queue()) completely reinitialize
their arguments


Functions *_free() and free_*():

* Almost all of these free their arguments plus anything that their
arguments point at.

* Confusingly, free_ref_list() and free_pathspec() don't free their
arguments, but rather only the things that their arguments point at.
(Perhaps they should be renamed.)


So while three out of four *_release() functions completely reinitialize
their arguments, there is one that doesn't. And I couldn't find enough
other functions that just free referenced memory without reinitializing
their whole argument to establish a naming pattern. So I guess your
function name is OK too.

So forget I said anything :-)

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 03/16] object_array: factor out slopbuf-freeing logic
  2014-10-08  8:40       ` Michael Haggerty
@ 2014-10-08  8:55         ` Jeff King
  0 siblings, 0 replies; 58+ messages in thread
From: Jeff King @ 2014-10-08  8:55 UTC (permalink / raw)
  To: Michael Haggerty; +Cc: git

On Wed, Oct 08, 2014 at 10:40:03AM +0200, Michael Haggerty wrote:

> > The intent of this function is freeing memory, not clearing it for sane
> > reuse.  I think I'd be more in favor of a comment clarifying that. It is
> > a static function used only internally by the object-array code.
> 
> I guess the name reminded me of strbuf_release(), which returns the
> strbuf to its newly-initialized state (contrary to what api-strbuf.txt
> says, I just noticed). You're right that your function does no such
> thing, so it is self-consistent for it not to set ent->name to NULL.

Yeah, I had the same thought while writing it (and ended up with the
same analysis you do below).

> Functions *_clear() and clear_*():

I think these ones very clearly are about reinitializing to empty (and
it looks like we follow that rule, which is good).

If we were designing it now, I think strbuf_release() should probably be
called strbuf_clear(). Or maybe that would be too confusing, as it might
imply it is the same thing as strbuf_reset(). Yeesh. Naming is hard.

> Functions *_free() and free_*():
> 
> * Almost all of these free their arguments plus anything that their
> arguments point at.

Yes, that's the rule I think we try to follow.

> * Confusingly, free_ref_list() and free_pathspec() don't free their
> arguments, but rather only the things that their arguments point at.
> (Perhaps they should be renamed.)

Yeah, I would almost say free_pathspec should be called clear_pathspec.
Except it _only_ NULLs the array. It leaves "nr" set, which means that
anybody looking at it will still dereference a bogus pointer (but at
least it's NULL and not freed memory!).
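
That is, a (hypothetical) caller doing:

	free_pathspec(&pathspec);
	for (i = 0; i < pathspec.nr; i++)
		use(pathspec.items[i]);	/* items is now NULL... */

crashes on a NULL dereference instead of quietly reading freed memory.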

The free_ref_list() function is on my todo list to get rid of as part of
the for-each-ref/branch/tag merger I'd like to do. But somehow that
keeps slipping further down my todo list rather than actually getting
finished. :(

> So while three out of four *_release() functions completely reinitialize
> their arguments, there is one that doesn't. And I couldn't find enough
> other functions that just free referenced memory without reinitializing
> their whole argument to establish a naming pattern. So I guess your
> function name is OK too.

I'm open to suggestions for totally new names for this concept (free
associated memory, do not reinitialize, but do not free the passed
pointer). But in the absence of one, I think release() is the least-bad.

-Peff

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 13/16] prune: keep objects reachable from recent objects
  2014-10-08  7:19     ` Jeff King
@ 2014-10-08 10:37       ` Michael Haggerty
  0 siblings, 0 replies; 58+ messages in thread
From: Michael Haggerty @ 2014-10-08 10:37 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On 10/08/2014 09:19 AM, Jeff King wrote:
> On Tue, Oct 07, 2014 at 06:29:00PM +0200, Michael Haggerty wrote:
> 
>> I haven't read all of the old code, but if I understand correctly this
>> is your new algorithm:
>>
>> 1. Walk from all references etc., marking reachable objects.
>>
>> 2. Iterate over *all* objects, in no particular order, skipping the
>>    objects that are already known to be reachable. Use any unreachable
>>    object that has a recent mtime as a tip for a second traversal that
>>    marks all of its references as "to-keep".
>>
>> 3. Iterate over any objects that are not marked "to-keep". (I assume
>>    that this iteration is in no particular order.) For each object:
>>
>>    * [Presumably] verify that its mtime is still "old"
>>    * If so, prune the object
> 
> Yes, that's more or less accurate. The iteration is in readdir() order
> on the filesystem.
> 
> We do verify that the mtime is still "old" in the final iteration, but
> that is mostly because the existing check was left in. In theory any
> recent objects would have been caught in step 2 and marked as "to-keep"
> already. Anything we find in step 3 would have to have been racily
> created or freshened.

I *like* the mtime check in step 3 and think it should be kept.

>> I see some problems with this.
>>
>> * The one that you mentioned in your cover letter, namely that prune's
>>   final mtime check is not atomic with the object deletion. I agree
>>   that this race is so much shorter than the others that we can accept
>>   a solution that doesn't eliminate it, so let's forget about this one.
> 
> Right, I don't see an easy way around this.

I had an idea to mostly get around it; see below.

>> * If the final mtime check fails, then the object is recognized as new
>>   and not pruned. But does that prevent its referents from being pruned?
>>
>>   * When this situation is encountered, you would have to start another
>>     object traversal starting at the "renewed" object to mark its
>>     referents "to-keep". I don't see that you do this. Another, much
>>     less attractive alternative would be to abort the prune operation
>>     if this situation arises. But even if you do one of these...
>>
>>   * ...assuming the iteration in step 3 is in no defined order, a
>>     referent might *already* have been pruned before you notice the
>>     "renewed" object.
>>
>> So although your changes are a big improvement, it seems to me that they
>> still leave a race with a window approximately as long as the time it
>> takes to scan and prune the unreachable objects.
> 
> Correct. There is a delay between marking objects and deleting them.
> This goes for both the existing reachability checks and the new "recent
> reachability" check.
> 
> As noted above, if we see a fresh object in the final check, then we
> know that it was newly freshened (or created). We could then do an
> additional traversal with it as the tip, to get a slightly more accurate
> view of the world.
> 
> The obvious problem as you note is that we may already have deleted its
> referents. But let's leave that aside for a moment.
> 
> But even if we fix that problem, I don't think traversing again can
> eliminate the race. Another process may be freshening or creating
> objects after we have processed them (or their containing directories).

...then you would traverse again when you discovered an object freshened
by *that* process.

Please note that these would not be full traversals; you would only have
to traverse starting at the newly freshened object(s), and the traversal
could stop when it hit objects that are already known to be "to-keep".
So they should be quick. (But the length of the residual race wouldn't
depend on their being quick.)

> So you can catch _some_ cases by re-traversing, but there will always be
> cases where we delete an object that has just now become recent (or
> reachable, for that matter, if somebody updates the refs).
> 
> The obvious solution is to have some atomic view of the object and ref
> namespace (i.e., a global write lock that says "I'm pruning, nobody
> write"). But that sucks for obvious reasons. I feel like there must be
> some more clever solution, but it eludes me. Surely this is something
> database people solved 30 years ago. :)
> 
>> I think that the only way to get rid of that race is to delete objects
>> in parent-to-child order; in other words, *only prune an object after
>> all objects that refer to it have been pruned*. This could be done by
>> maintaining reference counts of the to-be-pruned objects and only
>> deleting an object once its reference count is zero.
> 
> Yes, if you are going to traverse again, you would want to delete in
> parent-child order. I'm not convinced that traversing again is worth it;
> it's trying to shorten the window, but it can't eliminate it. And my
> goal here was never to eliminate the race. It was to keep races to
> "simultaneous reference and prune is a problem", and not "ongoing
> unbounded operations and simultaneous prune are a problem". And I do not
> claim to eliminate the possibility of referents going missing; only to
> try to close some obvious and easy holes where it happens.
> 
>> Let's take the simpler case first. Suppose I run the following command
>> between steps 1 and 3:
>>
>>     git update-ref refs/heads/newbranch $COMMIT
>>
>> , where $COMMIT is a previously-unreachable object. This doesn't affect
>> the mtime of $COMMIT, does it? So how does prune know that it shouldn't
>> delete $COMMIT?
>>
>> -> So ISTM that updating a reference (or any other traversal starting
>> point, like the index) must freshen the mtime of any object newly
>> referred to.
> 
> _If_ the deletion of the object and the checking of its mtime were
> atomic, that would be useful to do. But it's not. Before my patch, you
> have one way of "saving" the object (and its referents): making it
> reachable from a ref. After my patch, you have the additional option of
> updating its mtime.
> 
> But why bother with the mtime? You can just make it reachable by
> updating the ref. Both are racy, but we cannot help that, so one is as
> good as the other.

Yes, but the race between the time prune starts reading the references
and the time that it scans all objects to find the ones that were not
marked reachable can be quite long--many seconds for a big repo. During
this whole time, if somebody creates a new reference that refers to a
previously-unreachable object, then prune will corrupt the repository.

On the other hand, if creating a new reference freshens the referent,
then the only race window is the one between prune's final check of the
object's mtime and its deletion of the file, which is only the time for
two consecutive system calls.

This is an enormous difference, and one is definitely not as good as the
other unless perfection is considered the only metric of success.

>> A more complicated case: suppose I create a new $COMMIT referring to an
>> old $TREE during step 2, *after* prune has scanned the directory that
>> now contains $COMMIT. (I.e., the scan in step 2 never notices $COMMIT.)
>> Then I create a new reference pointing at $COMMIT. (I.e., the scan in
>> step 1 never noticed that the reference exists.) None of this affects
>> the mtime of $TREE, does it? So how does prune know that it mustn't
>> prune $TREE?
> 
> It doesn't, and you are screwed. :)
> 
> You could freshen the referents here, but you are still racy. Just as
> you might miss the creation of $COMMIT, you might miss the freshening of
> $TREE and delete it.

But again, the race window for missing the freshening of $TREE would
only be the time of two consecutive system calls.

> Making the mtimes race-free requires an atomic check-timestamp-and-delete.
> And without that, I'm not sure that shortening the race from 50 system
> calls to 3 system calls is worth the additional complexity. If we had
> such an atomic operation, even on only a subset of systems, it might
> be worth it. But I do not know of any filesystem or system call that can
> do so.

If it were only 50 consecutive system calls, that would be one thing.
But can't the full object traversal take many seconds on a large repo?
That still seems like a pretty big race to me.


I think I found a way to make the final check-timestamp-and-delete safe,
albeit leaving open the possibility that the object is briefly moved to
a different filename. I don't think it's *necessary*, but I think it's
*possible*:

Any process that wants to create a new reference to an existing object
freshens the corresponding file as follows:

1. Write the file (if necessary)

2. Update the file's mtime.

   * If the mtime update succeeds, your object is safe against being
     pruned (though it might be renamed for a moment; see below)

   * If the mtime update fails, treat it as if the object had never
     been there (e.g., report an error or write a new copy of the
     object). This is the same situation as if the object had been
     pruned just before you sought it.

Prune, when it is pretty sure that it wants to delete a file in step 3,
does the following:

1. Check mtime to make sure the file is still stale (this is not
   strictly necessary, but prevents an object that has been freshened
   since the traversal in step 2 from being temporarily renamed).

2. Rename the file to "<filename>.to-be-deleted" (or, possibly, using
   the filename "<filename>.lock" would have good side-effects).

3. Check the mtime again to make sure the file is still stale.

   * If yes, then delete the object permanently

   * If no, then rename it back to its original name

4. If either of the staleness checks failed, then do an object
   traversal to mark this object and any of its (direct or indirect)
   referents "to-keep".

I think that this procedure, along with a change to deleting objects in
referrer-then-referent order, would guarantee that prune can never
permanently delete any object that is still in use, though it might
move such a file to a different filename for a moment.
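
The prune side of that, spelled out as a (completely untested) sketch,
with is_stale() and keep_with_referents() standing in for whatever
staleness check and traversal prune already has:

	static void careful_prune(struct strbuf *path)
	{
		struct strbuf tmp = STRBUF_INIT;

		strbuf_addf(&tmp, "%s.to-be-deleted", path->buf);

		if (!is_stale(path->buf)) {		/* step 1 */
			keep_with_referents(path->buf);	/* step 4 */
			goto out;
		}
		if (rename(path->buf, tmp.buf))		/* step 2 */
			goto out;	/* e.g. a racing prune got there first */
		if (is_stale(tmp.buf))			/* step 3: still stale */
			unlink(tmp.buf);
		else {
			rename(tmp.buf, path->buf);	/* put it back... */
			keep_with_referents(path->buf);	/* step 4 */
		}
	out:
		strbuf_release(&tmp);
	}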

I'm not sure that this scheme would work on Windows (I seem to recall
that renaming a file on Windows changes its mtime). But nobody in their
right mind would run a Git server on a Windows computer anyway...

Also, to be really super anal about things, there should be a procedure
to reinstate any "<filename>.to-be-deleted" files if the program dies or
if the computer crashes during the final, FINAL mtime check. But that is
almost certainly overkill.

>> I hope I understood that all correctly...
> 
> I think your analysis is all correct. The open question is whether it's
> worth trying to shrink the last bits of raciness or not (or even whether
> there is a clever way of eliminating them that I haven't considered).

To be clear, I think your changes are a big improvement, and my ideas
here should not be seen as blockers to your patch series getting
accepted. But I think your changes still leave races that are big enough
that they will be observed on busy Git servers.

Michael

-- 
Michael Haggerty
mhagger@alum.mit.edu

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 08/16] t5304: use helper to report failure of "test foo = bar"
  2014-10-08  1:13                 ` Jeff King
@ 2014-10-08 16:58                   ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-10-08 16:58 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Haggerty, git

Jeff King <peff@peff.net> writes:

> On Wed, Oct 08, 2014 at 12:17:07AM +0200, Michael Haggerty wrote:
>
>> On 10/07/2014 11:53 PM, Junio C Hamano wrote:
>> > Hmph, your 'test' in that name is a generic verb "we check that...",
>> > which I think aligns better with the other test_foo functions.  When
>> > I suggested 'test_verbose', 'test' in that name was specifically
>> > meant to refer to the 'test' command.
>
> I actually meant "test" as a namespace to indicate it is part of the
> test suite (just like "test_seq" is not testing anything). I think that
> is why the names are so silly. We are using the "test" command in our
> "test" suite to "test" some conditions.
>
>> I like "verbose_test $foo = $bar" because it puts the word "test" next
>> to the condition, where the built-in command "test" would otherwise be.
>> 
>> We could even define a command
>> 
>> 	verbose () {
>> 		"$@" && return 0
>> 		echo >&2 "command failed: $*"
>> 		return 1
>> 	}
>> 
>> and use it like
>> 
>> 	verbose test $foo = $bar
>
> I kind of like this. It is easy to see which shell command is being
> invoked, and it would extend naturally to other silent commands.
>
>> Somehow I feel like I'm reinventing something that must already exist...
>
> Yes, we're basically reinventing "set -x" here, with the caveat that we
> only _really_ care about showing failed commands. The problem with "set
> -x" is that it also wants to apply itself to the test harness itself, so
> you end up with a lot of cruft.

As you have done, I've done my share of running tests under "sh -x",
and I agree that it gives too much noise. The output is almost impossible
to read through without a lot of patience, but at least it contains
everything we could ask for, short of running a specific step under gdb.

The "verbose" thing that reports only when the expectation is violated
cuts down the cruft (so does your original test_cond/verbose_test),
so it may make it easier than the bare-metal "sh -x".

But it may cut down a bit too much.  Often, in a test that consists
of multiple commands chained together with && inside a single
test_expect_success, what leads to the eventual unsatisfied
expectation is an earlier step that succeeds in the sense that it
satisfies the expectation of that step, but does not actually do what
we wanted it to do.  For example, for "git branch foo &&" we only
check that it did not exit with a non-zero status; we would not notice
if it created 'refs/foo' by mistake instead of a new branch 'foo'.
A later step is then tripped by the unsatisfied expectation that was
never explicitly tested: the subsequent "git checkout foo" would not
quite fail, but instead of checking out a branch to build upon, it
detaches the HEAD at that commit, because the earlier "git branch"
was faulty and we did not notice it.

So I dunno.  For harder cases we always have to resort to "sh -x",
but "verbose" thing may still help easier cases, just like the use
of test_cmp instead of cmp is helping us while viewing output from
running test with the "-v" option today, and that would be a good
enough improvement.

^ permalink raw reply	[flat|nested] 58+ messages in thread

* Re: [PATCH 0/16] make prune mtime-checking more careful
  2014-10-08  8:31     ` Jeff King
@ 2014-10-08 17:03       ` Junio C Hamano
  0 siblings, 0 replies; 58+ messages in thread
From: Junio C Hamano @ 2014-10-08 17:03 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, git, Michael Haggerty

Jeff King <peff@peff.net> writes:

> So that explains that bug (as a side note, you might think that if we
> are flipping return values, lots of things would fail when they ask "do
> we have this packed object" and it erroneously says "yes". But that does
> not happen. The wrong return value comes from freshening the file, so we
> only flip "yes" to "no", and never the opposite).
> ...
> When dt/cache-tree-repair is merged, we have a valid cache tree when we
> run "git commit", and we realize that we do not need to write out the
> tree object at all. Thus we never hit the buggy code, the object isn't
> created, and the subsequent prune reports that there is nothing to
> prune.

Wow, that is a tricky "bug" to hunt down (or rather, the series with
the bug failed to fail when applied to 'master', so it is an "unbug"
that hides a bug).  Thanks for digging.

^ permalink raw reply	[flat|nested] 58+ messages in thread

end of thread

Thread overview: 58+ messages
2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
2014-10-03 20:21 ` [PATCH 01/16] foreach_alt_odb: propagate return value from callback Jeff King
2014-10-03 22:55   ` René Scharfe
2014-10-04  0:31     ` Jeff King
2014-10-03 20:22 ` [PATCH 02/16] isxdigit: cast input to unsigned char Jeff King
2014-10-03 20:22 ` [PATCH 03/16] object_array: factor out slopbuf-freeing logic Jeff King
2014-10-07 11:25   ` Michael Haggerty
2014-10-08  7:36     ` Jeff King
2014-10-08  8:40       ` Michael Haggerty
2014-10-08  8:55         ` Jeff King
2014-10-03 20:22 ` [PATCH 04/16] object_array: add a "clear" function Jeff King
2014-10-03 20:23 ` [PATCH 05/16] clean up name allocation in prepare_revision_walk Jeff King
2014-10-03 20:24 ` [PATCH 06/16] reachable: clear pending array after walking it Jeff King
2014-10-03 20:25 ` [PATCH 07/16] t5304: use test_path_is_* instead of "test -f" Jeff King
2014-10-03 22:12   ` Junio C Hamano
2014-10-03 20:27 ` [PATCH 08/16] t5304: use helper to report failure of "test foo = bar" Jeff King
2014-10-03 22:17   ` Junio C Hamano
2014-10-04  0:13     ` Jeff King
2014-10-07 13:21   ` Michael Haggerty
2014-10-07 17:29     ` Junio C Hamano
2014-10-07 20:18       ` Jeff King
2014-10-07 20:35         ` Junio C Hamano
2014-10-07 21:29           ` Jeff King
2014-10-07 21:53             ` Junio C Hamano
2014-10-07 22:17               ` Michael Haggerty
2014-10-08  1:13                 ` Jeff King
2014-10-08 16:58                   ` Junio C Hamano
2014-10-07 21:16         ` Junio C Hamano
2014-10-03 20:29 ` [PATCH 09/16] prune: factor out loose-object directory traversal Jeff King
2014-10-03 22:19   ` Junio C Hamano
2014-10-04  0:24     ` Jeff King
2014-10-07 14:07   ` Michael Haggerty
2014-10-08  7:33     ` Jeff King
2014-10-03 20:31 ` [PATCH 10/16] count-objects: do not use xsize_t when counting object size Jeff King
2014-10-03 20:31 ` [PATCH 11/16] count-objects: use for_each_loose_file_in_objdir Jeff King
2014-10-03 20:32 ` [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects Jeff King
2014-10-05  8:15   ` René Scharfe
2014-10-05 10:47     ` Ramsay Jones
2014-10-03 20:39 ` [PATCH 13/16] prune: keep objects reachable from recent objects Jeff King
2014-10-03 21:47   ` Junio C Hamano
2014-10-04  0:09     ` Jeff King
2014-10-04  0:30     ` Jeff King
2014-10-04  3:04       ` Junio C Hamano
2014-10-07 16:29   ` Michael Haggerty
2014-10-08  7:19     ` Jeff King
2014-10-08 10:37       ` Michael Haggerty
2014-10-03 20:39 ` [PATCH 14/16] pack-objects: refactor unpack-unreachable expiration check Jeff King
2014-10-03 20:40 ` [PATCH 15/16] pack-objects: match prune logic for discarding objects Jeff King
2014-10-03 20:41 ` [PATCH 16/16] write_sha1_file: freshen existing objects Jeff King
2014-10-03 21:29   ` Junio C Hamano
2014-10-04  0:01     ` Jeff King
2014-10-05  9:12   ` René Scharfe
2014-10-03 22:20 ` [PATCH 0/16] make prune mtime-checking more careful Junio C Hamano
2014-10-04 22:22 ` Junio C Hamano
2014-10-05  9:19   ` René Scharfe
2014-10-06  1:42   ` Jeff King
2014-10-08  8:31     ` Jeff King
2014-10-08 17:03       ` Junio C Hamano
