All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout
@ 2014-07-06  4:06 David Turner
  2014-07-06  4:06 ` [PATCH v4 2/4] test-dump-cache-tree: invalid trees are not errors David Turner
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: David Turner @ 2014-07-06  4:06 UTC (permalink / raw)
  To: git; +Cc: David Turner

When git checkout checks out a branch, create or update the
cache-tree so that subsequent operations are faster.

update_main_cache_tree learned a new flag, WRITE_TREE_REPAIR.  When
WRITE_TREE_REPAIR is set, portions of the cache-tree which do not
correspond to existing tree objects are invalidated (and portions which
do are marked as valid).  No new tree objects are created.

Signed-off-by: David Turner <dturner@twitter.com>
---
 builtin/checkout.c    |  8 ++++++++
 cache-tree.c          | 12 +++++++++++-
 cache-tree.h          |  1 +
 t/t0090-cache-tree.sh | 19 ++++++++++++++++---
 4 files changed, 36 insertions(+), 4 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index 07cf555..054214f 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -553,6 +553,14 @@ static int merge_working_tree(const struct checkout_opts *opts,
 		}
 	}
 
+	if (!active_cache_tree)
+		active_cache_tree = cache_tree();
+
+	if (!cache_tree_fully_valid(active_cache_tree))
+		cache_tree_update(active_cache_tree,
+				  (const struct cache_entry * const *)active_cache,
+				  active_nr, WRITE_TREE_SILENT | WRITE_TREE_REPAIR);
+
 	if (write_cache(newfd, active_cache, active_nr) ||
 	    commit_locked_index(lock_file))
 		die(_("unable to write new index file"));
diff --git a/cache-tree.c b/cache-tree.c
index 7fa524a..f951d7d 100644
--- a/cache-tree.c
+++ b/cache-tree.c
@@ -239,9 +239,12 @@ static int update_one(struct cache_tree *it,
 	struct strbuf buffer;
 	int missing_ok = flags & WRITE_TREE_MISSING_OK;
 	int dryrun = flags & WRITE_TREE_DRY_RUN;
+	int repair = flags & WRITE_TREE_REPAIR;
 	int to_invalidate = 0;
 	int i;
 
+	assert(!(dryrun && repair));
+
 	*skip_count = 0;
 
 	if (0 <= it->entry_count && has_sha1_file(it->sha1))
@@ -374,7 +377,14 @@ static int update_one(struct cache_tree *it,
 #endif
 	}
 
-	if (dryrun)
+	if (repair) {
+		unsigned char sha1[20];
+		hash_sha1_file(buffer.buf, buffer.len, tree_type, sha1);
+		if (has_sha1_file(sha1))
+			hashcpy(it->sha1, sha1);
+		else
+			to_invalidate = 1;
+	} else if (dryrun)
 		hash_sha1_file(buffer.buf, buffer.len, tree_type, it->sha1);
 	else if (write_sha1_file(buffer.buf, buffer.len, tree_type, it->sha1)) {
 		strbuf_release(&buffer);
diff --git a/cache-tree.h b/cache-tree.h
index f1923ad..666d18f 100644
--- a/cache-tree.h
+++ b/cache-tree.h
@@ -39,6 +39,7 @@ int update_main_cache_tree(int);
 #define WRITE_TREE_IGNORE_CACHE_TREE 2
 #define WRITE_TREE_DRY_RUN 4
 #define WRITE_TREE_SILENT 8
+#define WRITE_TREE_REPAIR 16
 
 /* error return codes */
 #define WRITE_TREE_UNREADABLE_INDEX (-1)
diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
index 6c33e28..98fb1ab 100755
--- a/t/t0090-cache-tree.sh
+++ b/t/t0090-cache-tree.sh
@@ -44,14 +44,14 @@ test_expect_success 'read-tree HEAD establishes cache-tree' '
 
 test_expect_success 'git-add invalidates cache-tree' '
 	test_when_finished "git reset --hard; git read-tree HEAD" &&
-	echo "I changed this file" > foo &&
+	echo "I changed this file" >foo &&
 	git add foo &&
 	test_invalid_cache_tree
 '
 
 test_expect_success 'update-index invalidates cache-tree' '
 	test_when_finished "git reset --hard; git read-tree HEAD" &&
-	echo "I changed this file" > foo &&
+	echo "I changed this file" >foo &&
 	git update-index --add foo &&
 	test_invalid_cache_tree
 '
@@ -85,9 +85,22 @@ test_expect_success 'reset --hard without index gives cache-tree' '
 	test_shallow_cache_tree
 '
 
-test_expect_failure 'checkout gives cache-tree' '
+test_expect_success 'checkout gives cache-tree' '
+	git tag current &&
 	git checkout HEAD^ &&
 	test_shallow_cache_tree
 '
 
+test_expect_success 'checkout -b gives cache-tree' '
+	git checkout current &&
+	git checkout -b prev HEAD^ &&
+	test_shallow_cache_tree
+'
+
+test_expect_success 'checkout -B gives cache-tree' '
+	git checkout current &&
+	git checkout -B prev HEAD^ &&
+	test_shallow_cache_tree
+'
+
 test_done
-- 
2.0.0.390.gcb682f8

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 2/4] test-dump-cache-tree: invalid trees are not errors
  2014-07-06  4:06 [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout David Turner
@ 2014-07-06  4:06 ` David Turner
  2014-07-07 19:27   ` Junio C Hamano
  2014-07-06  4:06 ` [PATCH v4 3/4] cache-tree: subdirectory tests David Turner
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: David Turner @ 2014-07-06  4:06 UTC (permalink / raw)
  To: git; +Cc: David Turner

Do not treat known-invalid trees as errors even when their count is
incorrect.  Because git already knows that these trees are invalid,
nothing depends on the count field.

Add a couple of comments.

Signed-off-by: David Turner <dturner@twitter.com>
---
 test-dump-cache-tree.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/test-dump-cache-tree.c b/test-dump-cache-tree.c
index 47eab97..cbbbd8e 100644
--- a/test-dump-cache-tree.c
+++ b/test-dump-cache-tree.c
@@ -26,16 +26,16 @@ static int dump_cache_tree(struct cache_tree *it,
 		return 0;
 
 	if (it->entry_count < 0) {
+		/* invalid */
 		dump_one(it, pfx, "");
 		dump_one(ref, pfx, "#(ref) ");
-		if (it->subtree_nr != ref->subtree_nr)
-			errs = 1;
 	}
 	else {
 		dump_one(it, pfx, "");
 		if (hashcmp(it->sha1, ref->sha1) ||
 		    ref->entry_count != it->entry_count ||
 		    ref->subtree_nr != it->subtree_nr) {
+			/* claims to be valid but is lying */
 			dump_one(ref, pfx, "#(ref) ");
 			errs = 1;
 		}
-- 
2.0.0.390.gcb682f8

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 3/4] cache-tree: subdirectory tests
  2014-07-06  4:06 [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout David Turner
  2014-07-06  4:06 ` [PATCH v4 2/4] test-dump-cache-tree: invalid trees are not errors David Turner
@ 2014-07-06  4:06 ` David Turner
  2014-07-06  8:10   ` Eric Sunshine
  2014-07-07 19:15   ` Junio C Hamano
  2014-07-06  4:06 ` [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit David Turner
  2014-07-07 18:58 ` [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout Junio C Hamano
  3 siblings, 2 replies; 15+ messages in thread
From: David Turner @ 2014-07-06  4:06 UTC (permalink / raw)
  To: git; +Cc: David Turner

Add tests to confirm that invalidation of subdirectories nether over-
nor under-invalidates.

Signed-off-by: David Turner <dturner@twitter.com>
---
 t/t0090-cache-tree.sh | 28 +++++++++++++++++++++++++---
 1 file changed, 25 insertions(+), 3 deletions(-)

diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
index 98fb1ab..8437c5f 100755
--- a/t/t0090-cache-tree.sh
+++ b/t/t0090-cache-tree.sh
@@ -21,10 +21,13 @@ test_shallow_cache_tree () {
 	cmp_cache_tree expect
 }
 
+# Test that the cache-tree for a given directory is invalid.
+# If no directory is given, check that the root is invalid
 test_invalid_cache_tree () {
-	echo "invalid                                   (0 subtrees)" >expect &&
-	printf "SHA #(ref)  (%d entries, 0 subtrees)\n" $(git ls-files|wc -l) >>expect &&
-	cmp_cache_tree expect
+	test-dump-cache-tree >actual &&
+	sed -e "s/$_x40/SHA/" -e "s/[0-9]* subtrees//g" <actual >filtered &&
+	expect=$(printf "invalid                                  $1 ()\n") &&
+	fgrep "$expect" filtered
 }
 
 test_no_cache_tree () {
@@ -49,6 +52,25 @@ test_expect_success 'git-add invalidates cache-tree' '
 	test_invalid_cache_tree
 '
 
+test_expect_success 'git-add in subdir invalidates cache-tree' '
+	test_when_finished "git reset --hard; git read-tree HEAD" &&
+	mkdir dirx &&
+	echo "I changed this file" >dirx/foo &&
+	git add dirx/foo &&
+	test_invalid_cache_tree
+'
+
+test_expect_success 'git-add in subdir does not invalidate sibling cache-tree' '
+	git tag no-children &&
+	test_when_finished "git reset --hard no-children; git read-tree HEAD" &&
+	mkdir dir1 dir2 &&
+	test_commit dir1/a &&
+	test_commit dir2/b &&
+	echo "I changed this file" >dir1/a &&
+	git add dir1/a &&
+	test_invalid_cache_tree dir1/
+'
+
 test_expect_success 'update-index invalidates cache-tree' '
 	test_when_finished "git reset --hard; git read-tree HEAD" &&
 	echo "I changed this file" >foo &&
-- 
2.0.0.390.gcb682f8

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
  2014-07-06  4:06 [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout David Turner
  2014-07-06  4:06 ` [PATCH v4 2/4] test-dump-cache-tree: invalid trees are not errors David Turner
  2014-07-06  4:06 ` [PATCH v4 3/4] cache-tree: subdirectory tests David Turner
@ 2014-07-06  4:06 ` David Turner
  2014-07-07 20:03   ` Junio C Hamano
  2014-07-07 18:58 ` [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout Junio C Hamano
  3 siblings, 1 reply; 15+ messages in thread
From: David Turner @ 2014-07-06  4:06 UTC (permalink / raw)
  To: git; +Cc: David Turner

During the commit process, update the cache-tree. Write this updated
cache-tree so that it's ready for subsequent commands.

Add test code which demonstrates that git commit now writes the cache
tree.  Make all tests test the entire cache-tree, not just the root
level.

Signed-off-by: David Turner <dturner@twitter.com>
---
 builtin/commit.c      | 31 +++++++++++-------
 t/t0090-cache-tree.sh | 87 ++++++++++++++++++++++++++++++++++++++++++---------
 2 files changed, 91 insertions(+), 27 deletions(-)

diff --git a/builtin/commit.c b/builtin/commit.c
index 9cfef6c..5981755 100644
--- a/builtin/commit.c
+++ b/builtin/commit.c
@@ -342,6 +342,8 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
 
 		discard_cache();
 		read_cache_from(index_lock.filename);
+		if (update_main_cache_tree(WRITE_TREE_SILENT) >= 0)
+			write_cache(fd, active_cache, active_nr);
 
 		commit_style = COMMIT_NORMAL;
 		return index_lock.filename;
@@ -363,10 +365,18 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
 		fd = hold_locked_index(&index_lock, 1);
 		add_files_to_cache(also ? prefix : NULL, &pathspec, 0);
 		refresh_cache_or_die(refresh_flags);
-		update_main_cache_tree(WRITE_TREE_SILENT);
-		if (write_cache(fd, active_cache, active_nr) ||
-		    close_lock_file(&index_lock))
-			die(_("unable to write new_index file"));
+		if (active_cache_changed
+		    || !cache_tree_fully_valid(active_cache_tree)) {
+			update_main_cache_tree(WRITE_TREE_SILENT);
+			active_cache_changed = 1;
+		}
+		if (active_cache_changed) {
+			if (write_cache(fd, active_cache, active_nr) ||
+			    close_lock_file(&index_lock))
+				die(_("unable to write new_index file"));
+		} else {
+			rollback_lock_file(&index_lock);
+		}
 		commit_style = COMMIT_NORMAL;
 		return index_lock.filename;
 	}
@@ -383,14 +393,10 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
 	if (!only && !pathspec.nr) {
 		fd = hold_locked_index(&index_lock, 1);
 		refresh_cache_or_die(refresh_flags);
-		if (active_cache_changed) {
-			update_main_cache_tree(WRITE_TREE_SILENT);
-			if (write_cache(fd, active_cache, active_nr) ||
-			    commit_locked_index(&index_lock))
-				die(_("unable to write new_index file"));
-		} else {
-			rollback_lock_file(&index_lock);
-		}
+		update_main_cache_tree(WRITE_TREE_SILENT);
+		if (write_cache(fd, active_cache, active_nr) ||
+		    commit_locked_index(&index_lock))
+			die(_("unable to write new_index file"));
 		commit_style = COMMIT_AS_IS;
 		return get_index_file();
 	}
@@ -435,6 +441,7 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
 	fd = hold_locked_index(&index_lock, 1);
 	add_remove_files(&partial);
 	refresh_cache(REFRESH_QUIET);
+	update_main_cache_tree(WRITE_TREE_SILENT);
 	if (write_cache(fd, active_cache, active_nr) ||
 	    close_lock_file(&index_lock))
 		die(_("unable to write new_index file"));
diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
index 8437c5f..15f1484 100755
--- a/t/t0090-cache-tree.sh
+++ b/t/t0090-cache-tree.sh
@@ -7,8 +7,14 @@ cache-tree extension.
 "
  . ./test-lib.sh
 
+grep_nonmatch_ok () {
+    grep $@
+    test "$?" = "2" && return 1
+    return 0
+}
+
 cmp_cache_tree () {
-	test-dump-cache-tree >actual &&
+	test-dump-cache-tree | grep_nonmatch_ok -v \#\(ref\) >actual &&
 	sed "s/$_x40/SHA/" <actual >filtered &&
 	test_cmp "$1" filtered
 }
@@ -16,15 +22,33 @@ cmp_cache_tree () {
 # We don't bother with actually checking the SHA1:
 # test-dump-cache-tree already verifies that all existing data is
 # correct.
-test_shallow_cache_tree () {
-	printf "SHA  (%d entries, 0 subtrees)\n" $(git ls-files|wc -l) >expect &&
+generate_expected_cache_tree () {
+	dir="$1${1:+/}" &&
+	parent="$2" &&
+	# ls-files might have foo/bar, foo/bar/baz, and foo/bar/quux
+	# We want to count only foo because it's the only direct child
+	subtrees=$(git ls-files|grep /|cut -d / -f 1|uniq) &&
+	subtree_count=$(echo "$subtrees"|grep_nonmatch_ok -c .) &&
+	entries=$(git ls-files|wc -l) &&
+	printf "SHA $dir (%d entries, %d subtrees)\n" $entries $subtree_count &&
+	for subtree in $subtrees
+	do
+	    cd "$subtree"
+	    generate_expected_cache_tree "$dir$subtree" $dir || return 1
+	    cd ..
+	done &&
+	dir=$parent
+}
+
+test_cache_tree () {
+	generate_expected_cache_tree >expect &&
 	cmp_cache_tree expect
 }
 
 # Test that the cache-tree for a given directory is invalid.
 # If no directory is given, check that the root is invalid
 test_invalid_cache_tree () {
-	test-dump-cache-tree >actual &&
+	test-dump-cache-tree | grep_nonmatch_ok -v \#\(ref\) >actual &&
 	sed -e "s/$_x40/SHA/" -e "s/[0-9]* subtrees//g" <actual >filtered &&
 	expect=$(printf "invalid                                  $1 ()\n") &&
 	fgrep "$expect" filtered
@@ -35,14 +59,14 @@ test_no_cache_tree () {
 	cmp_cache_tree expect
 }
 
-test_expect_failure 'initial commit has cache-tree' '
+test_expect_success 'initial commit has cache-tree' '
 	test_commit foo &&
-	test_shallow_cache_tree
+	test_cache_tree
 '
 
 test_expect_success 'read-tree HEAD establishes cache-tree' '
 	git read-tree HEAD &&
-	test_shallow_cache_tree
+	test_cache_tree
 '
 
 test_expect_success 'git-add invalidates cache-tree' '
@@ -60,6 +84,18 @@ test_expect_success 'git-add in subdir invalidates cache-tree' '
 	test_invalid_cache_tree
 '
 
+cat >before <<\EOF
+SHA  (3 entries, 2 subtrees)
+SHA dir1/ (1 entries, 0 subtrees)
+SHA dir2/ (1 entries, 0 subtrees)
+EOF
+
+cat >expect <<\EOF
+invalid                                   (2 subtrees)
+invalid                                  dir1/ (0 subtrees)
+SHA dir2/ (1 entries, 0 subtrees)
+EOF
+
 test_expect_success 'git-add in subdir does not invalidate sibling cache-tree' '
 	git tag no-children &&
 	test_when_finished "git reset --hard no-children; git read-tree HEAD" &&
@@ -67,8 +103,10 @@ test_expect_success 'git-add in subdir does not invalidate sibling cache-tree' '
 	test_commit dir1/a &&
 	test_commit dir2/b &&
 	echo "I changed this file" >dir1/a &&
+	cmp_cache_tree before &&
+	echo "I changed this file" >dir1/a &&
 	git add dir1/a &&
-	test_invalid_cache_tree dir1/
+	cmp_cache_tree expect
 '
 
 test_expect_success 'update-index invalidates cache-tree' '
@@ -81,7 +119,7 @@ test_expect_success 'update-index invalidates cache-tree' '
 test_expect_success 'write-tree establishes cache-tree' '
 	test-scrap-cache-tree &&
 	git write-tree &&
-	test_shallow_cache_tree
+	test_cache_tree
 '
 
 test_expect_success 'test-scrap-cache-tree works' '
@@ -92,37 +130,56 @@ test_expect_success 'test-scrap-cache-tree works' '
 
 test_expect_success 'second commit has cache-tree' '
 	test_commit bar &&
-	test_shallow_cache_tree
+	test_cache_tree
+'
+
+test_expect_success 'commit in child dir has cache-tree' '
+	mkdir dir &&
+	>dir/child.t &&
+	git add dir/child.t &&
+	git commit -m dir/child.t &&
+	test_cache_tree
 '
 
 test_expect_success 'reset --hard gives cache-tree' '
 	test-scrap-cache-tree &&
 	git reset --hard &&
-	test_shallow_cache_tree
+	test_cache_tree
 '
 
 test_expect_success 'reset --hard without index gives cache-tree' '
 	rm -f .git/index &&
 	git reset --hard &&
-	test_shallow_cache_tree
+	test_cache_tree
 '
 
 test_expect_success 'checkout gives cache-tree' '
 	git tag current &&
 	git checkout HEAD^ &&
-	test_shallow_cache_tree
+	test_cache_tree
 '
 
 test_expect_success 'checkout -b gives cache-tree' '
 	git checkout current &&
 	git checkout -b prev HEAD^ &&
-	test_shallow_cache_tree
+	test_cache_tree
 '
 
 test_expect_success 'checkout -B gives cache-tree' '
 	git checkout current &&
 	git checkout -B prev HEAD^ &&
-	test_shallow_cache_tree
+	test_cache_tree
+'
+
+test_expect_success 'partial commit gives cache-tree' '
+	git checkout -b partial no-children &&
+	test_commit one &&
+	test_commit two &&
+	echo "some change" >one.t &&
+	git add one.t &&
+	echo "some other change" >two.t &&
+	git commit two.t -m partial &&
+	test_cache_tree
 '
 
 test_done
-- 
2.0.0.390.gcb682f8

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 3/4] cache-tree: subdirectory tests
  2014-07-06  4:06 ` [PATCH v4 3/4] cache-tree: subdirectory tests David Turner
@ 2014-07-06  8:10   ` Eric Sunshine
  2014-07-07 19:15   ` Junio C Hamano
  1 sibling, 0 replies; 15+ messages in thread
From: Eric Sunshine @ 2014-07-06  8:10 UTC (permalink / raw)
  To: David Turner; +Cc: Git List, David Turner

On Sun, Jul 6, 2014 at 12:06 AM, David Turner <dturner@twopensource.com> wrote:
> Add tests to confirm that invalidation of subdirectories nether over-

s/nether/neither/

> nor under-invalidates.
>
> Signed-off-by: David Turner <dturner@twitter.com>
> ---

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout
  2014-07-06  4:06 [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout David Turner
                   ` (2 preceding siblings ...)
  2014-07-06  4:06 ` [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit David Turner
@ 2014-07-07 18:58 ` Junio C Hamano
  3 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2014-07-07 18:58 UTC (permalink / raw)
  To: David Turner; +Cc: git, David Turner

David Turner <dturner@twopensource.com> writes:

> When git checkout checks out a branch, create or update the
> cache-tree so that subsequent operations are faster.
>
> update_main_cache_tree learned a new flag, WRITE_TREE_REPAIR.  When
> WRITE_TREE_REPAIR is set, portions of the cache-tree which do not
> correspond to existing tree objects are invalidated (and portions which
> do are marked as valid).  No new tree objects are created.
>
> Signed-off-by: David Turner <dturner@twitter.com>
> ---

Looks safe, if a bit wasteful, to me.

Thanks; will queue.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 3/4] cache-tree: subdirectory tests
  2014-07-06  4:06 ` [PATCH v4 3/4] cache-tree: subdirectory tests David Turner
  2014-07-06  8:10   ` Eric Sunshine
@ 2014-07-07 19:15   ` Junio C Hamano
  1 sibling, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2014-07-07 19:15 UTC (permalink / raw)
  To: David Turner; +Cc: git, David Turner

David Turner <dturner@twopensource.com> writes:

> Add tests to confirm that invalidation of subdirectories nether over-
> nor under-invalidates.
>
> Signed-off-by: David Turner <dturner@twitter.com>
> ---
>  t/t0090-cache-tree.sh | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
> index 98fb1ab..8437c5f 100755
> --- a/t/t0090-cache-tree.sh
> +++ b/t/t0090-cache-tree.sh
> @@ -21,10 +21,13 @@ test_shallow_cache_tree () {
>  	cmp_cache_tree expect
>  }
>  
> +# Test that the cache-tree for a given directory is invalid.
> +# If no directory is given, check that the root is invalid
>  test_invalid_cache_tree () {
> -	echo "invalid                                   (0 subtrees)" >expect &&
> -	printf "SHA #(ref)  (%d entries, 0 subtrees)\n" $(git ls-files|wc -l) >>expect &&
> -	cmp_cache_tree expect
> +	test-dump-cache-tree >actual &&
> +	sed -e "s/$_x40/SHA/" -e "s/[0-9]* subtrees//g" <actual >filtered &&
> +	expect=$(printf "invalid                                  $1 ()\n") &&

It would be saner to do 'printf "string %s more string" "$1"' than
embedding caller-supplied "$1" inside the format specifier.

> +	fgrep "$expect" filtered

We'd actually want to see fewer uses of 'fgrep' in the tests for two
reasons.

Is having an entry that is invalidated the only thing we care about
in this test?  Shouldn't the caller expect "These subtrees and
nothing else must be invalidated", in which case the helper should
check not just the expected "invalid dir1/" appears in the output
but no other unexpected "invalid somethingelse/" appears (and this
"no other unexpected output" makes use of grep family in tests like
this less desirable).

In other words, wouldn't it be better to do the helper along the
lines of:

	test_invalidated_cache_tree () {
        	if test $# != 0
		then
	        	printf "invalid %s ()\n" "" "$@"
                fi >expect &&
		test-dump-cache-tree |
                sed -n -e '/^invalid /p' >actual &&
                test_cmp expect actual
	}

and use

	test_invalidated_cache_tree dir1

when we expect only dir1 and dir2 (but not dir2 or anything else) is
invalidated?

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 2/4] test-dump-cache-tree: invalid trees are not errors
  2014-07-06  4:06 ` [PATCH v4 2/4] test-dump-cache-tree: invalid trees are not errors David Turner
@ 2014-07-07 19:27   ` Junio C Hamano
  0 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2014-07-07 19:27 UTC (permalink / raw)
  To: David Turner; +Cc: git, David Turner

David Turner <dturner@twopensource.com> writes:

> Do not treat known-invalid trees as errors even when their count is
> incorrect.  Because git already knows that these trees are invalid,
> nothing depends on the count field.

s/count/subtree_nr/; they are different.

"nothing depends on" is not quite correct.  The field keeps track of
the number of subdirectories in the directory recorded in the
cache-tree, and it *must* be maintained correctly (we iterate over
the it->down[] array up to that number).

The number of subdirectories the directory actually has, which we
can discover by counting entries in the main part of the index, may
be different from subtree_nr, if we haven't run update_one() on it,
e.g. we may have added a path in a new subdirectory, at which time
we would have invalidated the directory and its it->down[] array does
not know the new subdirectory.

While reading 3/4, I wondered if it makes sense to show the (N
subtrees) for invalidated node, but as a debugging measure it helped
me often while developping the framework, so we may not want to drop
the subtree_nr from the output for invalidated entries.

The change itself looks good.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
  2014-07-06  4:06 ` [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit David Turner
@ 2014-07-07 20:03   ` Junio C Hamano
  2014-07-08  0:26     ` Junio C Hamano
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2014-07-07 20:03 UTC (permalink / raw)
  To: David Turner; +Cc: git, David Turner

David Turner <dturner@twopensource.com> writes:

> During the commit process, update the cache-tree. Write this updated
> cache-tree so that it's ready for subsequent commands.
>
> Add test code which demonstrates that git commit now writes the cache
> tree.  Make all tests test the entire cache-tree, not just the root
> level.
>
> Signed-off-by: David Turner <dturner@twitter.com>
> ---
>  builtin/commit.c      | 31 +++++++++++-------
>  t/t0090-cache-tree.sh | 87 ++++++++++++++++++++++++++++++++++++++++++---------
>  2 files changed, 91 insertions(+), 27 deletions(-)
>
> diff --git a/builtin/commit.c b/builtin/commit.c
> index 9cfef6c..5981755 100644
> --- a/builtin/commit.c
> +++ b/builtin/commit.c
> @@ -342,6 +342,8 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
>  
>  		discard_cache();
>  		read_cache_from(index_lock.filename);
> +		if (update_main_cache_tree(WRITE_TREE_SILENT) >= 0)
> +			write_cache(fd, active_cache, active_nr);
>  
>  		commit_style = COMMIT_NORMAL;
>  		return index_lock.filename;
> @@ -363,10 +365,18 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
>  		fd = hold_locked_index(&index_lock, 1);
>  		add_files_to_cache(also ? prefix : NULL, &pathspec, 0);
>  		refresh_cache_or_die(refresh_flags);
> -		update_main_cache_tree(WRITE_TREE_SILENT);
> -		if (write_cache(fd, active_cache, active_nr) ||
> -		    close_lock_file(&index_lock))
> -			die(_("unable to write new_index file"));
> +		if (active_cache_changed
> +		    || !cache_tree_fully_valid(active_cache_tree)) {
> +			update_main_cache_tree(WRITE_TREE_SILENT);
> +			active_cache_changed = 1;
> +		}
> +		if (active_cache_changed) {
> +			if (write_cache(fd, active_cache, active_nr) ||
> +			    close_lock_file(&index_lock))
> +				die(_("unable to write new_index file"));
> +		} else {
> +			rollback_lock_file(&index_lock);
> +		}
>  		commit_style = COMMIT_NORMAL;
>  		return index_lock.filename;
>  	}
> @@ -383,14 +393,10 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
>  	if (!only && !pathspec.nr) {
>  		fd = hold_locked_index(&index_lock, 1);
>  		refresh_cache_or_die(refresh_flags);
> -		if (active_cache_changed) {
> -			update_main_cache_tree(WRITE_TREE_SILENT);
> -			if (write_cache(fd, active_cache, active_nr) ||
> -			    commit_locked_index(&index_lock))
> -				die(_("unable to write new_index file"));
> -		} else {
> -			rollback_lock_file(&index_lock);
> -		}
> +		update_main_cache_tree(WRITE_TREE_SILENT);
> +		if (write_cache(fd, active_cache, active_nr) ||
> +		    commit_locked_index(&index_lock))
> +			die(_("unable to write new_index file"));
>  		commit_style = COMMIT_AS_IS;
>  		return get_index_file();
>  	}

Hmph.  The above two hunks somehow feel the other way around.  When
doing a non-partial, non as-is commit, i.e. "git commit $paths", the
original used to call 'write-cache' unconditionally, because it is
very unlikely for us to see !active_cache_changed after calling the
add-files-to-cache with user-supplied pathspec (if there is nothing
to change, the user is being silly to say "git commit $paths" in the
first place).  I would have expected that the patch would have left
that code path alone (it seems to be doing the right thing already).

On the other hand, "git commit" to commit the contents of the index
as-is is being cautious not to write things out unnecessarily, but
as you found out, it would be a good idea to fully populate the
cache-tree in this code path and write the otherwise already-good
index file out, even if we see !active_cache_changed after we called
refresh_cache_or_die().  So I would have expected that the patch
would have kept the "write only necessary" carefulness instead of
calling write-cache unconditionally.

That is, something like:

	fd = hold_locked_index();
        refresh_cache_or_die();
+	if (!cache_tree_fully_valid())
+               active_cache_changed = 1;
	if (active_cache_changed) {
		update_main_cache_tree();

> diff --git a/t/t0090-cache-tree.sh b/t/t0090-cache-tree.sh
> index 8437c5f..15f1484 100755
> --- a/t/t0090-cache-tree.sh
> +++ b/t/t0090-cache-tree.sh
> @@ -7,8 +7,14 @@ cache-tree extension.
>  "
>   . ./test-lib.sh
>  
> +grep_nonmatch_ok () {
> +    grep $@

dq around it, i.e. "$@".

> +    test "$?" = "2" && return 1

POSIX.1 only specifies >1 not necessarily 2 as an error status.  

> +    return 0
> +}

Having said all that I do not think the helper is unnecessary.  Just
use something like

	sed -e '/#(ref)/d'

>  cmp_cache_tree () {
> -	test-dump-cache-tree >actual &&
> +	test-dump-cache-tree | grep_nonmatch_ok -v \#\(ref\) >actual &&

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
  2014-07-07 20:03   ` Junio C Hamano
@ 2014-07-08  0:26     ` Junio C Hamano
  2014-07-08 10:32       ` Duy Nguyen
  0 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2014-07-08  0:26 UTC (permalink / raw)
  To: David Turner; +Cc: git, David Turner, Nguyễn Thái Ngọc Duy

Junio C Hamano <gitster@pobox.com> writes:

>> diff --git a/builtin/commit.c b/builtin/commit.c
>> index 9cfef6c..5981755 100644
>> --- a/builtin/commit.c
>> +++ b/builtin/commit.c
>> @@ -342,6 +342,8 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
>>  
>>  		discard_cache();
>>  		read_cache_from(index_lock.filename);
>> +		if (update_main_cache_tree(WRITE_TREE_SILENT) >= 0)
>> +			write_cache(fd, active_cache, active_nr);
>>  
>>  		commit_style = COMMIT_NORMAL;
>>  		return index_lock.filename;

I'll push out a tentative result of merging this series (with some
proposed fix-ups) sometime later today, but this part interacts with
Duy's split-index topic in a funny way, so I'd exclude it from the
merge result for now.

Thanks.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
  2014-07-08  0:26     ` Junio C Hamano
@ 2014-07-08 10:32       ` Duy Nguyen
  2014-07-08 17:05         ` Junio C Hamano
                           ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Duy Nguyen @ 2014-07-08 10:32 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Turner, Git Mailing List, David Turner

On Tue, Jul 8, 2014 at 7:26 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Junio C Hamano <gitster@pobox.com> writes:
>
>>> diff --git a/builtin/commit.c b/builtin/commit.c
>>> index 9cfef6c..5981755 100644
>>> --- a/builtin/commit.c
>>> +++ b/builtin/commit.c
>>> @@ -342,6 +342,8 @@ static char *prepare_index(int argc, const char **argv, const char *prefix,
>>>
>>>              discard_cache();
>>>              read_cache_from(index_lock.filename);
>>> +            if (update_main_cache_tree(WRITE_TREE_SILENT) >= 0)
>>> +                    write_cache(fd, active_cache, active_nr);
>>>
>>>              commit_style = COMMIT_NORMAL;
>>>              return index_lock.filename;

I wonder if we need to update_main_cache_tree() so many times in this
function. If I read the code correctly, all roads must lead to
update_main_cache_tree(0) in prepare_to_commit(). If we find out that
we have incomplete cache-tree before that call, we could write the
index one more time after that call, instead of spreading
update_main_cache_tree() all over prepare_index(). I know the
"index_file" in prepare_to_commit() is probably "index.lock" or
something, but that does not stop us from locking again
("index.lock.lock") if we want to update it.

Writing cache tree early in prepare_index() does help hooks, but I
would say hooks are uncommon case and we could add an option to
update-index to explicitly rebuild cache-tree, then hooks that do diff
a lot (or other operations that use cache-tree) could rebuild
cache-tree by themselves. When the index file is large, every write
pays high penalty so I think trying to write less often is good.
-- 
Duy

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
  2014-07-08 10:32       ` Duy Nguyen
@ 2014-07-08 17:05         ` Junio C Hamano
  2014-07-09  1:58           ` Duy Nguyen
  2014-07-08 18:32         ` Junio C Hamano
  2014-07-08 19:15         ` Junio C Hamano
  2 siblings, 1 reply; 15+ messages in thread
From: Junio C Hamano @ 2014-07-08 17:05 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: David Turner, Git Mailing List, David Turner

Duy Nguyen <pclouds@gmail.com> writes:

> I wonder if we need to update_main_cache_tree() so many times in this
> function. If I read the code correctly, all roads must lead to
> update_main_cache_tree(0) in prepare_to_commit().

I think prepare-to-commit is too late; it does not want to know if
the index it was given to create a tree out of is the one that the
user will keep using after this invocation of "git commit" or just a
temporary one used for partial commit.  The cache-tree rebuilt there
is what is recorded with commit_tree_extended() in cmd_commit(), but
if you are doing a partial commit, these generic code paths are
given a temporary index file on the filesystem to work on, and
cache-tree in that will not help user's later operation.

For three main uses of "git commit", prepare_index() does these:

 - "git commit -a" and "git commit -i $paths" update the index with
   the new contents from the working tree, fully builds cache-tree
   in-core to write out the tree objects, and writes the index file
   to the filesystem.  Because this index is the one used after this
   invocation of "git commit" returns, we have a fully populated
   cache-tree after this happens.  This code path is perfect and
   there is no need to change.

 - "git commit" commits the contents of the index as-is, so
   technically there is no reason for it to update the index on the
   filesystem at all, but it refreshes the cached stat data to help
   the "status" part of the command, and if it finds that stat data
   was stale, updates the index on the filesystem because it is
   wasteful not to do so.  As we would be spending I/O cycles to
   update the index file in that case anyway, we also rebuild the
   cache-tree and include that in the updated index.

   When the cached stat data was already up-to-date, however, we do
   not update the index on the filesystem, so the series under
   discussion will change the trade-off by doing one more I/O to
   write out a new index to the filesystem only to update cache-tree.

 - "git commit $paths" updates the "real" index with given $paths
   and writes it out to the filesystem first.  This is the index the
   user will use after "git commit" finishes; traditionally our
   trade-off was "populate cache-tree only when we do not have to
   spend any cycle only to do so (i.e. when we are writing trees
   anyway, or when we read from a tree), and let it degrade as paths
   are added, removed and modified" and we avoided populating
   cache-tree in this codepath.  The series under discussion will
   change the trade-off here, too.

   After it updates this "real" index, it builds another temporary
   index that represents the tree state to be committed (starting
   from HEAD and updates with the given $paths), but that will be
   discarded and we do not want to spend any extra cycle to do
   anything only to make its later use more efficient (like writing
   updated cache-tree to it).

> If we find out that
> we have incomplete cache-tree before that call, we could write the
> index one more time after that call,

and that will make an extra I/O only to write out cache-tree to the
temporary index that we will discard immediately after for a partial
commit.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
  2014-07-08 10:32       ` Duy Nguyen
  2014-07-08 17:05         ` Junio C Hamano
@ 2014-07-08 18:32         ` Junio C Hamano
  2014-07-08 19:15         ` Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2014-07-08 18:32 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: David Turner, Git Mailing List, David Turner

Duy Nguyen <pclouds@gmail.com> writes:

> ... I know the
> "index_file" in prepare_to_commit() is probably "index.lock" or
> something, but that does not stop us from locking again
> ("index.lock.lock") if we want to update it.

We grabbed the lock on the real index and we have written out the
result of "update-index --refresh" to it (and closed), but we still
want to and do keep the lock while "add -i" works on it.  And then
after "add -i" returns, we still have the lock on the real index and
the patch wants to write to it again to store the refreshed cache-tree
under that lock.

It may be the case that the API suite currently lacks a way to allow
the caller to reopen the same "index.lock" file after calling
write-locked-index(CLOSE_LOCK), and taking a lock on "index.lock" to
write into "index.lock.lock" and renaming it to "index.lock" could
be a workaround for it, but doesn't that sound a wrong workaround?

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
  2014-07-08 10:32       ` Duy Nguyen
  2014-07-08 17:05         ` Junio C Hamano
  2014-07-08 18:32         ` Junio C Hamano
@ 2014-07-08 19:15         ` Junio C Hamano
  2 siblings, 0 replies; 15+ messages in thread
From: Junio C Hamano @ 2014-07-08 19:15 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: David Turner, Git Mailing List, David Turner

Duy Nguyen <pclouds@gmail.com> writes:

> Writing cache tree early in prepare_index() does help hooks, but I
> would say hooks are uncommon case and we could add an option to
> update-index to explicitly rebuild cache-tree, then hooks that do diff
> a lot (or other operations that use cache-tree) could rebuild
> cache-tree by themselves.

Yes, "update-index --update-cache-tree" would be a good addition for
completeness; scripts working with plumbing should be able to do
what built-in Porcelains can.  They can of course do "write-tree" in
the meantime so I do not see it as a very high priority, though.

This should apply on top of 'master', and if the series under
discussion turns out to be a good idea, the new call to
update-main-cache-tree I added to this code path should use the
option added by the series that only repairs parts of cache-trees
that can be repaird without writing out new trees, so it is just to
give hints to future people (iow I am not going to apply this patch
myself right now).

 builtin/update-index.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/builtin/update-index.c b/builtin/update-index.c
index ebea285..1ce2274 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -26,6 +26,7 @@ static int allow_remove;
 static int allow_replace;
 static int info_only;
 static int force_remove;
+static int update_cache_tree;
 static int verbose;
 static int mark_valid_only;
 static int mark_skip_worktree_only;
@@ -762,6 +763,8 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		OPT_BIT(0, "unmerged", &refresh_args.flags,
 			N_("refresh even if index contains unmerged entries"),
 			REFRESH_UNMERGED),
+		OPT_BOOL(0, "update-cache-tree", &update_cache_tree,
+			 N_("update cache-tree before writing the result out")),
 		{OPTION_CALLBACK, 0, "refresh", &refresh_args, NULL,
 			N_("refresh stat information"),
 			PARSE_OPT_NOARG | PARSE_OPT_NONEG,
@@ -918,6 +921,11 @@ int cmd_update_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
+	if (update_cache_tree && !unmerged_cache()) {
+		update_main_cache_tree(0);
+		active_cache_changed = 1; /* force write-out */
+	}
+
 	if (active_cache_changed) {
 		if (newfd < 0) {
 			if (refresh_args.flags & REFRESH_QUIET)

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit
  2014-07-08 17:05         ` Junio C Hamano
@ 2014-07-09  1:58           ` Duy Nguyen
  0 siblings, 0 replies; 15+ messages in thread
From: Duy Nguyen @ 2014-07-09  1:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: David Turner, Git Mailing List, David Turner

On Wed, Jul 9, 2014 at 12:05 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> I wonder if we need to update_main_cache_tree() so many times in this
>> function. If I read the code correctly, all roads must lead to
>> update_main_cache_tree(0) in prepare_to_commit().
>
> I think prepare-to-commit is too late; it does not want to know if
> the index it was given to create a tree out of is the one that the
> user will keep using after this invocation of "git commit" or just a
> temporary one used for partial commit.  The cache-tree rebuilt there
> is what is recorded with commit_tree_extended() in cmd_commit(), but
> if you are doing a partial commit, these generic code paths are
> given a temporary index file on the filesystem to work on, and
> cache-tree in that will not help user's later operation.

Right. Thanks for checking.
-- 
Duy

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2014-07-09  1:59 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-07-06  4:06 [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout David Turner
2014-07-06  4:06 ` [PATCH v4 2/4] test-dump-cache-tree: invalid trees are not errors David Turner
2014-07-07 19:27   ` Junio C Hamano
2014-07-06  4:06 ` [PATCH v4 3/4] cache-tree: subdirectory tests David Turner
2014-07-06  8:10   ` Eric Sunshine
2014-07-07 19:15   ` Junio C Hamano
2014-07-06  4:06 ` [PATCH v4 4/4] cache-tree: Write updated cache-tree after commit David Turner
2014-07-07 20:03   ` Junio C Hamano
2014-07-08  0:26     ` Junio C Hamano
2014-07-08 10:32       ` Duy Nguyen
2014-07-08 17:05         ` Junio C Hamano
2014-07-09  1:58           ` Duy Nguyen
2014-07-08 18:32         ` Junio C Hamano
2014-07-08 19:15         ` Junio C Hamano
2014-07-07 18:58 ` [PATCH v4 1/4] cache-tree: Create/update cache-tree on checkout Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.