All of lore.kernel.org
 help / color / mirror / Atom feed
* [WIP PATCH 0/6] Merging with D/F conflicts
@ 2009-09-20  5:06 Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 1/6] diff-lib.c: fix misleading comments on oneway_diff() Junio C Hamano
                   ` (5 more replies)
  0 siblings, 6 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-20  5:06 UTC (permalink / raw)
  To: git

I have been working on this on-and-off and it still is not finished, but I
thought it would be a good place to stop, especially since this will not
be part of the upcoming 1.6.5 anyway.

The first three patches are preparatory clean-ups.

The fourth patch roughly corresponds to Linus's 880386c (Prepare
'traverse_trees()' for D/F conflict lookahead, 2009-09-06), but it does
implement multi-entry lookahead.  It reveals why lookahead on the tree
side alone is not sufficient by breaking a few tests.

The fifth one starts to compensate for the change in the tree side by
preparing the side that walks the index for a similar lookahead mechanism,
but it does not actually implement the lookahead yet.

The last one is a debugging patch.

This change has to break the output order (but not content) of the
diff-index somewhat.  If you had this:

    Index    Tree
    b        b-2/c
    b-2      b/d

the expected output order from diff-index is b, b-2, b-2/c, then b/d.  But
if you walk the tree and the index in parallel, we would end up showing b,
b/d, b-2 and then b-2/c.  A sad part of the story is that diff-index
always emits D/F conflicted entries as two independent records, so it is
rather a bad match to the unpack_trees() framework to begin with.

The patches are designed to apply on 79b4fde (Merge branch 'maint',
2009-09-03).

Junio C Hamano (6):
  diff-lib.c: fix misleading comments on oneway_diff()
  unpack-trees: typofix
  unpack_callback(): use unpack_failed() consistently
  traverse_trees(): handle D/F conflict case sanely
  unpack-trees.c: prepare for looking ahead in the index
  read-tree --debug-unpack

 builtin-read-tree.c |   36 +++++++
 cache.h             |    2 +
 diff-lib.c          |   20 +----
 tree-walk.c         |  277 +++++++++++++++++++++++++++++++++++++++++++--------
 unpack-trees.c      |  275 ++++++++++++++++++++++++++++++++++++++------------
 unpack-trees.h      |    3 +-
 6 files changed, 484 insertions(+), 129 deletions(-)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [WIP PATCH 1/6] diff-lib.c: fix misleading comments on oneway_diff()
  2009-09-20  5:06 [WIP PATCH 0/6] Merging with D/F conflicts Junio C Hamano
@ 2009-09-20  5:06 ` Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 2/6] unpack-trees: typofix Junio C Hamano
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-20  5:06 UTC (permalink / raw)
  To: git

20a16eb (unpack_trees(): fix diff-index regression., 2008-03-10) adjusted
diff-index to the new world order since 34110cd (Make 'unpack_trees()'
have a separate source and destination index, 2008-03-06).  Callbacks are
expected to return anything non-negative as "success", and instead of
reporting how many index entries they have processed, they are expected to
advance o->pos themselves.  The code did so, but a stale comment was left
behind.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 diff-lib.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index 0c74ef5..adf1c5f 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -383,7 +383,7 @@ static inline void skip_same_name(struct cache_entry *ce, struct unpack_trees_op
  * For diffing, the index is more important, and we only have a
  * single tree.
  *
- * We're supposed to return how many index entries we want to skip.
+ * We're supposed to advance o->pos to skip what we have already processed.
  *
  * This wrapper makes it all more readable, and takes care of all
  * the fairly complex unpack_trees() semantic requirements, including
-- 
1.6.5.rc1.90.ga3b1b

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [WIP PATCH 2/6] unpack-trees: typofix
  2009-09-20  5:06 [WIP PATCH 0/6] Merging with D/F conflicts Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 1/6] diff-lib.c: fix misleading comments on oneway_diff() Junio C Hamano
@ 2009-09-20  5:06 ` Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 3/6] unpack_callback(): use unpack_failed() consistently Junio C Hamano
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-20  5:06 UTC (permalink / raw)
  To: git

I am not good at subject-verb concordance.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 unpack-trees.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index 720f7a1..d174fe0 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -617,7 +617,7 @@ static int verify_absent(struct cache_entry *ce, const char *action,
 			 * found "foo/." in the working tree.
 			 * This is tricky -- if we have modified
 			 * files that are in "foo/" we would lose
-			 * it.
+			 * them.
 			 */
 			ret = verify_clean_subdirectory(ce, action, o);
 			if (ret < 0)
-- 
1.6.5.rc1.90.ga3b1b

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [WIP PATCH 3/6] unpack_callback(): use unpack_failed() consistently
  2009-09-20  5:06 [WIP PATCH 0/6] Merging with D/F conflicts Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 1/6] diff-lib.c: fix misleading comments on oneway_diff() Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 2/6] unpack-trees: typofix Junio C Hamano
@ 2009-09-20  5:06 ` Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 4/6] traverse_trees(): handle D/F conflict case sanely Junio C Hamano
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-20  5:06 UTC (permalink / raw)
  To: git

When unpack_index_entry() failed, consistently call unpack_failed(),
instead of silently returning -1.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 unpack-trees.c |   24 ++++++++++++------------
 1 files changed, 12 insertions(+), 12 deletions(-)

diff --git a/unpack-trees.c b/unpack-trees.c
index d174fe0..c424bab 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -277,6 +277,17 @@ static int unpack_nondirectories(int n, unsigned long mask,
 	return 0;
 }
 
+static int unpack_failed(struct unpack_trees_options *o, const char *message)
+{
+	discard_index(&o->result);
+	if (!o->gently) {
+		if (message)
+			return error("%s", message);
+		return -1;
+	}
+	return -1;
+}
+
 static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, struct name_entry *names, struct traverse_info *info)
 {
 	struct cache_entry *src[MAX_UNPACK_TREES + 1] = { NULL, };
@@ -294,7 +305,7 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 			int cmp = compare_entry(ce, info, p);
 			if (cmp < 0) {
 				if (unpack_index_entry(ce, o) < 0)
-					return -1;
+					return unpack_failed(o, NULL);
 				continue;
 			}
 			if (!cmp) {
@@ -352,17 +363,6 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 	return mask;
 }
 
-static int unpack_failed(struct unpack_trees_options *o, const char *message)
-{
-	discard_index(&o->result);
-	if (!o->gently) {
-		if (message)
-			return error("%s", message);
-		return -1;
-	}
-	return -1;
-}
-
 /*
  * N-way merge "len" trees.  Returns 0 on success, -1 on failure to manipulate the
  * resulting index, -2 on failure to reflect the changes to the work tree.
-- 
1.6.5.rc1.90.ga3b1b

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [WIP PATCH 4/6] traverse_trees(): handle D/F conflict case sanely
  2009-09-20  5:06 [WIP PATCH 0/6] Merging with D/F conflicts Junio C Hamano
                   ` (2 preceding siblings ...)
  2009-09-20  5:06 ` [WIP PATCH 3/6] unpack_callback(): use unpack_failed() consistently Junio C Hamano
@ 2009-09-20  5:06 ` Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 5/6] unpack-trees.c: prepare for looking ahead in the index Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 6/6] read-tree --debug-unpack Junio C Hamano
  5 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-20  5:06 UTC (permalink / raw)
  To: git

traverse_trees() is supposed to call its callback with all the matching
entries from the given trees.  The current algorithm keeps a pointer to
each of the tree being traversed, and feeds the entry with the earliest
name to the callback.

This breaks down if the trees being traversed looks like this:

    A    B
    t-1  t
    t-2  u
    t/a  v

When we are currently looking at an entry "t-1" in tree A, and tree B has
returned "t", feeding "t" from the B and not feeding anything from A, only
because "t-1" sorts later than "t", will miss an entry for a subtree "t"
behind the current entry in tree A.

This introduces extended_entry_extract() helper function that gives what
name is expected from the tree, and implements a mechanism to look-ahead
in the tree object using it, to make sure such a case is handled sanely.
Traversal in tree A in the above example will first return "t" to match
that of B, and then the next request for an entry to A then returns "t-1".

This roughly corresponds to what Linus's "prepare for one-entry lookahead"
wanted to do, but because this does implement look ahead, t6035 and t7003
reveal that the approach would not work without adjusting the side that
walks the index in unpack_trees() as well.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 tree-walk.c |  277 +++++++++++++++++++++++++++++++++++++++++++++++++---------
 1 files changed, 234 insertions(+), 43 deletions(-)

diff --git a/tree-walk.c b/tree-walk.c
index 02e2aed..08796c2 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -60,13 +60,6 @@ void *fill_tree_descriptor(struct tree_desc *desc, const unsigned char *sha1)
 	return buf;
 }
 
-static int entry_compare(struct name_entry *a, struct name_entry *b)
-{
-	return df_name_compare(
-			a->path, tree_entry_len(a->path, a->sha1), a->mode,
-			b->path, tree_entry_len(b->path, b->sha1), b->mode);
-}
-
 static void entry_clear(struct name_entry *a)
 {
 	memset(a, 0, sizeof(*a));
@@ -138,66 +131,264 @@ char *make_traverse_path(char *path, const struct traverse_info *info, const str
 	return path;
 }
 
+struct tree_desc_skip {
+	struct tree_desc_skip *prev;
+	const void *ptr;
+};
+
+struct tree_desc_x {
+	struct tree_desc d;
+	struct tree_desc_skip *skip;
+};
+
+static int name_compare(const char *a, int a_len,
+			const char *b, int b_len)
+{
+	int len = (a_len < b_len) ? a_len : b_len;
+	int cmp = memcmp(a, b, len);
+	if (cmp)
+		return cmp;
+	return (a_len - b_len);
+}
+
+static int check_entry_match(const char *a, int a_len, const char *b, int b_len)
+{
+	/*
+	 * The caller wants to pick *a* from a tree or nothing.
+	 * We are looking at *b* in a tree.
+	 *
+	 * (0) If a and b are the same name, we are trivially happy.
+	 *
+	 * There are three possibilities where *a* could be hiding
+	 * behind *b*.
+	 *
+	 * (1) *a* == "t",   *b* == "ab"  i.e. *b* sorts earlier than *a* no
+	 *                                matter what.
+	 * (2) *a* == "t",   *b* == "t-2" and "t" is a subtree in the tree;
+	 * (3) *a* == "t-2", *b* == "t"   and "t-2" is a blob in the tree.
+	 *
+	 * Otherwise we know *a* won't appear in the tree without
+	 * scanning further.
+	 */
+
+	int cmp = name_compare(a, a_len, b, b_len);
+
+	/* Most common case first -- reading sync'd trees */
+	if (!cmp)
+		return cmp;
+
+	if (0 < cmp) {
+		/* a comes after b; it does not matter if it is case (3)
+		if (b_len < a_len && !memcmp(a, b, b_len) && a[b_len] < '/')
+			return 1;
+		*/
+		return 1; /* keep looking */
+	}
+
+	/* b comes after a; are we looking at case (2)? */
+	if (a_len < b_len && !memcmp(a, b, a_len) && b[a_len] < '/')
+		return 1; /* keep looking */
+
+	return -1; /* a cannot appear in the tree */
+}
+
+/*
+ * From the extended tree_desc, extract the first name entry, while
+ * paying attention to the candidate "first" name.  Most importantly,
+ * when looking for an entry, if there are entries that sorts earlier
+ * in the tree object representation than that name, skip them and
+ * process the named entry first.  We will remember that we haven't
+ * processed the first entry yet, and in the later call skip the
+ * entry we processed early when update_extended_entry() is called.
+ *
+ * E.g. if the underlying tree object has these entries:
+ *
+ *    blob    "t-1"
+ *    blob    "t-2"
+ *    tree    "t"
+ *    blob    "t=1"
+ *
+ * and the "first" asks for "t", remember that we still need to
+ * process "t-1" and "t-2" but extract "t".  After processing the
+ * entry "t" from this call, the caller will let us know by calling
+ * update_extended_entry() that we can remember "t" has been processed
+ * already.
+ */
+
+static void extended_entry_extract(struct tree_desc_x *t,
+				   struct name_entry *a,
+				   const char *first,
+				   int first_len)
+{
+	const char *path;
+	int len;
+	struct tree_desc probe;
+	struct tree_desc_skip *skip;
+
+	/*
+	 * Extract the first entry from the tree_desc, but skip the
+	 * ones that we already returned in earlier rounds.
+	 */
+	while (1) {
+		if (!t->d.size) {
+			entry_clear(a);
+			break; /* not found */
+		}
+		entry_extract(&t->d, a);
+		for (skip = t->skip; skip; skip = skip->prev)
+			if (a->path == skip->ptr)
+				break; /* found */
+		if (!skip)
+			break;
+		/* We have processed this entry already. */
+		update_tree_entry(&t->d);
+	}
+
+	if (!first || !a->path)
+		return;
+
+	/*
+	 * The caller wants "first" from this tree, or nothing.
+	 */
+	path = a->path;
+	len = tree_entry_len(a->path, a->sha1);
+	switch (check_entry_match(first, first_len, path, len)) {
+	case -1:
+		entry_clear(a);
+	case 0:
+		return;
+	default:
+		break;
+	}
+
+	/*
+	 * We need to look-ahead -- we suspect that a subtree whose
+	 * name is "first" may be hiding behind the current entry "path".
+	 */
+	probe = t->d;
+	while (probe.size) {
+		entry_extract(&probe, a);
+		path = a->path;
+		len = tree_entry_len(a->path, a->sha1);
+		switch (check_entry_match(first, first_len, path, len)) {
+		case -1:
+			entry_clear(a);
+		case 0:
+			return;
+		default:
+			update_tree_entry(&probe);
+			break;
+		}
+		/* keep looking */
+	}
+	entry_clear(a);
+}
+
+static void update_extended_entry(struct tree_desc_x *t, struct name_entry *a)
+{
+	if (t->d.entry.path == a->path) {
+		update_tree_entry(&t->d);
+	} else {
+		/* we have returned this entry early */
+		struct tree_desc_skip *skip = xmalloc(sizeof(*skip));
+		skip->ptr = a->path;
+		skip->prev = t->skip;
+		t->skip = skip;
+	}
+}
+
+static void free_extended_entry(struct tree_desc_x *t)
+{
+	struct tree_desc_skip *p, *s;
+
+	for (s = t->skip; s; s = p) {
+		p = s->prev;
+		free(s);
+	}
+}
+
 int traverse_trees(int n, struct tree_desc *t, struct traverse_info *info)
 {
 	int ret = 0;
 	struct name_entry *entry = xmalloc(n*sizeof(*entry));
+	int i;
+	struct tree_desc_x *tx = xcalloc(n, sizeof(*tx));
+
+	for (i = 0; i < n; i++)
+		tx[i].d = t[i];
 
 	for (;;) {
-		unsigned long mask = 0;
-		unsigned long dirmask = 0;
-		int i, last;
+		unsigned long mask, dirmask;
+		const char *first = NULL;
+		int first_len = 0;
+		struct name_entry *e;
+		int len;
 
-		last = -1;
 		for (i = 0; i < n; i++) {
-			if (!t[i].size)
+			e = entry + i;
+			extended_entry_extract(tx + i, e, NULL, 0);
+		}
+
+		/*
+		 * A tree may have "t-2" at the current location even
+		 * though it may have "t" that is a subtree behind it,
+		 * and another tree may return "t".  We want to grab
+		 * all "t" from all trees to match in such a case.
+		 */
+		for (i = 0; i < n; i++) {
+			e = entry + i;
+			if (!e->path)
 				continue;
-			entry_extract(t+i, entry+i);
-			if (last >= 0) {
-				int cmp = entry_compare(entry+i, entry+last);
-
-				/*
-				 * Is the new name bigger than the old one?
-				 * Ignore it
-				 */
-				if (cmp > 0)
+			len = tree_entry_len(e->path, e->sha1);
+			if (!first) {
+				first = e->path;
+				first_len = len;
+				continue;
+			}
+			if (name_compare(e->path, len, first, first_len) < 0) {
+				first = e->path;
+				first_len = len;
+			}
+		}
+
+		if (first) {
+			for (i = 0; i < n; i++) {
+				e = entry + i;
+				extended_entry_extract(tx + i, e, first, first_len);
+				/* Cull the ones that are not the earliest */
+				if (!e->path)
 					continue;
-				/*
-				 * Is the new name smaller than the old one?
-				 * Ignore all old ones
-				 */
-				if (cmp < 0)
-					mask = 0;
+				len = tree_entry_len(e->path, e->sha1);
+				if (name_compare(e->path, len, first, first_len))
+					entry_clear(e);
 			}
+		}
+
+		/* Now we have in entry[i] the earliest name from the trees */
+		mask = 0;
+		dirmask = 0;
+		for (i = 0; i < n; i++) {
+			if (!entry[i].path)
+				continue;
 			mask |= 1ul << i;
 			if (S_ISDIR(entry[i].mode))
 				dirmask |= 1ul << i;
-			last = i;
 		}
 		if (!mask)
 			break;
-		dirmask &= mask;
-
-		/*
-		 * Clear all the unused name-entries.
-		 */
-		for (i = 0; i < n; i++) {
-			if (mask & (1ul << i))
-				continue;
-			entry_clear(entry + i);
-		}
 		ret = info->fn(n, mask, dirmask, entry, info);
 		if (ret < 0)
 			break;
-		if (ret)
-			mask &= ret;
+		mask &= ret;
 		ret = 0;
-		for (i = 0; i < n; i++) {
+		for (i = 0; i < n; i++)
 			if (mask & (1ul << i))
-				update_tree_entry(t + i);
-		}
+				update_extended_entry(tx + i, entry + i);
 	}
 	free(entry);
+	for (i = 0; i < n; i++)
+		free_extended_entry(tx + i);
+	free(tx);
 	return ret;
 }
 
-- 
1.6.5.rc1.90.ga3b1b

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [WIP PATCH 5/6] unpack-trees.c: prepare for looking ahead in the index
  2009-09-20  5:06 [WIP PATCH 0/6] Merging with D/F conflicts Junio C Hamano
                   ` (3 preceding siblings ...)
  2009-09-20  5:06 ` [WIP PATCH 4/6] traverse_trees(): handle D/F conflict case sanely Junio C Hamano
@ 2009-09-20  5:06 ` Junio C Hamano
  2009-09-20  5:06 ` [WIP PATCH 6/6] read-tree --debug-unpack Junio C Hamano
  5 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-20  5:06 UTC (permalink / raw)
  To: git

This prepares but does not yet implement a look-ahead in the index entries
when traverse-trees.c decides to give us tree entries in an order that
does not match what is in the index.

A case where a look-ahead in the index is necessary happens when merging
branch B into branch A while the index matches the current branch A, using
a tree O as their common ancestor, and these three trees looks like this:

   O        A       B
   t                t
   t-i      t-i     t-i
   t-j      t-j
            t/1
            t/2

The traverse_trees() function gets "t", "t-i" and "t" from trees O, A and
B first, and notices that A may have a matching "t" behind "t-i" and "t-j"
(indeed it does), and tells A to give that entry instead.  After unpacking
blob "t" from tree B (as it hasn't changed since O in B and A removed it,
it will result in its removal), it descends into directory "t/".

The side that walked index in parallel to the tree traversal used to be
implemented with one pointer, o->pos, that points at the next index entry
to be processed.  When this happens, the pointer o->pos still points at
"t-i" that is the first entry.  We should be able to skip "t-i" and "t-j"
and locate "t/1" from the index while the recursive invocation of
traverse_trees() walks and match entries found there, and later come back
to process "t-i".

While that look-ahead is not implemented yet, this adds a flag bit,
CE_UNPACKED, to mark the entries in the index that has already been
processed.  o->pos pointer has been renamed to o->cache_bottom and it
points at the first entry that may still need to be processed.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * t6035 is still broken with this patch, although t7003 is fixed.

 cache.h        |    2 +
 diff-lib.c     |   18 -----
 unpack-trees.c |  214 ++++++++++++++++++++++++++++++++++++++++++--------------
 unpack-trees.h |    2 +-
 4 files changed, 164 insertions(+), 72 deletions(-)

diff --git a/cache.h b/cache.h
index 5fad24c..18a3e13 100644
--- a/cache.h
+++ b/cache.h
@@ -177,6 +177,8 @@ struct cache_entry {
 #define CE_HASHED    (0x100000)
 #define CE_UNHASHED  (0x200000)
 
+#define CE_UNPACKED  (0x400000)
+
 /*
  * Extended on-disk flags
  */
diff --git a/diff-lib.c b/diff-lib.c
index adf1c5f..f759917 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -359,21 +359,6 @@ static void do_oneway_diff(struct unpack_trees_options *o,
 	show_modified(revs, tree, idx, 1, cached, match_missing);
 }
 
-static inline void skip_same_name(struct cache_entry *ce, struct unpack_trees_options *o)
-{
-	int len = ce_namelen(ce);
-	const struct index_state *index = o->src_index;
-
-	while (o->pos < index->cache_nr) {
-		struct cache_entry *next = index->cache[o->pos];
-		if (len != ce_namelen(next))
-			break;
-		if (memcmp(ce->name, next->name, len))
-			break;
-		o->pos++;
-	}
-}
-
 /*
  * The unpack_trees() interface is designed for merging, so
  * the different source entries are designed primarily for
@@ -395,9 +380,6 @@ static int oneway_diff(struct cache_entry **src, struct unpack_trees_options *o)
 	struct cache_entry *tree = src[1];
 	struct rev_info *revs = o->unpack_data;
 
-	if (idx && ce_stage(idx))
-		skip_same_name(idx, o);
-
 	/*
 	 * Unpack-trees generates a DF/conflict entry if
 	 * there was a directory in the index and a tree
diff --git a/unpack-trees.c b/unpack-trees.c
index c424bab..a6f2187 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -126,18 +126,109 @@ static inline int call_unpack_fn(struct cache_entry **src, struct unpack_trees_o
 	return ret;
 }
 
-static int unpack_index_entry(struct cache_entry *ce, struct unpack_trees_options *o)
+static void mark_ce_used(struct cache_entry *ce, struct unpack_trees_options *o)
+{
+	ce->ce_flags |= CE_UNPACKED;
+
+	if (o->cache_bottom < o->src_index->cache_nr &&
+	    o->src_index->cache[o->cache_bottom] == ce) {
+		int bottom = o->cache_bottom;
+		while (bottom < o->src_index->cache_nr &&
+		       o->src_index->cache[bottom]->ce_flags & CE_UNPACKED)
+			bottom++;
+		o->cache_bottom = bottom;
+	}
+}
+
+static void mark_all_ce_unused(struct index_state *index)
+{
+	int i;
+	for (i = 0; i < index->cache_nr; i++)
+		index->cache[i]->ce_flags &= ~CE_UNPACKED;
+}
+
+static int locate_in_src_index(struct cache_entry *ce,
+			       struct unpack_trees_options *o)
+{
+	struct index_state *index = o->src_index;
+	int len = ce_namelen(ce);
+	int pos = index_name_pos(index, ce->name, len);
+	if (pos < 0)
+		pos = -1 - pos;
+	return pos;
+}
+
+/*
+ * We call unpack_index_entry() with an unmerged cache entry
+ * only in diff-index, and it wants a single callback.  Skip
+ * the other unmerged entry with the same name.
+ */
+static void mark_ce_used_same_name(struct cache_entry *ce,
+				   struct unpack_trees_options *o)
+{
+	struct index_state *index = o->src_index;
+	int len = ce_namelen(ce);
+	int pos;
+
+	for (pos = locate_in_src_index(ce, o); pos < index->cache_nr; pos++) {
+		struct cache_entry *next = index->cache[pos];
+		if (len != ce_namelen(next) ||
+		    memcmp(ce->name, next->name, len))
+			break;
+		mark_ce_used(next, o);
+	}
+}
+
+static struct cache_entry *next_cache_entry(struct unpack_trees_options *o)
+{
+	const struct index_state *index = o->src_index;
+	int pos = o->cache_bottom;
+
+	while (pos < index->cache_nr) {
+		struct cache_entry *ce = index->cache[pos];
+		if (!(ce->ce_flags & CE_UNPACKED))
+			return ce;
+		pos++;
+	}
+	return NULL;
+}
+
+static void add_same_unmerged(struct cache_entry *ce,
+			      struct unpack_trees_options *o)
+{
+	struct index_state *index = o->src_index;
+	int len = ce_namelen(ce);
+	int pos = index_name_pos(index, ce->name, len);
+
+	if (0 <= pos)
+		die("programming error in a caller of mark_ce_used_same_name");
+	for (pos = -pos - 1; pos < index->cache_nr; pos++) {
+		struct cache_entry *next = index->cache[pos];
+		if (len != ce_namelen(next) ||
+		    memcmp(ce->name, next->name, len))
+			break;
+		add_entry(o, next, 0, 0);
+		mark_ce_used(next, o);
+	}
+}
+
+static int unpack_index_entry(struct cache_entry *ce,
+			      struct unpack_trees_options *o)
 {
 	struct cache_entry *src[5] = { ce, NULL, };
+	int ret;
 
-	o->pos++;
+	mark_ce_used(ce, o);
 	if (ce_stage(ce)) {
 		if (o->skip_unmerged) {
 			add_entry(o, ce, 0, 0);
 			return 0;
 		}
 	}
-	return call_unpack_fn(src, o);
+	ret = call_unpack_fn(src, o);
+	if (ce_stage(ce))
+		mark_ce_used_same_name(ce, o);
+	return ret;
 }
 
 static int traverse_trees_recursive(int n, unsigned long dirmask, unsigned long df_conflicts, struct name_entry *names, struct traverse_info *info)
@@ -212,6 +303,20 @@ static int compare_entry(const struct cache_entry *ce, const struct traverse_inf
 	return ce_namelen(ce) > traverse_path_len(info, n);
 }
 
+static int ce_in_traverse_path(const struct cache_entry *ce,
+			       const struct traverse_info *info)
+{
+	if (!info->prev)
+		return 1;
+	if (do_compare_entry(ce, info->prev, &info->name))
+		return 0;
+	/*
+	 * If ce (blob) is the same name as the path (which is a tree
+	 * we will be descending into), it won't be inside it.
+	 */
+	return (info->pathlen < ce_namelen(ce));
+}
+
 static struct cache_entry *create_ce_entry(const struct traverse_info *info, const struct name_entry *n, int stage)
 {
 	int len = traverse_path_len(info, n);
@@ -300,23 +405,27 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 
 	/* Are we supposed to look at the index too? */
 	if (o->merge) {
-		while (o->pos < o->src_index->cache_nr) {
-			struct cache_entry *ce = o->src_index->cache[o->pos];
-			int cmp = compare_entry(ce, info, p);
+		while (1) {
+			struct cache_entry *ce = next_cache_entry(o);
+			int cmp;
+			if (!ce)
+				break;
+			cmp = compare_entry(ce, info, p);
 			if (cmp < 0) {
 				if (unpack_index_entry(ce, o) < 0)
 					return unpack_failed(o, NULL);
 				continue;
 			}
 			if (!cmp) {
-				o->pos++;
 				if (ce_stage(ce)) {
 					/*
-					 * If we skip unmerged index entries, we'll skip this
-					 * entry *and* the tree entries associated with it!
+					 * If we skip unmerged index
+					 * entries, we'll skip this
+					 * entry *and* the tree
+					 * entries associated with it!
 					 */
 					if (o->skip_unmerged) {
-						add_entry(o, ce, 0, 0);
+						add_same_unmerged(ce, o);
 						return mask;
 					}
 				}
@@ -329,6 +438,13 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 	if (unpack_nondirectories(n, mask, dirmask, src, names, info) < 0)
 		return -1;
 
+	if (src[0]) {
+		if (ce_stage(src[0]))
+			mark_ce_used_same_name(src[0], o);
+		else
+			mark_ce_used(src[0], o);
+	}
+
 	/* Now handle any directories.. */
 	if (dirmask) {
 		unsigned long conflicts = mask & ~dirmask;
@@ -338,22 +454,6 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 				conflicts |= 1;
 		}
 
-		/* special case: "diff-index --cached" looking at a tree */
-		if (o->diff_index_cached &&
-		    n == 1 && dirmask == 1 && S_ISDIR(names->mode)) {
-			int matches;
-			matches = cache_tree_matches_traversal(o->src_index->cache_tree,
-							       names, info);
-			/*
-			 * Everything under the name matches.  Adjust o->pos to
-			 * skip the entire hierarchy.
-			 */
-			if (matches) {
-				o->pos += matches;
-				return mask;
-			}
-		}
-
 		if (traverse_trees_recursive(n, dirmask, conflicts,
 					     names, info) < 0)
 			return -1;
@@ -400,14 +500,33 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 		info.fn = unpack_callback;
 		info.data = o;
 
+		if (o->prefix) {
+			/*
+			 * Unpack existing index entries that sort before the
+			 * prefix the tree is spliced into.  Note that o->merge
+			 * is always true in this case.
+			 */
+			while (1) {
+				struct cache_entry *ce = next_cache_entry(o);
+				if (!ce)
+					break;
+				if (ce_in_traverse_path(ce, &info))
+					break;
+				if (unpack_index_entry(ce, o) < 0)
+					return unpack_failed(o, NULL);
+			}
+		}
+
 		if (traverse_trees(len, t, &info) < 0)
 			return unpack_failed(o, NULL);
 	}
 
 	/* Any left-over entries in the index? */
 	if (o->merge) {
-		while (o->pos < o->src_index->cache_nr) {
-			struct cache_entry *ce = o->src_index->cache[o->pos];
+		while (1) {
+			struct cache_entry *ce = next_cache_entry(o);
+			if (!ce)
+				break;
 			if (unpack_index_entry(ce, o) < 0)
 				return unpack_failed(o, NULL);
 		}
@@ -416,6 +535,12 @@ int unpack_trees(unsigned len, struct tree_desc *t, struct unpack_trees_options
 	if (o->trivial_merges_only && o->nontrivial_merge)
 		return unpack_failed(o, "Merge requires file-level merging");
 
+	/*
+	 * some callers, most notably "git status -v" runs unpack_trees()
+	 * multiple times; clean everything up after us.
+	 */
+	mark_all_ce_unused(o->src_index);
+
 	o->src_index = NULL;
 	ret = check_updates(o) ? (-2) : 0;
 	if (o->dst_index)
@@ -522,7 +647,9 @@ static int verify_clean_subdirectory(struct cache_entry *ce, const char *action,
 	 * in that directory.
 	 */
 	namelen = strlen(ce->name);
-	for (i = o->pos; i < o->src_index->cache_nr; i++) {
+	for (i = locate_in_src_index(ce, o);
+	     i < o->src_index->cache_nr;
+	     i++) {
 		struct cache_entry *ce2 = o->src_index->cache[i];
 		int len = ce_namelen(ce2);
 		if (len < namelen ||
@@ -530,12 +657,14 @@ static int verify_clean_subdirectory(struct cache_entry *ce, const char *action,
 		    ce2->name[namelen] != '/')
 			break;
 		/*
-		 * ce2->name is an entry in the subdirectory.
+		 * ce2->name is an entry in the subdirectory to be
+		 * removed.
 		 */
 		if (!ce_stage(ce2)) {
 			if (verify_uptodate(ce2, o))
 				return -1;
 			add_entry(o, ce2, CE_REMOVE, 0);
+			mark_ce_used(ce2, o);
 		}
 		cnt++;
 	}
@@ -591,7 +720,6 @@ static int verify_absent(struct cache_entry *ce, const char *action,
 		return 0;
 
 	if (!lstat(ce->name, &st)) {
-		int ret;
 		int dtype = ce_to_dtype(ce);
 		struct cache_entry *result;
 
@@ -619,28 +747,8 @@ static int verify_absent(struct cache_entry *ce, const char *action,
 			 * files that are in "foo/" we would lose
 			 * them.
 			 */
-			ret = verify_clean_subdirectory(ce, action, o);
-			if (ret < 0)
-				return ret;
-
-			/*
-			 * If this removed entries from the index,
-			 * what that means is:
-			 *
-			 * (1) the caller unpack_callback() saw path/foo
-			 * in the index, and it has not removed it because
-			 * it thinks it is handling 'path' as blob with
-			 * D/F conflict;
-			 * (2) we will return "ok, we placed a merged entry
-			 * in the index" which would cause o->pos to be
-			 * incremented by one;
-			 * (3) however, original o->pos now has 'path/foo'
-			 * marked with "to be removed".
-			 *
-			 * We need to increment it by the number of
-			 * deleted entries here.
-			 */
-			o->pos += ret;
+			if (verify_clean_subdirectory(ce, action, o) < 0)
+				return -1;
 			return 0;
 		}
 
diff --git a/unpack-trees.h b/unpack-trees.h
index d19df44..9a0733e 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -30,7 +30,7 @@ struct unpack_trees_options {
 		     diff_index_cached,
 		     gently;
 	const char *prefix;
-	int pos;
+	int cache_bottom;
 	struct dir_struct *dir;
 	merge_fn_t fn;
 	struct unpack_trees_error_msgs msgs;
-- 
1.6.5.rc1.90.ga3b1b

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [WIP PATCH 6/6] read-tree --debug-unpack
  2009-09-20  5:06 [WIP PATCH 0/6] Merging with D/F conflicts Junio C Hamano
                   ` (4 preceding siblings ...)
  2009-09-20  5:06 ` [WIP PATCH 5/6] unpack-trees.c: prepare for looking ahead in the index Junio C Hamano
@ 2009-09-20  5:06 ` Junio C Hamano
  5 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-20  5:06 UTC (permalink / raw)
  To: git

Debugging patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 builtin-read-tree.c |   36 ++++++++++++++++++++++++++++++++++++
 unpack-trees.c      |   35 +++++++++++++++++++++++++++++++++++
 unpack-trees.h      |    1 +
 3 files changed, 72 insertions(+), 0 deletions(-)

diff --git a/builtin-read-tree.c b/builtin-read-tree.c
index 14c836b..a9788e5 100644
--- a/builtin-read-tree.c
+++ b/builtin-read-tree.c
@@ -64,6 +64,34 @@ static int exclude_per_directory_cb(const struct option *opt, const char *arg,
 	return 0;
 }
 
+static void debug_stage(const char *label, struct cache_entry *ce,
+			struct unpack_trees_options *o)
+{
+	printf("%s ", label);
+	if (!ce)
+		printf("(missing)\n");
+	else if (ce == o->df_conflict_entry)
+		printf("(conflict)\n");
+	else
+		printf("%06o #%d %s %.8s\n",
+		       ce->ce_mode, ce_stage(ce), ce->name,
+		       sha1_to_hex(ce->sha1));
+}
+
+static int debug_merge(struct cache_entry **stages, struct unpack_trees_options *o)
+{
+	int i;
+
+	printf("* %d-way merge\n", o->merge_size);
+	debug_stage("index", stages[0], o);
+	for (i = 1; i <= o->merge_size; i++) {
+		char buf[24];
+		sprintf(buf, "ent#%d", i);
+		debug_stage(buf, stages[i], o);
+	}
+	return 0;
+}
+
 static struct lock_file lock_file;
 
 int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
@@ -98,6 +126,8 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 		  PARSE_OPT_NONEG, exclude_per_directory_cb },
 		OPT_SET_INT('i', NULL, &opts.index_only,
 			    "don't check the working tree after merging", 1),
+		OPT_SET_INT(0, "debug-unpack", &opts.debug_unpack,
+			    "debug unpack-trees", 1),
 		OPT_END()
 	};
 
@@ -165,6 +195,9 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 			opts.head_idx = 1;
 	}
 
+	if (opts.debug_unpack)
+		opts.fn = debug_merge;
+
 	cache_tree_free(&active_cache_tree);
 	for (i = 0; i < nr_trees; i++) {
 		struct tree *tree = trees[i];
@@ -174,6 +207,9 @@ int cmd_read_tree(int argc, const char **argv, const char *unused_prefix)
 	if (unpack_trees(nr_trees, t, &opts))
 		return 128;
 
+	if (opts.debug_unpack)
+		return 0; /* do not write the index out */
+
 	/*
 	 * When reading only one tree (either the most basic form,
 	 * "-m ent" or "--reset ent" form), we can obtain a fully
diff --git a/unpack-trees.c b/unpack-trees.c
index a6f2187..b1ad2d9 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -393,6 +393,38 @@ static int unpack_failed(struct unpack_trees_options *o, const char *message)
 	return -1;
 }
 
+static void debug_path(struct traverse_info *info)
+{
+	if (info->prev) {
+		debug_path(info->prev);
+		if (*info->prev->name.path)
+			putchar('/');
+	}
+	printf("%s", info->name.path);
+}
+
+static void debug_name_entry(int i, struct name_entry *n)
+{
+	printf("ent#%d %06o %s\n", i,
+	       n->path ? n->mode : 0,
+	       n->path ? n->path : "(missing)");
+}
+
+static void debug_unpack_callback(int n,
+				  unsigned long mask,
+				  unsigned long dirmask,
+				  struct name_entry *names,
+				  struct traverse_info *info)
+{
+	int i;
+	printf("* unpack mask %lu, dirmask %lu, cnt %d ",
+	       mask, dirmask, n);
+	debug_path(info);
+	putchar('\n');
+	for (i = 0; i < n; i++)
+		debug_name_entry(i, names + i);
+}
+
 static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, struct name_entry *names, struct traverse_info *info)
 {
 	struct cache_entry *src[MAX_UNPACK_TREES + 1] = { NULL, };
@@ -403,6 +435,9 @@ static int unpack_callback(int n, unsigned long mask, unsigned long dirmask, str
 	while (!p->mode)
 		p++;
 
+	if (o->debug_unpack)
+		debug_unpack_callback(n, mask, dirmask, names, info);
+
 	/* Are we supposed to look at the index too? */
 	if (o->merge) {
 		while (1) {
diff --git a/unpack-trees.h b/unpack-trees.h
index 9a0733e..701dca5 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -28,6 +28,7 @@ struct unpack_trees_options {
 		     skip_unmerged,
 		     initial_checkout,
 		     diff_index_cached,
+		     debug_unpack,
 		     gently;
 	const char *prefix;
 	int cache_bottom;
-- 
1.6.5.rc1.90.ga3b1b

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-09-20  5:06 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-20  5:06 [WIP PATCH 0/6] Merging with D/F conflicts Junio C Hamano
2009-09-20  5:06 ` [WIP PATCH 1/6] diff-lib.c: fix misleading comments on oneway_diff() Junio C Hamano
2009-09-20  5:06 ` [WIP PATCH 2/6] unpack-trees: typofix Junio C Hamano
2009-09-20  5:06 ` [WIP PATCH 3/6] unpack_callback(): use unpack_failed() consistently Junio C Hamano
2009-09-20  5:06 ` [WIP PATCH 4/6] traverse_trees(): handle D/F conflict case sanely Junio C Hamano
2009-09-20  5:06 ` [WIP PATCH 5/6] unpack-trees.c: prepare for looking ahead in the index Junio C Hamano
2009-09-20  5:06 ` [WIP PATCH 6/6] read-tree --debug-unpack Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.