All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] diff --cc: relax path filtering
@ 2015-04-02 20:34 Max Kirillov
  2015-04-02 20:34 ` [PATCH 1/4] Add test for showing discarded changes with diff --cc Max Kirillov
                   ` (5 more replies)
  0 siblings, 6 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-02 20:34 UTC (permalink / raw)
  To: Kirill Smelkov, Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

For diff --cc, paths fitering used to select only paths which have
changed in all parents, while diffing itself output hunks which are
changed in as few as 2 parents.

Fix intersect_paths() to add paths which have at least 2 changed
parents.

Intersects with branch 'bc/object-id' which is not yet in master. This is
rebased on top of it.

Max Kirillov (4):
  Add test for showing discarded changes with diff --cc
  combine-diff.c: refactor: extract insert_path()
  diff --cc: relax too strict paths picking
  t4059: rewrite to be adaptive to hunk filtering

 combine-diff.c                  |  95 ++++++++++++++++++-----------
 t/t4059-diff-merge-with-base.sh | 132 ++++++++++++++++++++++++++++++++++++++++
 2 files changed, 193 insertions(+), 34 deletions(-)
 create mode 100755 t/t4059-diff-merge-with-base.sh

-- 
2.3.4.2801.g3d0809b

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH 1/4] Add test for showing discarded changes with diff --cc
  2015-04-02 20:34 [PATCH 0/4] diff --cc: relax path filtering Max Kirillov
@ 2015-04-02 20:34 ` Max Kirillov
  2015-04-02 20:55   ` Junio C Hamano
  2015-04-02 20:34 ` [PATCH 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Max Kirillov @ 2015-04-02 20:34 UTC (permalink / raw)
  To: Kirill Smelkov, Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

If a hunk has been changed in both branches and conflict resolution
fully takes one of branches, discarding all changes that are in the
other, then the merge is not shown in 3-way diff which git uses by
default.

The advice is to use flag --cc and and explicitly add the mergebase was
given in $gmane/191557. It was discovered though that it does not always
work. If the whole file has not changed at all compared to one of
branches then no difference is shown for this file.

This looks inconsistent and for particular scenario of viewing discarded
changes is undesirable.

Add the test which demonstrates the issue.

Signed-off-by: Max Kirillov <max@max630.net>
---
 t/t4059-diff-merge-with-base.sh | 100 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 100 insertions(+)
 create mode 100755 t/t4059-diff-merge-with-base.sh

diff --git a/t/t4059-diff-merge-with-base.sh b/t/t4059-diff-merge-with-base.sh
new file mode 100755
index 0000000..e650a33
--- /dev/null
+++ b/t/t4059-diff-merge-with-base.sh
@@ -0,0 +1,100 @@
+#!/bin/sh
+
+test_description='Diff aware of merge base'
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	mkdir short &&
+	mkdir long &&
+	for fn in win1 win2 merge delete base only1 only2; do
+		test_seq 3 >short/$fn
+		git add short/$fn &&
+		test_seq 11 >long/$fn &&
+		git add long/$fn
+	done &&
+	git commit -m mergebase &&
+	git branch mergebase &&
+
+	for fn in win1 win2 merge delete base only1; do
+		for dir in short long; do
+			sed -e "s/^2/2change1/" -e "s/^7/7change1/" $dir/$fn >sed.new &&
+			mv sed.new $dir/$fn &&
+			git add $dir/$fn
+		done
+	done &&
+	sed -e "s/^7/7change1/" long/only2 >sed.new &&
+	mv sed.new long/only2 &&
+	git add long/only2 &&
+	git commit -m branch1 &&
+	git branch branch1 &&
+
+	git reset --hard mergebase &&
+	for fn in win1 win2 merge delete base only2; do
+		for dir in short long; do
+			sed -e "s/^2/2change2/" -e "s/^11/11change2/" $dir/$fn >sed.new &&
+			mv sed.new $dir/$fn &&
+			git add $dir/$fn
+		done
+	done &&
+	sed -e "s/^11/11change2/" long/only1 >sed.new &&
+	mv sed.new long/only1 &&
+	git add long/only1 &&
+	git commit -m branch2 &&
+	git branch branch2 &&
+
+	test_must_fail git merge branch1 &&
+	git checkout mergebase -- . &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/base &&
+	git add long/base &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change1/" >long/win1 &&
+	git add long/win1 &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change2/" >long/win2 &&
+	git add long/win2 &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2merged/" >long/merge &&
+	git add long/merge &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "/^2/d" >long/delete &&
+	git add long/delete &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change1/" >long/only1 &&
+	git add long/only1 &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change2/" >long/only2 &&
+	git add long/only2 &&
+	test_seq 3 >short/base &&
+	git add short/base &&
+	test_seq 3 | sed -e "s/^2/2change1/" >short/win1 &&
+	git add short/win1 &&
+	test_seq 3 | sed -e "s/^2/2change2/" >short/win2 &&
+	git add short/win2 &&
+	test_seq 3 | sed -e "s/^2/2merged/" >short/merge &&
+	git add short/merge &&
+	test_seq 3 | sed -e "/^2/d" >short/delete &&
+	git add short/delete &&
+	test_seq 3 | sed -e "s/^2/2change1/" >short/only1 &&
+	git add short/only1 &&
+	test_seq 3 | sed -e "s/^2/2change2/" >short/only2 &&
+	git add short/only2 &&
+	git commit -m merge &&
+	git branch merge
+'
+
+test_expect_success 'diff with mergebase shows discarded change from parent 2 in merged file' '
+	git diff --cc merge branch1 branch2 mergebase -- long/win1 >actual &&
+	test -s actual
+'
+
+test_expect_success 'diff with mergebase shows discarded change from parent 1 in merged file' '
+	git diff --cc merge branch1 branch2 mergebase -- long/win2 >actual &&
+	test -s actual
+'
+
+test_expect_failure 'diff with mergebase shows fully discarded file from parent 2' '
+	git diff --cc merge branch1 branch2 mergebase -- short/win1 >actual &&
+	test -s actual
+'
+
+test_expect_failure 'diff with mergebase shows fully discarded file from parent 1' '
+	git diff --cc merge branch1 branch2 mergebase -- short/win2 >actual &&
+	test -s actual
+'
+
+test_done
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 2/4] combine-diff.c: refactor: extract insert_path()
  2015-04-02 20:34 [PATCH 0/4] diff --cc: relax path filtering Max Kirillov
  2015-04-02 20:34 ` [PATCH 1/4] Add test for showing discarded changes with diff --cc Max Kirillov
@ 2015-04-02 20:34 ` Max Kirillov
  2015-04-02 20:34 ` [PATCH 3/4] diff --cc: relax too strict paths picking Max Kirillov
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-02 20:34 UTC (permalink / raw)
  To: Kirill Smelkov, Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

Signed-off-by: Max Kirillov <max@max630.net>
---
 combine-diff.c | 43 ++++++++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 8eb7278..a236bb5 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -22,6 +22,28 @@ static int compare_paths(const struct combine_diff_path *one,
 				 two->path, strlen(two->path), two->mode);
 }
 
+static void insert_path(struct combine_diff_path **pos, const char* path, int n, int num_parent, struct diff_filepair *queue_item)
+{
+	int len;
+	struct combine_diff_path *p;
+
+	len = strlen(path);
+	p = xmalloc(combine_diff_path_size(num_parent, len));
+	p->path = (char *) &(p->parent[num_parent]);
+	memcpy(p->path, path, len);
+	p->path[len] = 0;
+	p->next = *pos;
+	memset(p->parent, 0,
+	       sizeof(p->parent[0]) * num_parent);
+
+	hashcpy(p->oid.hash, queue_item->two->sha1);
+	p->mode = queue_item->two->mode;
+	hashcpy(p->parent[n].oid.hash, queue_item->one->sha1);
+	p->parent[n].mode = queue_item->one->mode;
+	p->parent[n].status = queue_item->status;
+	*pos = p;
+}
+
 static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
@@ -30,27 +52,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 
 	if (!n) {
 		for (i = 0; i < q->nr; i++) {
-			int len;
-			const char *path;
 			if (diff_unmodified_pair(q->queue[i]))
 				continue;
-			path = q->queue[i]->two->path;
-			len = strlen(path);
-			p = xmalloc(combine_diff_path_size(num_parent, len));
-			p->path = (char *) &(p->parent[num_parent]);
-			memcpy(p->path, path, len);
-			p->path[len] = 0;
-			p->next = NULL;
-			memset(p->parent, 0,
-			       sizeof(p->parent[0]) * num_parent);
-
-			hashcpy(p->oid.hash, q->queue[i]->two->sha1);
-			p->mode = q->queue[i]->two->mode;
-			hashcpy(p->parent[n].oid.hash, q->queue[i]->one->sha1);
-			p->parent[n].mode = q->queue[i]->one->mode;
-			p->parent[n].status = q->queue[i]->status;
-			*tail = p;
-			tail = &p->next;
+			insert_path(tail, q->queue[i]->two->path, n, num_parent, q->queue[i]);
+			tail = &(*tail)->next;
 		}
 		return curr;
 	}
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 3/4] diff --cc: relax too strict paths picking
  2015-04-02 20:34 [PATCH 0/4] diff --cc: relax path filtering Max Kirillov
  2015-04-02 20:34 ` [PATCH 1/4] Add test for showing discarded changes with diff --cc Max Kirillov
  2015-04-02 20:34 ` [PATCH 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
@ 2015-04-02 20:34 ` Max Kirillov
  2015-04-02 20:59   ` Junio C Hamano
  2015-04-02 20:34 ` [PATCH 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 25+ messages in thread
From: Max Kirillov @ 2015-04-02 20:34 UTC (permalink / raw)
  To: Kirill Smelkov, Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

For diff --cc, paths fitering used to select only paths which have
changed in all parents, while diffing itself output hunks which are
changed in as few as 2 parents.

Fix intersect_paths() to add paths which have at least 2 changed
parents. In this new functionality does not require special treatment of
first pass, because path can be added from any parent, not just the
first one.

Signed-off-by: Max Kirillov <max@max630.net>
---
 combine-diff.c                  | 56 ++++++++++++++++++++++++++++-------------
 t/t4059-diff-merge-with-base.sh |  4 +--
 2 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index a236bb5..fd752e7 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -25,6 +25,7 @@ static int compare_paths(const struct combine_diff_path *one,
 static void insert_path(struct combine_diff_path **pos, const char* path, int n, int num_parent, struct diff_filepair *queue_item)
 {
 	int len;
+	int parent_idx;
 	struct combine_diff_path *p;
 
 	len = strlen(path);
@@ -41,43 +42,64 @@ static void insert_path(struct combine_diff_path **pos, const char* path, int n,
 	hashcpy(p->parent[n].oid.hash, queue_item->one->sha1);
 	p->parent[n].mode = queue_item->one->mode;
 	p->parent[n].status = queue_item->status;
+	for (parent_idx = 0; parent_idx < n; ++parent_idx) {
+		hashcpy(p->parent[parent_idx].oid.hash, p->oid.hash);
+		p->parent[parent_idx].mode = p->mode;
+		p->parent[parent_idx].status = ' ';
+	}
 	*pos = p;
 }
 
+static int changed_parents(struct combine_diff_path *p, int n)
+{
+	int parent_idx;
+	int result = 0;
+
+	for (parent_idx = 0; parent_idx < n; ++parent_idx) {
+		if (p->parent[parent_idx].status != ' ')
+			++result;
+	}
+
+	return result;
+}
+
 static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct combine_diff_path *p, **tail = &curr;
 	int i, cmp;
 
-	if (!n) {
-		for (i = 0; i < q->nr; i++) {
-			if (diff_unmodified_pair(q->queue[i]))
-				continue;
-			insert_path(tail, q->queue[i]->two->path, n, num_parent, q->queue[i]);
-			tail = &(*tail)->next;
-		}
-		return curr;
-	}
-
 	/*
 	 * paths in curr (linked list) and q->queue[] (array) are
 	 * both sorted in the tree order.
 	 */
 	i = 0;
-	while ((p = *tail) != NULL) {
-		cmp = ((i >= q->nr)
-		       ? -1 : compare_paths(p, q->queue[i]->two));
+	while ((p = *tail) != NULL || i < q->nr) {
+		cmp = (i >= q->nr) ? -1
+		      : (p == NULL) ? 1
+		      : compare_paths(p, q->queue[i]->two);
 
 		if (cmp < 0) {
-			/* p->path not in q->queue[]; drop it */
-			*tail = p->next;
-			free(p);
+			/* p->path not in q->queue[] */
+			if (num_parent > 2 && 2 - changed_parents(p, n) <= num_parent - n - 1) {
+				/* still can get 2 changed parents */
+				hashcpy(p->parent[n].oid.hash, p->oid.hash);
+				p->parent[n].mode = p->mode;
+				p->parent[n].status = ' ';
+				tail = &p->next;
+			} else {
+				*tail = p->next;
+				free(p);
+			}
 			continue;
 		}
 
 		if (cmp > 0) {
-			/* q->queue[i] not in p->path; skip it */
+			/* q->queue[i] not in p->path */
+			if (1 <= num_parent - n - 1) {
+				insert_path(tail, q->queue[i]->two->path, n, num_parent, q->queue[i]);
+				tail = &(*tail)->next;
+			}
 			i++;
 			continue;
 		}
diff --git a/t/t4059-diff-merge-with-base.sh b/t/t4059-diff-merge-with-base.sh
index e650a33..6341f7c 100755
--- a/t/t4059-diff-merge-with-base.sh
+++ b/t/t4059-diff-merge-with-base.sh
@@ -87,12 +87,12 @@ test_expect_success 'diff with mergebase shows discarded change from parent 1 in
 	test -s actual
 '
 
-test_expect_failure 'diff with mergebase shows fully discarded file from parent 2' '
+test_expect_success 'diff with mergebase shows fully discarded file from parent 2' '
 	git diff --cc merge branch1 branch2 mergebase -- short/win1 >actual &&
 	test -s actual
 '
 
-test_expect_failure 'diff with mergebase shows fully discarded file from parent 1' '
+test_expect_success 'diff with mergebase shows fully discarded file from parent 1' '
 	git diff --cc merge branch1 branch2 mergebase -- short/win2 >actual &&
 	test -s actual
 '
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH 4/4] t4059: rewrite to be adaptive to hunk filtering
  2015-04-02 20:34 [PATCH 0/4] diff --cc: relax path filtering Max Kirillov
                   ` (2 preceding siblings ...)
  2015-04-02 20:34 ` [PATCH 3/4] diff --cc: relax too strict paths picking Max Kirillov
@ 2015-04-02 20:34 ` Max Kirillov
  2015-04-02 21:13 ` [PATCH 0/4] diff --cc: relax path filtering Jeff King
  2015-04-03 15:58 ` [PATCH v2 " Max Kirillov
  5 siblings, 0 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-02 20:34 UTC (permalink / raw)
  To: Kirill Smelkov, Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

Looks like there is no exact specification what `diff -c` and
`diff --cc` should output. Considering this it is not reasonable to
hardcode any specific output in test. Rather, it should verify that file
selection behaves the same as hunk selection.

Rewrite test so that it makes sure that a "short" file is shown if and
only if the corresponding "long" file's contains the changed hunk.

Signed-off-by: Max Kirillov <max@max630.net>
---
 t/t4059-diff-merge-with-base.sh | 84 ++++++++++++++++++++++++++++-------------
 1 file changed, 58 insertions(+), 26 deletions(-)

diff --git a/t/t4059-diff-merge-with-base.sh b/t/t4059-diff-merge-with-base.sh
index 6341f7c..75a3820 100755
--- a/t/t4059-diff-merge-with-base.sh
+++ b/t/t4059-diff-merge-with-base.sh
@@ -1,13 +1,22 @@
 #!/bin/sh
 
-test_description='Diff aware of merge base'
+test_description='combined diff filtering is not affected by preliminary path filtering'
 
 . ./test-lib.sh
+. "$TEST_DIRECTORY"/diff-lib.sh
 
+# history is:
+# (mergebase) --> (branch1) --\
+#  |                          V
+#  \ --> (branch2) ----------(merge)
+# there are files in 2 subdirectories, "long" and "short"
+# each file in "long" subdirecty has exactly same history as same file in "short" one,
+# but it has added lines with changes in both branches which merge without conflict
+# so the long files are always selected at path filtering
 test_expect_success setup '
 	mkdir short &&
 	mkdir long &&
-	for fn in win1 win2 merge delete base only1 only2; do
+	for fn in win1 win2 merge delete base only1 only2 only1discard only2discard; do
 		test_seq 3 >short/$fn
 		git add short/$fn &&
 		test_seq 11 >long/$fn &&
@@ -23,9 +32,11 @@ test_expect_success setup '
 			git add $dir/$fn
 		done
 	done &&
-	sed -e "s/^7/7change1/" long/only2 >sed.new &&
-	mv sed.new long/only2 &&
-	git add long/only2 &&
+	for fn in only2 only2discard; do
+	    sed -e "s/^7/7change1/" long/$fn >sed.new &&
+	    mv sed.new long/$fn &&
+	    git add long/$fn
+	done &&
 	git commit -m branch1 &&
 	git branch branch1 &&
 
@@ -37,9 +48,11 @@ test_expect_success setup '
 			git add $dir/$fn
 		done
 	done &&
-	sed -e "s/^11/11change2/" long/only1 >sed.new &&
-	mv sed.new long/only1 &&
-	git add long/only1 &&
+	for fn in only1 only1discard; do
+	    sed -e "s/^11/11change2/" long/$fn >sed.new &&
+	    mv sed.new long/$fn &&
+	    git add long/$fn
+	done &&
 	git commit -m branch2 &&
 	git branch branch2 &&
 
@@ -47,6 +60,10 @@ test_expect_success setup '
 	git checkout mergebase -- . &&
 	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/base &&
 	git add long/base &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/only1discard &&
+	git add long/only1discard &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/only2discard &&
+	git add long/only2discard &&
 	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change1/" >long/win1 &&
 	git add long/win1 &&
 	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change2/" >long/win2 &&
@@ -61,6 +78,10 @@ test_expect_success setup '
 	git add long/only2 &&
 	test_seq 3 >short/base &&
 	git add short/base &&
+	test_seq 3 >short/only1discard &&
+	git add short/only1discard &&
+	test_seq 3 >short/only2discard &&
+	git add short/only2discard &&
 	test_seq 3 | sed -e "s/^2/2change1/" >short/win1 &&
 	git add short/win1 &&
 	test_seq 3 | sed -e "s/^2/2change2/" >short/win2 &&
@@ -77,24 +98,35 @@ test_expect_success setup '
 	git branch merge
 '
 
-test_expect_success 'diff with mergebase shows discarded change from parent 2 in merged file' '
-	git diff --cc merge branch1 branch2 mergebase -- long/win1 >actual &&
-	test -s actual
-'
-
-test_expect_success 'diff with mergebase shows discarded change from parent 1 in merged file' '
-	git diff --cc merge branch1 branch2 mergebase -- long/win2 >actual &&
-	test -s actual
-'
+# the difference in short file must be returned if and only if it is shown in long file
+for fn in win1 win2 merge delete base only1 only2 only1discard only2discard; do
+	if git diff --cc merge branch1 branch2 mergebase -- long/$fn | grep -q '^[ +-]\{3\}2\(change[12]|merge\)\?$'
+	then
+		test_expect_success "diff --cc contains short/$fn" '
+			git diff --cc merge branch1 branch2 mergebase -- short/'"$fn"' >actual &&
+			test -s actual
+		'
+	else
+		test_expect_success "diff --cc does not contain short/$fn" '
+			git diff --cc merge branch1 branch2 mergebase -- short/'"$fn"' >actual &&
+			! test -s actual
+		'
+	fi
+done
 
-test_expect_success 'diff with mergebase shows fully discarded file from parent 2' '
-	git diff --cc merge branch1 branch2 mergebase -- short/win1 >actual &&
-	test -s actual
-'
-
-test_expect_success 'diff with mergebase shows fully discarded file from parent 1' '
-	git diff --cc merge branch1 branch2 mergebase -- short/win2 >actual &&
-	test -s actual
-'
+for fn in win1 win2 merge delete base only1 only2 only1discard only2discard; do
+	if git diff -c merge branch1 branch2 mergebase -- long/$fn | grep -q '^[ +-]\{3\}2\(change[12]|merge\)\?$'
+	then
+		test_expect_success "diff -c contains short/$fn" '
+			git diff -c merge branch1 branch2 mergebase -- short/'"$fn"' >actual &&
+			test -s actual
+		'
+	else
+		test_expect_success "diff -c does not contain short/$fn" '
+			git diff -c merge branch1 branch2 mergebase -- short/'"$fn"' >actual &&
+			! test -s actual
+		'
+	fi
+done
 
 test_done
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] Add test for showing discarded changes with diff --cc
  2015-04-02 20:34 ` [PATCH 1/4] Add test for showing discarded changes with diff --cc Max Kirillov
@ 2015-04-02 20:55   ` Junio C Hamano
  2015-04-03 16:03     ` Max Kirillov
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2015-04-02 20:55 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Kirill Smelkov, Jeff King, git

Max Kirillov <max@max630.net> writes:

> If a hunk has been changed in both branches and conflict resolution
> fully takes one of branches, discarding all changes that are in the
> other, then the merge is not shown in 3-way diff which git uses by
> default.
>
> The advice is to use flag --cc and and explicitly add the mergebase was
> given in $gmane/191557. It was discovered though that it does not always
> work. If the whole file has not changed at all compared to one of
> branches then no difference is shown for this file.
>
> This looks inconsistent and for particular scenario of viewing discarded
> changes is undesirable.
>
> Add the test which demonstrates the issue.
>
> Signed-off-by: Max Kirillov <max@max630.net>

Thanks.  I will not have time to review this properly at the moment
while preparing 2.4-rc1, and I do not want to spend time figuring
out if the parent culling you are chaning was done deliberately (in
which case this patch may be breaking things) or not.

So I'll give a preliminary nitpicks first ;-)

>> Subject: Re: [PATCH 1/4] Add test for showing discarded changes with diff --cc

Subject: t4059: test 'diff --cc' with a change from only few parents

or something?  "<area>: <what you did>" without capitalization of
the beginning of what you did and the full-stop at the end.

> ---
>  t/t4059-diff-merge-with-base.sh | 100 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 100 insertions(+)
>  create mode 100755 t/t4059-diff-merge-with-base.sh
>
> diff --git a/t/t4059-diff-merge-with-base.sh b/t/t4059-diff-merge-with-base.sh
> new file mode 100755
> index 0000000..e650a33
> --- /dev/null
> +++ b/t/t4059-diff-merge-with-base.sh
> @@ -0,0 +1,100 @@
> +#!/bin/sh
> +
> +test_description='Diff aware of merge base'
> +
> +. ./test-lib.sh
> +
> +test_expect_success setup '
> +	mkdir short &&
> +	mkdir long &&
> +	for fn in win1 win2 merge delete base only1 only2; do
> +		test_seq 3 >short/$fn
> +		git add short/$fn &&
> +		test_seq 11 >long/$fn &&
> +		git add long/$fn
> +	done &&

Two nits:

 - Style: have "do" on its own line.  A good rule of thumb is that
   you shouldn't have to type ';' when you are writing a shell
   script, unless you are terminating a case arm with ';;'.

	for fn in ...
        do
        	...
	done

 - Correctness: aside from missing && after "test_seq 3", if the
   last step "git add long/$fn" failed in an earlier iteration, you
   would not notice any breakage.  Either

	(
                for fn in ...
                do
                        ... &&
                        ... &&
                        ... || exit $?
                done
	)

    or

	for fn in ...
        do
        	... &&
                ... &&
                ... || return $?
	done

    The latter is only valid in our test scripts, where the test
    framework gives you an artificial "caller" of the "subroutine"
    you are writing as the test body.

> +	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/base &&
> +	git add long/base &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2change1/" >long/win1 &&
> +	git add long/win1 &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2change2/" >long/win2 &&
> +	git add long/win2 &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2merged/" >long/merge &&
> +	git add long/merge &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "/^2/d" >long/delete &&
> +	git add long/delete &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2change1/" >long/only1 &&
> +	git add long/only1 &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2change2/" >long/only2 &&
> +	git add long/only2 &&

Patch is line-wrapped around here?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 3/4] diff --cc: relax too strict paths picking
  2015-04-02 20:34 ` [PATCH 3/4] diff --cc: relax too strict paths picking Max Kirillov
@ 2015-04-02 20:59   ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2015-04-02 20:59 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Jeff King, git

Max Kirillov <max@max630.net> writes:

Again, just surface nitpicks.

[side: Kirill Smelkov <kirr@mns.spb.ru> removed from CC, as the
ddress just bounced for me]

> +static int changed_parents(struct combine_diff_path *p, int n)
> +{
> +	int parent_idx;
> +	int result = 0;
> +
> +	for (parent_idx = 0; parent_idx < n; ++parent_idx) {
> +		if (p->parent[parent_idx].status != ' ')
> +			++result;

We write C (not C++) and favor post_increment++ over ++pre_increment
when there is no valid reason to do otherwise (e.g. the result of
increment getting used in a larger expression).

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/4] diff --cc: relax path filtering
  2015-04-02 20:34 [PATCH 0/4] diff --cc: relax path filtering Max Kirillov
                   ` (3 preceding siblings ...)
  2015-04-02 20:34 ` [PATCH 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
@ 2015-04-02 21:13 ` Jeff King
  2015-04-03 16:29   ` Max Kirillov
  2015-04-03 15:58 ` [PATCH v2 " Max Kirillov
  5 siblings, 1 reply; 25+ messages in thread
From: Jeff King @ 2015-04-02 21:13 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Junio C Hamano, git

On Thu, Apr 02, 2015 at 11:34:09PM +0300, Max Kirillov wrote:

> For diff --cc, paths fitering used to select only paths which have
> changed in all parents, while diffing itself output hunks which are
> changed in as few as 2 parents.

I'm confused about "used to" here. Is this a regression due to the
combine-diff rewriting that happened in 2.0, or do you just mean "before
this patch series, we used to do this other thing".

> Fix intersect_paths() to add paths which have at least 2 changed
> parents.

I'd worry a little that this is increasing the cost to do "log --cc", as
it means we will have to open and look at extra files, and we may find
in many cases that there aren't any interesting hunks. Which would imply
we might want to put it behind a flag, rather than as the default
("--cc-me-harder").

But if I'm understanding the issue correctly, this should only matter
for octopus merges.  That is, the old rule for looking at a path was "is
there at least one parent whose content we took verbatim", but the new
one is "are there are at least 2 parents whose content we did not take
verbatim". With only two parents, those would be the same thing, I
think.

-Peff

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2 0/4] diff --cc: relax path filtering
  2015-04-02 20:34 [PATCH 0/4] diff --cc: relax path filtering Max Kirillov
                   ` (4 preceding siblings ...)
  2015-04-02 21:13 ` [PATCH 0/4] diff --cc: relax path filtering Jeff King
@ 2015-04-03 15:58 ` Max Kirillov
  2015-04-03 15:58   ` [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents Max Kirillov
                     ` (3 more replies)
  5 siblings, 4 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-03 15:58 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

Fixes:
* test renamed and commit message rewritten, so that
  it mentions 3-parent merge rather than other uses
* test: fix && chaining and do at new line
* use postincrement in C code

Max Kirillov (4):
  t4059: test 'diff --cc' with a change from only few parents
  combine-diff.c: refactor: extract insert_path()
  diff --cc: relax too strict paths picking
  t4059: rewrite to be adaptive to hunk filtering

 combine-diff.c                                    |  95 +++++++++------
 t/t4059-diff-cc-not-affected-by-path-filtering.sh | 142 ++++++++++++++++++++++
 2 files changed, 203 insertions(+), 34 deletions(-)
 create mode 100755 t/t4059-diff-cc-not-affected-by-path-filtering.sh

-- 
2.3.4.2801.g3d0809b

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents
  2015-04-03 15:58 ` [PATCH v2 " Max Kirillov
@ 2015-04-03 15:58   ` Max Kirillov
  2015-04-11 20:04     ` Junio C Hamano
  2015-04-11 21:07     ` Junio C Hamano
  2015-04-03 15:58   ` [PATCH v2 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
                     ` (2 subsequent siblings)
  3 siblings, 2 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-03 15:58 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

If `git diff --cc` is used with 2 or more parents, then
it shows all hunks which have changed compared to at least 2 parents.
Which is reasonable, because those places are likely places for
conflicts, and it should be displayed how they were resolved.
But, preliminary path filtering passes a path only it was changed
compared to each parent. So, if a hunk which has not changed compared to
some of parents is shown if there are more changed hunks in the file,
but not shown if it is the only change.

This looks inconsistent and for some scenarios it is desirable to show
such changes.

Add the test which demonstrates the issue.

Signed-off-by: Max Kirillov <max@max630.net>
---
 t/t4059-diff-cc-not-affected-by-path-filtering.sh | 108 ++++++++++++++++++++++
 1 file changed, 108 insertions(+)
 create mode 100755 t/t4059-diff-cc-not-affected-by-path-filtering.sh

diff --git a/t/t4059-diff-cc-not-affected-by-path-filtering.sh b/t/t4059-diff-cc-not-affected-by-path-filtering.sh
new file mode 100755
index 0000000..3e6e59b
--- /dev/null
+++ b/t/t4059-diff-cc-not-affected-by-path-filtering.sh
@@ -0,0 +1,108 @@
+#!/bin/sh
+
+test_description='combined diff filtering is not affected by preliminary path filtering'
+# Since diff --cc allows use not only real parents but any commits, use merge
+# base here as the 3rd "parent". The trick was suggested in $gmane/191557 to
+# spot changes which were discarded during conflict resolution.
+
+. ./test-lib.sh
+
+test_expect_success setup '
+	mkdir short &&
+	mkdir long &&
+	for fn in win1 win2 merge delete base only1 only2
+	do
+		test_seq 3 >short/$fn &&
+		git add short/$fn &&
+		test_seq 11 >long/$fn &&
+		git add long/$fn || return $?
+	done &&
+	git commit -m mergebase &&
+	git branch mergebase &&
+
+	for fn in win1 win2 merge delete base only1
+	do
+		for dir in short long
+		do
+			sed -e "s/^2/2change1/" -e "s/^7/7change1/" $dir/$fn >sed.new &&
+			mv sed.new $dir/$fn &&
+			git add $dir/$fn || return $?
+		done || return $?
+	done &&
+	sed -e "s/^7/7change1/" long/only2 >sed.new &&
+	mv sed.new long/only2 &&
+	git add long/only2 &&
+	git commit -m branch1 &&
+	git branch branch1 &&
+
+	git reset --hard mergebase &&
+	for fn in win1 win2 merge delete base only2
+	do
+		for dir in short long
+		do
+			sed -e "s/^2/2change2/" -e "s/^11/11change2/" $dir/$fn >sed.new &&
+			mv sed.new $dir/$fn &&
+			git add $dir/$fn || return $?
+		done || return $?
+	done &&
+	sed -e "s/^11/11change2/" long/only1 >sed.new &&
+	mv sed.new long/only1 &&
+	git add long/only1 &&
+	git commit -m branch2 &&
+	git branch branch2 &&
+
+	test_must_fail git merge branch1 &&
+	git checkout mergebase -- . &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/base &&
+	git add long/base &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change1/" >long/win1 &&
+	git add long/win1 &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change2/" >long/win2 &&
+	git add long/win2 &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2merged/" >long/merge &&
+	git add long/merge &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "/^2/d" >long/delete &&
+	git add long/delete &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change1/" >long/only1 &&
+	git add long/only1 &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change2/" >long/only2 &&
+	git add long/only2 &&
+	test_seq 3 >short/base &&
+	git add short/base &&
+	test_seq 3 | sed -e "s/^2/2change1/" >short/win1 &&
+	git add short/win1 &&
+	test_seq 3 | sed -e "s/^2/2change2/" >short/win2 &&
+	git add short/win2 &&
+	test_seq 3 | sed -e "s/^2/2merged/" >short/merge &&
+	git add short/merge &&
+	test_seq 3 | sed -e "/^2/d" >short/delete &&
+	git add short/delete &&
+	test_seq 3 | sed -e "s/^2/2change1/" >short/only1 &&
+	git add short/only1 &&
+	test_seq 3 | sed -e "s/^2/2change2/" >short/only2 &&
+	git add short/only2 &&
+	git commit -m merge &&
+	git branch merge
+'
+
+test_expect_success 'diff with mergebase shows discarded change from parent 2 in merged file' '
+	git diff --cc merge branch1 branch2 mergebase -- long/win1 >actual &&
+	test -s actual
+'
+
+test_expect_success 'diff with mergebase shows discarded change from parent 1 in merged file' '
+	git diff --cc merge branch1 branch2 mergebase -- long/win2 >actual &&
+	test -s actual
+'
+
+test_expect_failure 'diff with mergebase shows fully discarded file from parent 2' '
+	git diff --cc merge branch1 branch2 mergebase -- short/win1 >actual &&
+	test -s actual
+'
+
+test_expect_failure 'diff with mergebase shows fully discarded file from parent 1' '
+	git diff --cc merge branch1 branch2 mergebase -- short/win2 >actual &&
+	test -s actual
+'
+
+test_done
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 2/4] combine-diff.c: refactor: extract insert_path()
  2015-04-03 15:58 ` [PATCH v2 " Max Kirillov
  2015-04-03 15:58   ` [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents Max Kirillov
@ 2015-04-03 15:58   ` Max Kirillov
  2015-04-11 20:14     ` Junio C Hamano
  2015-04-03 15:58   ` [PATCH v2 3/4] diff --cc: relax too strict paths picking Max Kirillov
  2015-04-03 15:58   ` [PATCH v2 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
  3 siblings, 1 reply; 25+ messages in thread
From: Max Kirillov @ 2015-04-03 15:58 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

Signed-off-by: Max Kirillov <max@max630.net>
---
 combine-diff.c | 43 ++++++++++++++++++++++++-------------------
 1 file changed, 24 insertions(+), 19 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 8eb7278..a236bb5 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -22,6 +22,28 @@ static int compare_paths(const struct combine_diff_path *one,
 				 two->path, strlen(two->path), two->mode);
 }
 
+static void insert_path(struct combine_diff_path **pos, const char* path, int n, int num_parent, struct diff_filepair *queue_item)
+{
+	int len;
+	struct combine_diff_path *p;
+
+	len = strlen(path);
+	p = xmalloc(combine_diff_path_size(num_parent, len));
+	p->path = (char *) &(p->parent[num_parent]);
+	memcpy(p->path, path, len);
+	p->path[len] = 0;
+	p->next = *pos;
+	memset(p->parent, 0,
+	       sizeof(p->parent[0]) * num_parent);
+
+	hashcpy(p->oid.hash, queue_item->two->sha1);
+	p->mode = queue_item->two->mode;
+	hashcpy(p->parent[n].oid.hash, queue_item->one->sha1);
+	p->parent[n].mode = queue_item->one->mode;
+	p->parent[n].status = queue_item->status;
+	*pos = p;
+}
+
 static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
@@ -30,27 +52,10 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 
 	if (!n) {
 		for (i = 0; i < q->nr; i++) {
-			int len;
-			const char *path;
 			if (diff_unmodified_pair(q->queue[i]))
 				continue;
-			path = q->queue[i]->two->path;
-			len = strlen(path);
-			p = xmalloc(combine_diff_path_size(num_parent, len));
-			p->path = (char *) &(p->parent[num_parent]);
-			memcpy(p->path, path, len);
-			p->path[len] = 0;
-			p->next = NULL;
-			memset(p->parent, 0,
-			       sizeof(p->parent[0]) * num_parent);
-
-			hashcpy(p->oid.hash, q->queue[i]->two->sha1);
-			p->mode = q->queue[i]->two->mode;
-			hashcpy(p->parent[n].oid.hash, q->queue[i]->one->sha1);
-			p->parent[n].mode = q->queue[i]->one->mode;
-			p->parent[n].status = q->queue[i]->status;
-			*tail = p;
-			tail = &p->next;
+			insert_path(tail, q->queue[i]->two->path, n, num_parent, q->queue[i]);
+			tail = &(*tail)->next;
 		}
 		return curr;
 	}
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 3/4] diff --cc: relax too strict paths picking
  2015-04-03 15:58 ` [PATCH v2 " Max Kirillov
  2015-04-03 15:58   ` [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents Max Kirillov
  2015-04-03 15:58   ` [PATCH v2 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
@ 2015-04-03 15:58   ` Max Kirillov
  2015-04-03 15:58   ` [PATCH v2 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
  3 siblings, 0 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-03 15:58 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

For diff --cc, paths fitering used to select only paths which have
changed in all parents, while diffing itself output hunks which are
changed in as few as 2 parents.

Fix intersect_paths() to add paths which have at least 2 changed
parents. In this new functionality does not require special treatment of
first pass, because path can be added from any parent, not just the
first one.

Signed-off-by: Max Kirillov <max@max630.net>
---
 combine-diff.c                                    | 56 ++++++++++++++++-------
 t/t4059-diff-cc-not-affected-by-path-filtering.sh |  4 +-
 2 files changed, 41 insertions(+), 19 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index a236bb5..2285c7c 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -25,6 +25,7 @@ static int compare_paths(const struct combine_diff_path *one,
 static void insert_path(struct combine_diff_path **pos, const char* path, int n, int num_parent, struct diff_filepair *queue_item)
 {
 	int len;
+	int parent_idx;
 	struct combine_diff_path *p;
 
 	len = strlen(path);
@@ -41,43 +42,64 @@ static void insert_path(struct combine_diff_path **pos, const char* path, int n,
 	hashcpy(p->parent[n].oid.hash, queue_item->one->sha1);
 	p->parent[n].mode = queue_item->one->mode;
 	p->parent[n].status = queue_item->status;
+	for (parent_idx = 0; parent_idx < n; parent_idx++) {
+		hashcpy(p->parent[parent_idx].oid.hash, p->oid.hash);
+		p->parent[parent_idx].mode = p->mode;
+		p->parent[parent_idx].status = ' ';
+	}
 	*pos = p;
 }
 
+static int changed_parents(struct combine_diff_path *p, int n)
+{
+	int parent_idx;
+	int result = 0;
+
+	for (parent_idx = 0; parent_idx < n; parent_idx++) {
+		if (p->parent[parent_idx].status != ' ')
+			result++;
+	}
+
+	return result;
+}
+
 static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct combine_diff_path *p, **tail = &curr;
 	int i, cmp;
 
-	if (!n) {
-		for (i = 0; i < q->nr; i++) {
-			if (diff_unmodified_pair(q->queue[i]))
-				continue;
-			insert_path(tail, q->queue[i]->two->path, n, num_parent, q->queue[i]);
-			tail = &(*tail)->next;
-		}
-		return curr;
-	}
-
 	/*
 	 * paths in curr (linked list) and q->queue[] (array) are
 	 * both sorted in the tree order.
 	 */
 	i = 0;
-	while ((p = *tail) != NULL) {
-		cmp = ((i >= q->nr)
-		       ? -1 : compare_paths(p, q->queue[i]->two));
+	while ((p = *tail) != NULL || i < q->nr) {
+		cmp = (i >= q->nr) ? -1
+		      : (p == NULL) ? 1
+		      : compare_paths(p, q->queue[i]->two);
 
 		if (cmp < 0) {
-			/* p->path not in q->queue[]; drop it */
-			*tail = p->next;
-			free(p);
+			/* p->path not in q->queue[] */
+			if (num_parent > 2 && 2 - changed_parents(p, n) <= num_parent - n - 1) {
+				/* still can get 2 changed parents */
+				hashcpy(p->parent[n].oid.hash, p->oid.hash);
+				p->parent[n].mode = p->mode;
+				p->parent[n].status = ' ';
+				tail = &p->next;
+			} else {
+				*tail = p->next;
+				free(p);
+			}
 			continue;
 		}
 
 		if (cmp > 0) {
-			/* q->queue[i] not in p->path; skip it */
+			/* q->queue[i] not in p->path */
+			if (1 <= num_parent - n - 1) {
+				insert_path(tail, q->queue[i]->two->path, n, num_parent, q->queue[i]);
+				tail = &(*tail)->next;
+			}
 			i++;
 			continue;
 		}
diff --git a/t/t4059-diff-cc-not-affected-by-path-filtering.sh b/t/t4059-diff-cc-not-affected-by-path-filtering.sh
index 3e6e59b..ab3dbd2 100755
--- a/t/t4059-diff-cc-not-affected-by-path-filtering.sh
+++ b/t/t4059-diff-cc-not-affected-by-path-filtering.sh
@@ -95,12 +95,12 @@ test_expect_success 'diff with mergebase shows discarded change from parent 1 in
 	test -s actual
 '
 
-test_expect_failure 'diff with mergebase shows fully discarded file from parent 2' '
+test_expect_success 'diff with mergebase shows fully discarded file from parent 2' '
 	git diff --cc merge branch1 branch2 mergebase -- short/win1 >actual &&
 	test -s actual
 '
 
-test_expect_failure 'diff with mergebase shows fully discarded file from parent 1' '
+test_expect_success 'diff with mergebase shows fully discarded file from parent 1' '
 	git diff --cc merge branch1 branch2 mergebase -- short/win2 >actual &&
 	test -s actual
 '
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* [PATCH v2 4/4] t4059: rewrite to be adaptive to hunk filtering
  2015-04-03 15:58 ` [PATCH v2 " Max Kirillov
                     ` (2 preceding siblings ...)
  2015-04-03 15:58   ` [PATCH v2 3/4] diff --cc: relax too strict paths picking Max Kirillov
@ 2015-04-03 15:58   ` Max Kirillov
  2015-04-12  5:48     ` Junio C Hamano
  3 siblings, 1 reply; 25+ messages in thread
From: Max Kirillov @ 2015-04-03 15:58 UTC (permalink / raw)
  To: Junio C Hamano, Jeff King; +Cc: Max Kirillov, git

Looks like there is no exact specification what `diff -c` and
`diff --cc` should output. Considering this it is not reasonable to
hardcode any specific output in test. Rather, it should verify that file
selection behaves the same as hunk selection.

Rewrite test so that it makes sure that a "short" file is shown if and
only if the corresponding "long" file's contains the changed hunk.

Signed-off-by: Max Kirillov <max@max630.net>
---
 t/t4059-diff-cc-not-affected-by-path-filtering.sh | 84 ++++++++++++++++-------
 1 file changed, 59 insertions(+), 25 deletions(-)

diff --git a/t/t4059-diff-cc-not-affected-by-path-filtering.sh b/t/t4059-diff-cc-not-affected-by-path-filtering.sh
index ab3dbd2..f4232b0 100755
--- a/t/t4059-diff-cc-not-affected-by-path-filtering.sh
+++ b/t/t4059-diff-cc-not-affected-by-path-filtering.sh
@@ -6,11 +6,20 @@ test_description='combined diff filtering is not affected by preliminary path fi
 # spot changes which were discarded during conflict resolution.
 
 . ./test-lib.sh
+. "$TEST_DIRECTORY"/diff-lib.sh
 
+# history is:
+# (mergebase) --> (branch1) --\
+#  |                          V
+#  \ --> (branch2) ----------(merge)
+# there are files in 2 subdirectories, "long" and "short"
+# each file in "long" subdirecty has exactly same history as same file in "short" one,
+# but it has added lines with changes in both branches which merge without conflict
+# so the long files are always selected at path filtering
 test_expect_success setup '
 	mkdir short &&
 	mkdir long &&
-	for fn in win1 win2 merge delete base only1 only2
+	for fn in win1 win2 merge delete base only1 only2 only1discard only2discard
 	do
 		test_seq 3 >short/$fn &&
 		git add short/$fn &&
@@ -29,9 +38,12 @@ test_expect_success setup '
 			git add $dir/$fn || return $?
 		done || return $?
 	done &&
-	sed -e "s/^7/7change1/" long/only2 >sed.new &&
-	mv sed.new long/only2 &&
-	git add long/only2 &&
+	for fn in only2 only2discard
+	do
+	    sed -e "s/^7/7change1/" long/$fn >sed.new &&
+	    mv sed.new long/$fn &&
+	    git add long/$fn || return $?
+	done &&
 	git commit -m branch1 &&
 	git branch branch1 &&
 
@@ -45,9 +57,12 @@ test_expect_success setup '
 			git add $dir/$fn || return $?
 		done || return $?
 	done &&
-	sed -e "s/^11/11change2/" long/only1 >sed.new &&
-	mv sed.new long/only1 &&
-	git add long/only1 &&
+	for fn in only1 only1discard
+	do
+	    sed -e "s/^11/11change2/" long/$fn >sed.new &&
+	    mv sed.new long/$fn &&
+	    git add long/$fn || return $?
+	done &&
 	git commit -m branch2 &&
 	git branch branch2 &&
 
@@ -55,6 +70,10 @@ test_expect_success setup '
 	git checkout mergebase -- . &&
 	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/base &&
 	git add long/base &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/only1discard &&
+	git add long/only1discard &&
+	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/only2discard &&
+	git add long/only2discard &&
 	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change1/" >long/win1 &&
 	git add long/win1 &&
 	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e "s/^2/2change2/" >long/win2 &&
@@ -69,6 +88,10 @@ test_expect_success setup '
 	git add long/only2 &&
 	test_seq 3 >short/base &&
 	git add short/base &&
+	test_seq 3 >short/only1discard &&
+	git add short/only1discard &&
+	test_seq 3 >short/only2discard &&
+	git add short/only2discard &&
 	test_seq 3 | sed -e "s/^2/2change1/" >short/win1 &&
 	git add short/win1 &&
 	test_seq 3 | sed -e "s/^2/2change2/" >short/win2 &&
@@ -85,24 +108,35 @@ test_expect_success setup '
 	git branch merge
 '
 
-test_expect_success 'diff with mergebase shows discarded change from parent 2 in merged file' '
-	git diff --cc merge branch1 branch2 mergebase -- long/win1 >actual &&
-	test -s actual
-'
-
-test_expect_success 'diff with mergebase shows discarded change from parent 1 in merged file' '
-	git diff --cc merge branch1 branch2 mergebase -- long/win2 >actual &&
-	test -s actual
-'
+# the difference in short file must be returned if and only if it is shown in long file
+for fn in win1 win2 merge delete base only1 only2 only1discard only2discard; do
+	if git diff --cc merge branch1 branch2 mergebase -- long/$fn | grep -q '^[ +-]\{3\}2\(change[12]|merge\)\?$'
+	then
+		test_expect_success "diff --cc contains short/$fn" '
+			git diff --cc merge branch1 branch2 mergebase -- short/'"$fn"' >actual &&
+			test -s actual
+		'
+	else
+		test_expect_success "diff --cc does not contain short/$fn" '
+			git diff --cc merge branch1 branch2 mergebase -- short/'"$fn"' >actual &&
+			! test -s actual
+		'
+	fi
+done
 
-test_expect_success 'diff with mergebase shows fully discarded file from parent 2' '
-	git diff --cc merge branch1 branch2 mergebase -- short/win1 >actual &&
-	test -s actual
-'
-
-test_expect_success 'diff with mergebase shows fully discarded file from parent 1' '
-	git diff --cc merge branch1 branch2 mergebase -- short/win2 >actual &&
-	test -s actual
-'
+for fn in win1 win2 merge delete base only1 only2 only1discard only2discard; do
+	if git diff -c merge branch1 branch2 mergebase -- long/$fn | grep -q '^[ +-]\{3\}2\(change[12]|merge\)\?$'
+	then
+		test_expect_success "diff -c contains short/$fn" '
+			git diff -c merge branch1 branch2 mergebase -- short/'"$fn"' >actual &&
+			test -s actual
+		'
+	else
+		test_expect_success "diff -c does not contain short/$fn" '
+			git diff -c merge branch1 branch2 mergebase -- short/'"$fn"' >actual &&
+			! test -s actual
+		'
+	fi
+done
 
 test_done
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH 1/4] Add test for showing discarded changes with diff --cc
  2015-04-02 20:55   ` Junio C Hamano
@ 2015-04-03 16:03     ` Max Kirillov
  0 siblings, 0 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-03 16:03 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git

On Thu, Apr 02, 2015 at 01:55:58PM -0700, Junio C Hamano wrote:
> So I'll give a preliminary nitpicks first ;-)

Thank you; fixed the issues you mentioned.

>> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
>> "s/^2/2change2/" >long/only2 &&
>> +	git add long/only2 &&
> 
> Patch is line-wrapped around here?

I hope it was you email program which wrapped it (maybe at
quoting). It looks ok in my mutt and in source, and I uses
same git format-patch+git send-email as usual.

-- 
Max

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH 0/4] diff --cc: relax path filtering
  2015-04-02 21:13 ` [PATCH 0/4] diff --cc: relax path filtering Jeff King
@ 2015-04-03 16:29   ` Max Kirillov
  0 siblings, 0 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-03 16:29 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On Thu, Apr 02, 2015 at 05:13:10PM -0400, Jeff King wrote:
> On Thu, Apr 02, 2015 at 11:34:09PM +0300, Max Kirillov wrote:
> 
>> For diff --cc, paths fitering used to select only paths which have
>> changed in all parents, while diffing itself output hunks which are
>> changed in as few as 2 parents.
> 
> I'm confused about "used to" here. Is this a regression due to the
> combine-diff rewriting that happened in 2.0, or do you just mean "before
> this patch series, we used to do this other thing".

As far as I can see it was "always", at least since 1.8.0;
the test script did not work before that.

>> Fix intersect_paths() to add paths which have at least 2 changed
>> parents.
> 
> I'd worry a little that this is increasing the cost to do "log --cc", as
> it means we will have to open and look at extra files, and we may find
> in many cases that there aren't any interesting hunks. Which would imply
> we might want to put it behind a flag, rather than as the default
> ("--cc-me-harder").
> 
> But if I'm understanding the issue correctly, this should only matter
> for octopus merges.  That is, the old rule for looking at a path was "is
> there at least one parent whose content we took verbatim", but the new
> one is "are there are at least 2 parents whose content we did not take
> verbatim". With only two parents, those would be the same thing, I
> think.

Yes, I hope so. I tried to reproduce benchamrk which is in 8518ff8fabc
(git log --raw --no-abbrev --no-renames (-c|--cc) v3.10..v3.11),
and saw no difference. But my times was about 3 seconds, not
20 as there, andI cannot say my computer is very fast, so
probably I've done something wrong.

-- 
Max

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents
  2015-04-03 15:58   ` [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents Max Kirillov
@ 2015-04-11 20:04     ` Junio C Hamano
  2015-04-11 21:07     ` Junio C Hamano
  1 sibling, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2015-04-11 20:04 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Jeff King, git

Max Kirillov <max@max630.net> writes:

> If `git diff --cc` is used with 2 or more parents, then
> it shows all hunks which have changed compared to at least 2 parents.
> Which is reasonable, because those places are likely places for
> conflicts, and it should be displayed how they were resolved.
> But, preliminary path filtering passes a path only it was changed
> compared to each parent. So, if a hunk which has not changed compared to
> some of parents is shown if there are more changed hunks in the file,
> but not shown if it is the only change.
>
> This looks inconsistent and for some scenarios it is desirable to show
> such changes.
>
> Add the test which demonstrates the issue.
>
> Signed-off-by: Max Kirillov <max@max630.net>
> ---
>  t/t4059-diff-cc-not-affected-by-path-filtering.sh | 108 ++++++++++++++++++++++
>  1 file changed, 108 insertions(+)
>  create mode 100755 t/t4059-diff-cc-not-affected-by-path-filtering.sh
>
> diff --git a/t/t4059-diff-cc-not-affected-by-path-filtering.sh
> b/t/t4059-diff-cc-not-affected-by-path-filtering.sh
> new file mode 100755
> index 0000000..3e6e59b
> --- /dev/null
> +++ b/t/t4059-diff-cc-not-affected-by-path-filtering.sh
> @@ -0,0 +1,108 @@
...
> +	test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" >long/base &&
> +	git add long/base &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2change1/" >long/win1 &&
> +	git add long/win1 &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2change2/" >long/win2 &&
> +	git add long/win2 &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2merged/" >long/merge &&
> +	git add long/merge &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "/^2/d" >long/delete &&
> +	git add long/delete &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2change1/" >long/only1 &&
> +	git add long/only1 &&
> + test_seq 11 | sed -e "s/^7/7change1/" -e "s/^11/11change2/" -e
> "s/^2/2change2/" >long/only2 &&

Hmph.  Is it gmane who is munging these lines?

The other copy of this message I received in my mbox (which I read
in the same MUA) does not seem to have this corruption, and I do not
expect vger.kernel.org to do this kind of munging without getting
yelled at by the kernel folks.

Anyway, thanks; I'll take a deeper look once I got a chance to.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 2/4] combine-diff.c: refactor: extract insert_path()
  2015-04-03 15:58   ` [PATCH v2 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
@ 2015-04-11 20:14     ` Junio C Hamano
  0 siblings, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2015-04-11 20:14 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Jeff King, git

Max Kirillov <max@max630.net> writes:

> Signed-off-by: Max Kirillov <max@max630.net>
> ---
>  combine-diff.c | 43 ++++++++++++++++++++++++-------------------
>  1 file changed, 24 insertions(+), 19 deletions(-)
>
> diff --git a/combine-diff.c b/combine-diff.c
> index 8eb7278..a236bb5 100644
> --- a/combine-diff.c
> +++ b/combine-diff.c
> @@ -22,6 +22,28 @@ static int compare_paths(const struct combine_diff_path *one,
>  				 two->path, strlen(two->path), two->mode);
>  }
>  
> +static void insert_path(struct combine_diff_path **pos, const char* path, int n, int num_parent, struct diff_filepair *queue_item)

This is overlong (and I wonder why I am not seeing the linewrap
problem with this one).  I'd suggest to split them perhaps in the
same way that compare_paths() match_string_spaces() and other
existing functions are defined?

Also the pointer-asterisk sticks to the variable name, not type,
i.e.

	static void insert_path(struct combine_diff_path **pos, const char *path,
				int n, int num_parent,
                                struct diff_filepair *queue_item)


> +{

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents
  2015-04-03 15:58   ` [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents Max Kirillov
  2015-04-11 20:04     ` Junio C Hamano
@ 2015-04-11 21:07     ` Junio C Hamano
  2015-04-11 21:20       ` Junio C Hamano
  2015-04-12  5:43       ` Max Kirillov
  1 sibling, 2 replies; 25+ messages in thread
From: Junio C Hamano @ 2015-04-11 21:07 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Jeff King, git

Max Kirillov <max@max630.net> writes:

> If `git diff --cc` is used with 2 or more parents, then it shows
> all hunks which have changed compared to at least 2 parents.
> Which is reasonable, because those places are likely places for
> conflicts, and it should be displayed how they were resolved.

OK.

> But, preliminary path filtering passes a path only it was changed
> compared to each parent.

That is true, but I am a bit confused by the above, especially the
word "But" that begins the sentence.  Are you talking about this
comment that describes what the caller wants to do?

    /* find set of paths that every parent touches */
    static struct combine_diff_path *find_paths_generic(const unsigned char *sha1,
            const struct sha1_array *parents, struct diff_options *opt)

When the result of a merge exactly matches one (or more) of the
parent of the merge, we do not want to show it in the combined
format, so intersect_paths() does want to find paths that are
different from all parents.  Isn't that a good thing?

> So, if a hunk which has not changed compared to some of parents is
> shown if there are more changed hunks in the file, but not shown
> if it is the only change.
>
> This looks inconsistent and for some scenarios it is desirable to show
> such changes.

Hmm, that may be true.  So help me see if I understand your goal by
checking if I rephrased you correctly below:

 - We want to show a combined hunk when the number of parent
   variants for the hunk is more than 2 (i.e. interesting octopus)
   or the result does not match any of the parents (i.e. conflict
   resolution of a pairwise merge).  We want to drop a hunk whose
   result came from only one set of parent and the other parents had
   the same original that is different from the result.

 - The current code filters out a path that matches one of the
   parents very early.  This is OK for a two-way merge.  If the
   result matches one of the parent's, then any hunk we might
   produce by not pre-filtering would have the result that came from
   one parent (i.e. the one identical to the result) and there is
   only one other parent, which cannot make it an interesting
   octopus by definition.

   But an octopus may merge three variants and pick the result from
   one of the parents as a whole.  With the pre-filtering, no hunk
   from such a path is shown, even when the other two variants in
   "discarded" parents are not identical.

The original to refer to are two commits bf1c32bd (combine-diff:
update --cc "uninteresting hunks" logic., 2006-02-02) and fd4b1d21
(combine-diff: add safety check to --cc., 2006-02-02).

Especially, we need to pay close attention to the discussion that
germinated the current behaviour:

  http://thread.gmane.org/gmane.comp.version-control.git/15486/focus=15519

I recall that the "diff --cc" before that change was not discarding
uninteresting merges sufficiently and the two commits were a
deliberate attempt to reject what your series wants to show as
uninteresting hunks.  

Two suggestions.

 - This is primarily for 2/4, but can we make it more clear in the
   code that we do this "include more" change only on Octopus
   merges?  This change should not make any difference for two-way
   merges and I'd prefer to avoid extra processing of finding
   matching hunks and combining, only to discard the result.

 - Can you run "diff --cc" with and without your patches to the
   "merge from hell" commit mentioned in the original thread and see
   if we show more hunks with your patches, and make sure what are
   shown additionally looked really "interesting"?

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents
  2015-04-11 21:07     ` Junio C Hamano
@ 2015-04-11 21:20       ` Junio C Hamano
  2015-04-12  5:43       ` Max Kirillov
  1 sibling, 0 replies; 25+ messages in thread
From: Junio C Hamano @ 2015-04-11 21:20 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Jeff King, Git Mailing List

On Sat, Apr 11, 2015 at 2:07 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Especially, we need to pay close attention to the discussion that
> germinated the current behaviour:
>
>   http://thread.gmane.org/gmane.comp.version-control.git/15486/focus=15519
>
> I recall that the "diff --cc" before that change was not discarding
> uninteresting merges sufficiently and the two commits were a
> deliberate attempt to reject what your series wants to show as
> uninteresting hunks.
>
> Two suggestions.
>
>  - This is primarily for 2/4, but can we make it more clear in the

Eh, sorry, I obviously meant 3/4. And let me half-retract this suggestion
and put it on hold; the code-flow has become sufficiently different that
special-casing two-way merge may not be worth it after all.

>    code that we do this "include more" change only on Octopus
>    merges?  This change should not make any difference for two-way
>    merges and I'd prefer to avoid extra processing of finding
>    matching hunks and combining, only to discard the result.
>
>  - Can you run "diff --cc" with and without your patches to the
>    "merge from hell" commit mentioned in the original thread and see
>    if we show more hunks with your patches, and make sure what are
>    shown additionally looked really "interesting"?

This one still stands. We'd be interested to see what difference this
makes in the real world project history.

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents
  2015-04-11 21:07     ` Junio C Hamano
  2015-04-11 21:20       ` Junio C Hamano
@ 2015-04-12  5:43       ` Max Kirillov
  2015-04-12  5:51         ` Junio C Hamano
  2015-04-14  4:09         ` [PATCH/RFC] combine-diff.c: make intersect_paths() behave like hunk filtering Max Kirillov
  1 sibling, 2 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-12  5:43 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, git

On Sat, Apr 11, 2015 at 02:07:25PM -0700, Junio C Hamano wrote:
> Max Kirillov <max@max630.net> writes:
> 
>> If `git diff --cc` is used with 2 or more parents, then it shows
>> all hunks which have changed compared to at least 2 parents.
>> Which is reasonable, because those places are likely places for
>> conflicts, and it should be displayed how they were resolved.
> 
> OK.
> 
>> But, preliminary path filtering passes a path only it was changed
>> compared to each parent.
> 
> That is true, but I am a bit confused by the above, especially the
> word "But" that begins the sentence.  Are you talking about this
> comment that describes what the caller wants to do?
> 
>     /* find set of paths that every parent touches */
>     static struct combine_diff_path *find_paths_generic(const unsigned char *sha1,
>             const struct sha1_array *parents, struct diff_options *opt)
> 
> When the result of a merge exactly matches one (or more) of the
> parent of the merge, we do not want to show it in the combined
> format, so intersect_paths() does want to find paths that are
> different from all parents.  Isn't that a good thing?

If it is a good thing, then probably for file which has not
been filtered out it is also a good thing, then diff --cc
should have been hiding hunks which fully preserve one of
the parent.

>> So, if a hunk which has not changed compared to some of parents is
>> shown if there are more changed hunks in the file, but not shown
>> if it is the only change.
>>
>> This looks inconsistent and for some scenarios it is desirable to show
>> such changes.
> 
> Hmm, that may be true.  So help me see if I understand your goal by
> checking if I rephrased you correctly below:
> 
>  - We want to show a combined hunk when the number of parent
>    variants for the hunk is more than 2 (i.e. interesting octopus)
>    or the result does not match any of the parents (i.e. conflict
>    resolution of a pairwise merge).  We want to drop a hunk whose
>    result came from only one set of parent and the other parents had
>    the same original that is different from the result.
> 
>  - The current code filters out a path that matches one of the
>    parents very early.  This is OK for a two-way merge.  If the
>    result matches one of the parent's, then any hunk we might
>    produce by not pre-filtering would have the result that came from
>    one parent (i.e. the one identical to the result) and there is
>    only one other parent, which cannot make it an interesting
>    octopus by definition.
> 
>    But an octopus may merge three variants and pick the result from
>    one of the parents as a whole.  With the pre-filtering, no hunk
>    from such a path is shown, even when the other two variants in
>    "discarded" parents are not identical.

Yes, I meant that.

> 
> The original to refer to are two commits bf1c32bd (combine-diff:
> update --cc "uninteresting hunks" logic., 2006-02-02) and fd4b1d21
> (combine-diff: add safety check to --cc., 2006-02-02).
> 
> Especially, we need to pay close attention to the discussion that
> germinated the current behaviour:
> 
>   http://thread.gmane.org/gmane.comp.version-control.git/15486/focus=15519
> 
> I recall that the "diff --cc" before that change was not discarding
> uninteresting merges sufficiently and the two commits were a
> deliberate attempt to reject what your series wants to show as
> uninteresting hunks.  
> 
> Two suggestions.
> 
>  - This is primarily for 2/4, but can we make it more clear in the
>    code that we do this "include more" change only on Octopus
>    merges?  This change should not make any difference for two-way
>    merges and I'd prefer to avoid extra processing of finding
>    matching hunks and combining, only to discard the result.
> 
>  - Can you run "diff --cc" with and without your patches to the
>    "merge from hell" commit mentioned in the original thread and see
>    if we show more hunks with your patches, and make sure what are
>    shown additionally looked really "interesting"?


The diffs:
older version (without my patch):
https://gist.githubusercontent.com/max630/70080d38e8e9951b58a4/raw/diffcc_old.txt
newer version:
https://gist.githubusercontent.com/max630/70080d38e8e9951b58a4/raw/diffcc_new.txt

The outcome is quite different, yes.

First, I can see that file removal/deletion are passed
without filtering, even if file was only added/deleted
compared to some parents and not changed compared to any, so
it's in some way "2 versions" diff which should not be
shown. This I think is something should be fixed in addition
to this series.

Other than that, all other changes look "interesting",
meaning they contain more than 2 versions. Sometimes it's not
easy to spot because a long hunk is almost inherited from
one parent with all others being the same, and only in
couple of lines there when they differ. But I always find
some difference.

It was also quite slow, 11 seconds compared to some 0.4.
But for 2-way merges speed did not change.



My exact case was that there was a change in one branch
which was overwritten during merge conflict resolution by
fully acepting the other branch - in a 2-parent merge. I
started looking for a way to visualize such cases. They
are not visible in usual diff, because they look same as
accepting change compared to the unchange branch. To
recognize them you should consider the mergebase. I googled
to your suggestion in [1], and prepared a trivial testcase
for such a merge - in contained a base commit, change in
both branches and children which fully repeat one of the
branches. But then I discovered that such merge is not shown
in "diff --cc" because of the path filtering. So here am I.

Actually I think this is the very true rule which I would
like to be used: if there is a parent which is changed
compared to the merge base and child commit differs from
that parent. Because this is where there had been a valuable
change, and it disappeared. Linus in [2] wrote that those
change was fixing the same problem and it's good that they
are not shown, by it my case there were fixing different
problems, they just happened to be in the same place, and
after merge the bug reappeared.

[1] http://thread.gmane.org/gmane.comp.version-control.git/191553/focus=191557
[2] http://thread.gmane.org/gmane.comp.version-control.git/15486/focus=15519

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] t4059: rewrite to be adaptive to hunk filtering
  2015-04-03 15:58   ` [PATCH v2 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
@ 2015-04-12  5:48     ` Junio C Hamano
  2015-04-14  4:18       ` Max Kirillov
  0 siblings, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2015-04-12  5:48 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Jeff King, git

Max Kirillov <max@max630.net> writes:

> Looks like there is no exact specification what `diff -c` and
> `diff --cc` should output. Considering this it is not reasonable to
> hardcode any specific output in test.

OK.

> Rather, it should verify that file
> selection behaves the same as hunk selection.

Hrm, really?  "git diff --raw" and "git diff -p -w" on two trees
would not show identical set of paths, when the blob object are
different byte-wise but are equivalent at the content level per
given definition of equivalence (e.g. "-w").  Given that --cc is
to look at the differences at hunk/content level to combine and omit
uninteresting ones from the output, relative to -c output, I would
imagine that the file selection and the hunk selection are expected
to behave differently.

So, having said that I am a bit skeptical about the description of
the goals, there are a few nits on the implementation, too.

> +# the difference in short file must be returned if and only if it is shown in long file
> +for fn in win1 win2 merge delete base only1 only2 only1discard only2discard; do
> +	if git diff --cc merge branch1 branch2 mergebase -- long/$fn | grep -q '^[ +-]\{3\}2\(change[12]|merge\)\?$'
> +	then

Just like the earlier parts of this patch, write a newline when you
do not have to write a semicolon, and split lines after pipe when
your pipeline gets long, i.e.

	for fn in win2 win2 ...
        do
		if git diff --cc ... |
			grep 'pattern'
		then

Do I smell some GNUism in your "grep" patterns?

You have

    '^[ +-]\{3\}2\(change[12]|merge\)\?$'

but matching zero-or-one repetition with ? is not in BRE, and \? to
use it in BRE is a GNU extension.

Also in BRE , '|' is not an alternation (and making it into such
with '\|' in BRE is a GNU extension IIRC.

Worse, you are not using backslash here to invoke GNU extension, so
I suspect the grep invocations in the patch may not be working as
you expect.

Doubly worse, what is shown by "diff -c" for line 2 seems to be
three [- +] columns, followed by one of

    2
    2change1
    2change2
    2merged

followed by the end of line, so your "|merge" part may never match.

We encourage people to avoid \{m,n\}, because, it is not supported
by some implementations even though it is specified in POSIX for
BRE.

Perhaps spelling out what you are expecting a bit more explicitly,
e.g.

	grep -e "^[-+ ][-+ ][-+ ]2$" \
             -e "^[-+ ][-+ ][-+ ]2change[12]$" \
             -e "^[-+ ][-+ ][-+ ]2merged$"

might be a reasonable way to write this in a more portable and
readable way.  And turn that into a helper shell function to make it
more readable and maintainable for a bonus point.

Thanks.

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents
  2015-04-12  5:43       ` Max Kirillov
@ 2015-04-12  5:51         ` Junio C Hamano
  2015-04-14  4:22           ` Max Kirillov
  2015-04-14  4:09         ` [PATCH/RFC] combine-diff.c: make intersect_paths() behave like hunk filtering Max Kirillov
  1 sibling, 1 reply; 25+ messages in thread
From: Junio C Hamano @ 2015-04-12  5:51 UTC (permalink / raw)
  To: Max Kirillov; +Cc: Jeff King, git

Max Kirillov <max@max630.net> writes:

> My exact case was that there was a change in one branch
> which was overwritten during merge conflict resolution by
> fully acepting the other branch - in a 2-parent merge. I
> started looking for a way to visualize such cases. They
> are not visible in usual diff, because they look same as
> accepting change compared to the unchange branch.

Hmph, isn't that exactly why "diff -c" exists, not "diff --cc"
that omits (usually) uninteresting hunks?

^ permalink raw reply	[flat|nested] 25+ messages in thread

* [PATCH/RFC] combine-diff.c: make intersect_paths() behave like hunk filtering
  2015-04-12  5:43       ` Max Kirillov
  2015-04-12  5:51         ` Junio C Hamano
@ 2015-04-14  4:09         ` Max Kirillov
  1 sibling, 0 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-14  4:09 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Max Kirillov, Jeff King, git

* for `diff --cc` 2 cases:
  * the path must be changed since at least 2 parents, which
    should have the path different. In other words, the child and its
    parents must contain at least 3 different versions of file.
    Non-existing in the commit path counts as one version.
  * All parents are the same, but child commit differs from them.
* for `diff -c`: the path must be changed since at least 1 parent.

Signed-off-by: Max Kirillov <max@max630.net>
---
This what could be done to hide the added and removed files.
It also makes it work faster - diff --cc on the evil merge runs now 2.5 seconds
instead of 11 (or 0.4 without the series).

This also fails t4057, but I really don't see the logic of `diff -c`
othwerwise. It should show all changes, with trivial merges also, and it
is shown by long files in my t4059. But it uses same path filtering as
`diff --cc`, which is even more restrictive.
 combine-diff.c | 91 ++++++++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 66 insertions(+), 25 deletions(-)

diff --git a/combine-diff.c b/combine-diff.c
index 2285c7c..f44032a 100644
--- a/combine-diff.c
+++ b/combine-diff.c
@@ -50,20 +50,49 @@ static void insert_path(struct combine_diff_path **pos, const char* path, int n,
 	*pos = p;
 }
 
-static int changed_parents(struct combine_diff_path *p, int n)
+static int path_not_interesting(struct combine_diff_path *p, int n,
+				struct diff_filespec *new_parent)
 {
 	int parent_idx;
-	int result = 0;
+	struct object_id first_parent;
+	int found_first = 0;
+	int found_same_parent = 0;
 
 	for (parent_idx = 0; parent_idx < n; parent_idx++) {
-		if (p->parent[parent_idx].status != ' ')
-			result++;
+		if (p->parent[parent_idx].status != ' ') {
+			if (found_first) {
+				if (hashcmp(p->parent[parent_idx].oid.hash, first_parent.hash)) {
+					/* found second different unique parent - non-trivial merge */
+					return 0;
+				}
+			} else {
+				found_first = 1;
+				hashcpy(first_parent.hash,
+					p->parent[parent_idx].oid.hash);
+			}
+		} else {
+			/* the new commit repeats some of parents */
+			found_same_parent = 1;
+		}
 	}
 
-	return result;
+	if (new_parent) {
+		if (hashcmp(p->oid.hash, new_parent->sha1)) {
+			if (!found_same_parent || hashcmp(first_parent.hash, new_parent->sha1)) {
+				return 0;
+			} else {
+				return 1;
+			}
+		} else {
+			found_same_parent = 1;
+		}
+	}
+
+	return found_same_parent;
 }
 
-static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr, int n, int num_parent)
+static struct combine_diff_path *intersect_paths(
+	struct combine_diff_path *curr, int n, int num_parent, int dense)
 {
 	struct diff_queue_struct *q = &diff_queued_diff;
 	struct combine_diff_path *p, **tail = &curr;
@@ -81,35 +110,46 @@ static struct combine_diff_path *intersect_paths(struct combine_diff_path *curr,
 
 		if (cmp < 0) {
 			/* p->path not in q->queue[] */
-			if (num_parent > 2 && 2 - changed_parents(p, n) <= num_parent - n - 1) {
-				/* still can get 2 changed parents */
+			if (dense &&
+			    n == num_parent - 1 &&
+			    path_not_interesting(p, n, NULL)) {
+				/* only 1 unique different parent
+				   not interesting change */
+				*tail = p->next;
+				free(p);
+			} else {
+				/* already has or still can get 2 changed parents */
 				hashcpy(p->parent[n].oid.hash, p->oid.hash);
 				p->parent[n].mode = p->mode;
 				p->parent[n].status = ' ';
 				tail = &p->next;
-			} else {
-				*tail = p->next;
-				free(p);
 			}
 			continue;
-		}
-
-		if (cmp > 0) {
+		} else if (cmp > 0) {
 			/* q->queue[i] not in p->path */
-			if (1 <= num_parent - n - 1) {
-				insert_path(tail, q->queue[i]->two->path, n, num_parent, q->queue[i]);
+			if (!dense || n < num_parent - 1) {
+				insert_path(tail, q->queue[i]->two->path,
+					    n, num_parent, q->queue[i]);
 				tail = &(*tail)->next;
 			}
 			i++;
 			continue;
-		}
+		} else {
+			if (dense &&
+			    n == num_parent - 1 &&
+			    path_not_interesting(p, n, q->queue[i]->one)) {
+				*tail = p->next;
+				free(p);
+			} else {
+				hashcpy(p->parent[n].oid.hash, q->queue[i]->one->sha1);
+				p->parent[n].mode = q->queue[i]->one->mode;
+				p->parent[n].status = q->queue[i]->status;
 
-		hashcpy(p->parent[n].oid.hash, q->queue[i]->one->sha1);
-		p->parent[n].mode = q->queue[i]->one->mode;
-		p->parent[n].status = q->queue[i]->status;
+				tail = &p->next;
+			}
+			i++;
+		}
 
-		tail = &p->next;
-		i++;
 	}
 	return curr;
 }
@@ -1341,7 +1381,8 @@ static const char *path_path(void *obj)
 
 /* find set of paths that every parent touches */
 static struct combine_diff_path *find_paths_generic(const unsigned char *sha1,
-	const struct sha1_array *parents, struct diff_options *opt)
+	const struct sha1_array *parents, struct diff_options *opt,
+	int dense)
 {
 	struct combine_diff_path *paths = NULL;
 	int i, num_parent = parents->nr;
@@ -1367,7 +1408,7 @@ static struct combine_diff_path *find_paths_generic(const unsigned char *sha1,
 			opt->output_format = DIFF_FORMAT_NO_OUTPUT;
 		diff_tree_sha1(parents->sha1[i], sha1, "", opt);
 		diffcore_std(opt);
-		paths = intersect_paths(paths, i, num_parent);
+		paths = intersect_paths(paths, i, num_parent, dense);
 
 		/* if showing diff, show it in requested order */
 		if (opt->output_format != DIFF_FORMAT_NO_OUTPUT &&
@@ -1477,7 +1518,7 @@ void diff_tree_combined(const unsigned char *sha1,
 		 * diff(sha1,parent_i) for all i to do the job, specifically
 		 * for parent0.
 		 */
-		paths = find_paths_generic(sha1, parents, &diffopts);
+		paths = find_paths_generic(sha1, parents, &diffopts, dense);
 	}
 	else {
 		int stat_opt;
-- 
2.3.4.2801.g3d0809b

^ permalink raw reply related	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 4/4] t4059: rewrite to be adaptive to hunk filtering
  2015-04-12  5:48     ` Junio C Hamano
@ 2015-04-14  4:18       ` Max Kirillov
  0 siblings, 0 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-14  4:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Max Kirillov, Jeff King, git

On Sat, Apr 11, 2015 at 10:48:22PM -0700, Junio C Hamano wrote:
> Max Kirillov <max@max630.net> writes:
>> Rather, it should verify that file
>> selection behaves the same as hunk selection.
> 
> Hrm, really?  "git diff --raw" and "git diff -p -w" on two trees
> would not show identical set of paths, when the blob object are
> different byte-wise but are equivalent at the content level per
> given definition of equivalence (e.g. "-w").  Given that --cc is
> to look at the differences at hunk/content level to combine and omit
> uninteresting ones from the output, relative to -c output, I would
> imagine that the file selection and the hunk selection are expected
> to behave differently.

I mean, that if a hunk is shown in some long file, which
have other changes, with some option (-c or --cc), it should
also be shown if it is the only change in file.

> So, having said that I am a bit skeptical about the description of
> the goals, there are a few nits on the implementation, too.
> 
> > +# the difference in short file must be returned if and only if it is shown in long file
> > +for fn in win1 win2 merge delete base only1 only2 only1discard only2discard; do
> > +	if git diff --cc merge branch1 branch2 mergebase -- long/$fn | grep -q '^[ +-]\{3\}2\(change[12]|merge\)\?$'
> > +	then
> 
> Just like the earlier parts of this patch, write a newline when you
> do not have to write a semicolon, and split lines after pipe when
> your pipeline gets long, i.e.
> 
> 	for fn in win2 win2 ...
>         do
> 		if git diff --cc ... |
> 			grep 'pattern'
> 		then
> 
> Do I smell some GNUism in your "grep" patterns?
> 
> You have
> 
>     '^[ +-]\{3\}2\(change[12]|merge\)\?$'
> 
> but matching zero-or-one repetition with ? is not in BRE, and \? to
> use it in BRE is a GNU extension.
> 
> Also in BRE , '|' is not an alternation (and making it into such
> with '\|' in BRE is a GNU extension IIRC.
> 
> Worse, you are not using backslash here to invoke GNU extension, so
> I suspect the grep invocations in the patch may not be working as
> you expect.

Thank you, I missed it. I suspect it was working because
single "2" still matching. Will fix it and other issues.

-- 
Max

^ permalink raw reply	[flat|nested] 25+ messages in thread

* Re: [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents
  2015-04-12  5:51         ` Junio C Hamano
@ 2015-04-14  4:22           ` Max Kirillov
  0 siblings, 0 replies; 25+ messages in thread
From: Max Kirillov @ 2015-04-14  4:22 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Max Kirillov, Jeff King, git

On Sat, Apr 11, 2015 at 10:51:10PM -0700, Junio C Hamano wrote:
> Max Kirillov <max@max630.net> writes:
> 
>> My exact case was that there was a change in one branch
>> which was overwritten during merge conflict resolution by
>> fully acepting the other branch - in a 2-parent merge. I
>> started looking for a way to visualize such cases. They
>> are not visible in usual diff, because they look same as
>> accepting change compared to the unchange branch.
> 
> Hmph, isn't that exactly why "diff -c" exists, not "diff --cc"
> that omits (usually) uninteresting hunks?

No, this shows too many. If a change is done in one branch
but the other branch did not introduce any changes since
mergebase and they merged cleanly the merge should not be
shown, and `diff -c` seems to show them.

^ permalink raw reply	[flat|nested] 25+ messages in thread

end of thread, other threads:[~2015-04-14  4:22 UTC | newest]

Thread overview: 25+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-04-02 20:34 [PATCH 0/4] diff --cc: relax path filtering Max Kirillov
2015-04-02 20:34 ` [PATCH 1/4] Add test for showing discarded changes with diff --cc Max Kirillov
2015-04-02 20:55   ` Junio C Hamano
2015-04-03 16:03     ` Max Kirillov
2015-04-02 20:34 ` [PATCH 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
2015-04-02 20:34 ` [PATCH 3/4] diff --cc: relax too strict paths picking Max Kirillov
2015-04-02 20:59   ` Junio C Hamano
2015-04-02 20:34 ` [PATCH 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
2015-04-02 21:13 ` [PATCH 0/4] diff --cc: relax path filtering Jeff King
2015-04-03 16:29   ` Max Kirillov
2015-04-03 15:58 ` [PATCH v2 " Max Kirillov
2015-04-03 15:58   ` [PATCH v2 1/4] t4059: test 'diff --cc' with a change from only few parents Max Kirillov
2015-04-11 20:04     ` Junio C Hamano
2015-04-11 21:07     ` Junio C Hamano
2015-04-11 21:20       ` Junio C Hamano
2015-04-12  5:43       ` Max Kirillov
2015-04-12  5:51         ` Junio C Hamano
2015-04-14  4:22           ` Max Kirillov
2015-04-14  4:09         ` [PATCH/RFC] combine-diff.c: make intersect_paths() behave like hunk filtering Max Kirillov
2015-04-03 15:58   ` [PATCH v2 2/4] combine-diff.c: refactor: extract insert_path() Max Kirillov
2015-04-11 20:14     ` Junio C Hamano
2015-04-03 15:58   ` [PATCH v2 3/4] diff --cc: relax too strict paths picking Max Kirillov
2015-04-03 15:58   ` [PATCH v2 4/4] t4059: rewrite to be adaptive to hunk filtering Max Kirillov
2015-04-12  5:48     ` Junio C Hamano
2015-04-14  4:18       ` Max Kirillov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.