git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Two RFC/WIP series for facilitating merge conflict resolution
@ 2018-08-06 22:44 Elijah Newren
  2018-08-06 22:45 ` [RFC/WIP PATCH 0/1] Simplify handling of directory/file conflicts Elijah Newren
  2018-08-06 22:47 ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Elijah Newren
  0 siblings, 2 replies; 13+ messages in thread
From: Elijah Newren @ 2018-08-06 22:44 UTC (permalink / raw)
  To: Git Mailing List

Hi everyone,

Last week, Duy posted an interesting RFC patch for improving merge
conflict resolution by coloring the "CONFLICT' messages that are
output during a merge. I have two more ideas (complementary to his)
for improving merge conflict resolution experience.  My two ideas are
mostly independent of each other but with a couple important ties;
I'll shortly respond with a short RFC/WIP series for each.

Some notes applicable to both:
  - Each has a very long cover letter, due to trying to anticipate
questions/surprises and responding to each.  It may be simpler to skip
the cover letter (or just read the up-front summary) and see if the
patches make sense,
  - I am planning to imminently start looking into my merge-recursive
rewrite[1].  One possibility is just doing these changes as part of
the new strategy...thoughts?


Elijah

[1] See part E from
https://public-inbox.org/git/CABPp-BFQJZHfCJZ1qvhvVcMd-_sOfi0Fkm5PexEwzzN+Zw2akw@mail.gmail.com/;
also the two RFCs I'm about to post are part C from that email.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC/WIP PATCH 0/1] Simplify handling of directory/file conflicts
  2018-08-06 22:44 Two RFC/WIP series for facilitating merge conflict resolution Elijah Newren
@ 2018-08-06 22:45 ` Elijah Newren
  2018-08-06 22:45   ` [RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve Elijah Newren
  2018-08-06 22:47 ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Elijah Newren
  1 sibling, 1 reply; 13+ messages in thread
From: Elijah Newren @ 2018-08-06 22:45 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Directory/file conflicts are more difficult than they need to be for users
to resolve (and to un-resolve).  Simplify that process by doing to the
index what we do to the working tree: renaming the file in the
directory/file conflict to a different location.  This also avoids leaving
cruft untracked files around if the user decides to abort the merge.

From one angle this proposal might appear surprising, so extended rationale
for this change can be found below (and if it seems surprising, some of the
below may need to be moved into the commit message for the patch).


== What git does, prior to this series ==

Let's say there's a directory/file conflict for path 'foo'.  One might see
git report:
  CONFLICT (file/directory): There is a directory with name foo in
  BRANCH2. Adding foo as foo~BRANCH
Further, at this point, git will record:
  * foo~BRANCH in the working tree
  * foo/ in the working tree
  * foo at higher order stage in the index
  * foo/* entries found in the index (at stage 0)

== User experience resolving directory/file conflicts ==

Let's say the user wants to resolve by just moving 'foo' (the file) to
somewhere else.  Here's five different things a user might try at this
point (not sequentially, but rather competing ideas they might try),
along with commentary about how each fails to resolve the conflict:

$ git mv foo other
  # Moves the directory instead, oops

$ mv foo~BRANCH other
$ git add other
  # Still leaves 'foo' conflicted in the index

$ git mv foo~BRANCH other
  # Error: "Not under source control"

$ git add -u
  # Removes conflict entry for 'foo' from index, but doesn't add
  # new one and leaves foo~BRANCH around untracked

$ git rm foo
  # Doesn't work ("foo: needs merge...fatal: git rm: 'foo': Is a directory)

== User experience un-resolving directory/file conflict ==

If the user decides they don't like the merge and run 'git merge --abort',
the abort fails due to a separate bug being fixed here:

  https://public-inbox.org/git/20180713163331.22446-1-newren@gmail.com/

However, even once the fixes there are part of git, a 'git merge --abort'
will leave behind new untracked files that were created by the merge
attempt (in the case above, one named foo~BRANCH).  This is suboptimal.

== Correct solution; old and new ==

Currently, this is what a user needs to run to resolve this conflict:

$ mv foo~BRANCH other
$ git add other
$ git rm --cached foo

If git would record foo~BRANCH at a higher stage in the index instead
of recording foo there, then we could shorten this to:

$ git add foo~BRANCH
$ git mv foo~BRANCH other

If we could also teach git-mv to quit reporting "not under version
control" for index entries with higher order stages (and instead rename
them while keeping them as higher order stages), then we could also allow
those two commands to be reversed:

$ git mv foo~BRANCH other
$ git add other

While this change to what git records in the index might feel like a lie,
it does make it easier for the user and we already have a precedent for
treating user convience as more important than trying to represent how we
got to the current state, as shown in the following analogy:

== Rename analogy ==

If one side of history renames A->B but there are content conflicts, one
choice for the contents for the index would be:
  <mode> <original sha> 1    A
  <mode> <side-one sha> 2    A
  <mode> <side-two sha> 3    B
However, that's not what git does.  In particular, this choice would require
the user to run both 'git add B' and 'git rm --cached A' to resolve the
conflict.  Further, it prevents the user from running commands such as
  git checkout [--ours|--theirs|--conflict|-m] B
  git diff [--ours|--theirs|--base] B
This would also make it harder to pair up entries from 'git ls-files -u'
(especially if there are many entries between A and B), since nothing
marks them as related anymore.  So, instead, git records the following in
the index:
  <mode> <original sha> 1    B
  <mode> <side-one sha> 2    B
  <mode> <side-two sha> 3    B
This might seem like a lie if you view the index as a place to record how
we got to the current state rather than a way to help users resolve
conflicts and update state, but it is certainly far more convenient for
the user to work with.  Follow suit with directory/file conflicts.


Elijah Newren (1):
  merge-recursive: make file/directory conflicts easier to resolve

 merge-recursive.c                    | 38 ++++++++++++++++++++++------
 t/t3030-merge-recursive.sh           | 16 ++++++------
 t/t6020-merge-df.sh                  |  4 +--
 t/t6022-merge-rename.sh              |  4 +--
 t/t6036-recursive-corner-cases.sh    |  5 ++--
 t/t6042-merge-rename-corner-cases.sh |  4 +--
 t/t6043-merge-rename-directories.sh  |  4 +--
 7 files changed, 49 insertions(+), 26 deletions(-)

-- 
2.18.0.550.g44d6daf40a.dirty

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve
  2018-08-06 22:45 ` [RFC/WIP PATCH 0/1] Simplify handling of directory/file conflicts Elijah Newren
@ 2018-08-06 22:45   ` Elijah Newren
  2018-08-09 17:36     ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Elijah Newren @ 2018-08-06 22:45 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

File/directory conflicts are somewhat difficult to resolve; by way of
contrast, renames are much easier to handle.  For file/directory
conflicts, we already have to record the file under a different name in
the working copy due to the directory being in the way; if we also record
the file under a different name in the index then it simplifies matters
for the user, and ensures that 'git reset --hard' and 'git merge --abort'
will clean up files created by the merge operation.

Note that there are multiple ways to get a file/directory conflict:
  * add/add (one side adds a file, the other a directory)
  * modify/delete (the side that deletes puts a directory in the way)
  * rename vs add-directory (directory added on one side in way of rename)
As such, there are multiple code paths that need updating.

FIXME: Several testcases were updated to show how this affected the
  testsuite in general, but there are still 11 more tests across five
  testfiles that need fixing.  As I said, this is just WIP/RFC.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                    | 38 ++++++++++++++++++++++------
 t/t3030-merge-recursive.sh           | 16 ++++++------
 t/t6020-merge-df.sh                  |  4 +--
 t/t6022-merge-rename.sh              |  4 +--
 t/t6036-recursive-corner-cases.sh    |  5 ++--
 t/t6042-merge-rename-corner-cases.sh |  4 +--
 t/t6043-merge-rename-directories.sh  |  4 +--
 7 files changed, 49 insertions(+), 26 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1446e92bea..34906b0f90 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1527,6 +1527,20 @@ static int handle_change_delete(struct merge_options *o,
 		 * path.  We could call update_file_flags() with update_cache=0
 		 * and update_wd=0, but that's a no-op.
 		 */
+		if (!o->call_depth && alt_path) {
+			struct diff_filespec orig, new;
+			int stage = (change_branch == o->branch1) ? 2 : 3;
+
+			remove_file_from_cache(path);
+			orig.mode = o_mode;
+			oidcpy(&orig.oid, o_oid);
+			new.mode = changed_mode;
+			oidcpy(&new.oid, changed_oid);
+			if (update_stages(o, alt_path, &orig,
+					  stage == 2 ? &new : NULL,
+					  stage == 3 ? &new : NULL))
+				ret = -1;
+		}
 		if (change_branch != o->branch1 || alt_path)
 			ret = update_file(o, 0, changed_oid, changed_mode, update_path);
 	}
@@ -3089,11 +3103,11 @@ static int merge_content(struct merge_options *o,
 
 	if (df_conflict_remains || is_dirty) {
 		char *new_path;
-		if (o->call_depth) {
-			remove_file_from_cache(path);
-		} else {
+		new_path = unique_path(o, path, rename_conflict_info->branch1);
+		remove_file_from_cache(path);
+		if (!o->call_depth) {
 			if (!mfi.clean) {
-				if (update_stages(o, path, &one, &a, &b))
+				if (update_stages(o, new_path, &one, &a, &b))
 					return -1;
 			} else {
 				int file_from_stage2 = was_tracked(o, path);
@@ -3101,14 +3115,13 @@ static int merge_content(struct merge_options *o,
 				oidcpy(&merged.oid, &mfi.oid);
 				merged.mode = mfi.mode;
 
-				if (update_stages(o, path, NULL,
+				if (update_stages(o, new_path, NULL,
 						  file_from_stage2 ? &merged : NULL,
 						  file_from_stage2 ? NULL : &merged))
 					return -1;
 			}
 
 		}
-		new_path = unique_path(o, path, rename_conflict_info->branch1);
 		if (is_dirty) {
 			output(o, 1, _("Refusing to lose dirty file at %s"),
 			       path);
@@ -3244,10 +3257,19 @@ static int process_entry(struct merge_options *o,
 			output(o, 1, _("CONFLICT (%s): There is a directory with name %s in %s. "
 			       "Adding %s as %s"),
 			       conf, path, other_branch, path, new_path);
+			remove_file_from_cache(path);
+			if (!o->call_depth) {
+				struct diff_filespec dfs;
+
+				dfs.mode = mode;
+				oidcpy(&dfs.oid, oid);
+				if (update_stages(o, new_path, NULL,
+						  a_oid ? &dfs : NULL,
+						  a_oid ? NULL : &dfs))
+					clean_merge = -1;
+			}
 			if (update_file(o, 0, oid, mode, new_path))
 				clean_merge = -1;
-			else if (o->call_depth)
-				remove_file_from_cache(path);
 			free(new_path);
 		} else {
 			output(o, 2, _("Adding %s"), path);
diff --git a/t/t3030-merge-recursive.sh b/t/t3030-merge-recursive.sh
index ff641b348a..6d456da001 100755
--- a/t/t3030-merge-recursive.sh
+++ b/t/t3030-merge-recursive.sh
@@ -369,9 +369,9 @@ test_expect_success 'merge-recursive d/f conflict result' '
 
 	git ls-files -s >actual &&
 	(
-		echo "100644 $o0 1	a" &&
-		echo "100644 $o1 2	a" &&
 		echo "100644 $o4 0	a/c" &&
+		echo "100644 $o0 1	a~$c1" &&
+		echo "100644 $o1 2	a~$c1" &&
 		echo "100644 $o0 0	b" &&
 		echo "100644 $o0 0	c" &&
 		echo "100644 $o1 0	d/e"
@@ -393,9 +393,9 @@ test_expect_success 'merge-recursive d/f conflict result the other way' '
 
 	git ls-files -s >actual &&
 	(
-		echo "100644 $o0 1	a" &&
-		echo "100644 $o1 3	a" &&
 		echo "100644 $o4 0	a/c" &&
+		echo "100644 $o0 1	a~$c1" &&
+		echo "100644 $o1 3	a~$c1" &&
 		echo "100644 $o0 0	b" &&
 		echo "100644 $o0 0	c" &&
 		echo "100644 $o1 0	d/e"
@@ -420,9 +420,9 @@ test_expect_success 'merge-recursive d/f conflict result' '
 		echo "100644 $o1 0	a" &&
 		echo "100644 $o0 0	b" &&
 		echo "100644 $o0 0	c" &&
-		echo "100644 $o6 3	d" &&
 		echo "100644 $o0 1	d/e" &&
-		echo "100644 $o1 2	d/e"
+		echo "100644 $o1 2	d/e" &&
+		echo "100644 $o6 3	d~$c6"
 	) >expected &&
 	test_cmp expected actual
 
@@ -444,9 +444,9 @@ test_expect_success 'merge-recursive d/f conflict result' '
 		echo "100644 $o1 0	a" &&
 		echo "100644 $o0 0	b" &&
 		echo "100644 $o0 0	c" &&
-		echo "100644 $o6 2	d" &&
 		echo "100644 $o0 1	d/e" &&
-		echo "100644 $o1 3	d/e"
+		echo "100644 $o1 3	d/e" &&
+		echo "100644 $o6 2	d~$c6"
 	) >expected &&
 	test_cmp expected actual
 
diff --git a/t/t6020-merge-df.sh b/t/t6020-merge-df.sh
index 2af1beec5f..f4ea318ec8 100755
--- a/t/t6020-merge-df.sh
+++ b/t/t6020-merge-df.sh
@@ -81,7 +81,7 @@ test_expect_success 'modify/delete + directory/file conflict' '
 
 	test 5 -eq $(git ls-files -s | wc -l) &&
 	test 4 -eq $(git ls-files -u | wc -l) &&
-	test 1 -eq $(git ls-files -o | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
 
 	test -f letters/file &&
 	test -f letters.txt &&
@@ -100,7 +100,7 @@ test_expect_success 'modify/delete + directory/file conflict; other way' '
 
 	test 5 -eq $(git ls-files -s | wc -l) &&
 	test 4 -eq $(git ls-files -u | wc -l) &&
-	test 1 -eq $(git ls-files -o | wc -l) &&
+	test 0 -eq $(git ls-files -o | wc -l) &&
 
 	test -f letters/file &&
 	test -f letters.txt &&
diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
index b760c223c6..0ea3760265 100755
--- a/t/t6022-merge-rename.sh
+++ b/t/t6022-merge-rename.sh
@@ -385,7 +385,7 @@ test_expect_success 'Rename+D/F conflict; renamed file cannot merge and dir in t
 	test_must_fail git merge --strategy=recursive dir-in-way &&
 
 	test 5 -eq "$(git ls-files -u | wc -l)" &&
-	test 3 -eq "$(git ls-files -u dir | grep -v file-in-the-way | wc -l)" &&
+	test 3 -eq "$(git ls-files -u dir~HEAD | wc -l)" &&
 	test 2 -eq "$(git ls-files -u dir/file-in-the-way | wc -l)" &&
 
 	test_must_fail git diff --quiet &&
@@ -421,7 +421,7 @@ test_expect_success 'Same as previous, but merged other way' '
 	test_must_fail git merge --strategy=recursive renamed-file-has-conflicts &&
 
 	test 5 -eq "$(git ls-files -u | wc -l)" &&
-	test 3 -eq "$(git ls-files -u dir | grep -v file-in-the-way | wc -l)" &&
+	test 3 -eq "$(git ls-files -u dir~renamed-file-has-conflicts | wc -l)" &&
 	test 2 -eq "$(git ls-files -u dir/file-in-the-way | wc -l)" &&
 
 	test_must_fail git diff --quiet &&
diff --git a/t/t6036-recursive-corner-cases.sh b/t/t6036-recursive-corner-cases.sh
index 59e52c5a09..f8790e6975 100755
--- a/t/t6036-recursive-corner-cases.sh
+++ b/t/t6036-recursive-corner-cases.sh
@@ -471,7 +471,7 @@ test_expect_success 'setup differently handled merges of directory/file conflict
 		test_must_fail git merge C^0 &&
 		git clean -fd &&
 		git rm -rf a/ &&
-		git rm a &&
+		git rm a~HEAD &&
 		git cat-file -p B:a >a2 &&
 		git add a2 &&
 		git commit -m D2 &&
@@ -492,6 +492,7 @@ test_expect_success 'setup differently handled merges of directory/file conflict
 		test_must_fail git merge B^0 &&
 		git clean -fd &&
 		git rm -rf a/ &&
+		git rm a~B^0 &&
 		test_write_lines 1 2 3 4 5 6 7 8 >a &&
 		git add a &&
 		git commit -m E3 &&
@@ -501,7 +502,7 @@ test_expect_success 'setup differently handled merges of directory/file conflict
 		test_must_fail git merge B^0 &&
 		git clean -fd &&
 		git rm -rf a/ &&
-		git rm a &&
+		git rm a~B^0 &&
 		test_write_lines 1 2 3 4 5 6 7 8 >a2 &&
 		git add a2 &&
 		git commit -m E4 &&
diff --git a/t/t6042-merge-rename-corner-cases.sh b/t/t6042-merge-rename-corner-cases.sh
index 07dd09d985..25cb50478c 100755
--- a/t/t6042-merge-rename-corner-cases.sh
+++ b/t/t6042-merge-rename-corner-cases.sh
@@ -305,14 +305,14 @@ test_expect_success 'rename/directory conflict + clean content merge' '
 		git ls-files -u >out &&
 		test_line_count = 1 out &&
 		git ls-files -o >out &&
-		test_line_count = 2 out &&
+		test_line_count = 1 out &&
 
 		echo 0 >expect &&
 		git cat-file -p base:file >>expect &&
 		echo 7 >>expect &&
 		test_cmp expect newfile~HEAD &&
 
-		test $(git rev-parse :2:newfile) = $(git hash-object expect) &&
+		test $(git rev-parse :2:newfile~HEAD) = $(git hash-object expect) &&
 
 		test_path_is_file newfile/realfile &&
 		test_path_is_file newfile~HEAD
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 4a71f17edd..12047a3309 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1160,10 +1160,10 @@ test_expect_success '5d-check: Directory/file/file conflict due to directory ren
 		git ls-files -u >out &&
 		test_line_count = 1 out &&
 		git ls-files -o >out &&
-		test_line_count = 2 out &&
+		test_line_count = 1 out &&
 
 		git rev-parse >actual \
-			:0:y/b :0:y/c :0:z/d :0:y/f :2:y/d :0:y/d/e &&
+			:0:y/b :0:y/c :0:z/d :0:y/f :2:y/d~HEAD :0:y/d/e &&
 		git rev-parse >expect \
 			 O:z/b  O:z/c  B:z/d  B:z/f  A:y/d  B:y/d/e &&
 		test_cmp expect actual &&
-- 
2.18.0.550.g44d6daf40a.dirty


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts
  2018-08-06 22:44 Two RFC/WIP series for facilitating merge conflict resolution Elijah Newren
  2018-08-06 22:45 ` [RFC/WIP PATCH 0/1] Simplify handling of directory/file conflicts Elijah Newren
@ 2018-08-06 22:47 ` Elijah Newren
  2018-08-06 22:47   ` [RFC/WIP PATCH 1/3] rerere: avoid buffer overrun Elijah Newren
                     ` (3 more replies)
  1 sibling, 4 replies; 13+ messages in thread
From: Elijah Newren @ 2018-08-06 22:47 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

== Summary ==

For non-textual conflicts, I would like to provide additional information
in the working copy in the form of additional conflict markers and
explanatory text stating what type of non-textual conflict was involved.
This should
  * Make it clearer to users what conflicts they are dealing with and why
  * Enable new features like Thomas Rast' old remerge-diff proposal[1]

[1] https://public-inbox.org/git/cover.1409860234.git.tr@thomasrast.ch/

If this sounds rather imprecise, concrete examples are provided in the
next section of this email.  If this change sounds surprising or
non-intuitive, more detailed rationale motivating this change (which is
admittedly slightly non-obvious) can be found in the remainder of this
email.  It may be the more of the information below needs to be moved into
the commit message for patch 3.

== Examples of Proposal ==

There are two basic types of changes at play here, each best shown with a
representative example:

1) Representative example: A modify/delete conflict; the path in question
in the working tree would have conflict information at the top of the file
followed by the normal file contents; thus it could be of the form:

    <<<<<<<< HEAD
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
    ========
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
    >>>>>>>> BRANCH
    Lorem ipsum dolor sit amet, consectetuer sadipscing elitr,
    sed diam nonumy eirmod tempor invidunt ut labore et dolore
    magna aliquyam erat, sed diam voluptua. At vero eos et
    accusam et justo duo dolores et ea rebum. Stet clita kasd
    gubergren, no sea takimata sanctus est Lorem ipsum dolor
    sit amet.

Alternative ideas for handling the explanatory text here are welcome.  I
chose to use identical text on both sides of the conflict in an attempt
to highlight that this isn't a normal textual conflict and the text isn't
meant to be part of the file.

This type of example could apply for each of the following types of
conflicts:
  * modify/delete
  * rename/delete
  * directory/file
  * submodule/file
  * symlink/file
  * rename/rename(1to2)
  * executable mode conflict (i.e. 100644 vs. 100755 mode; could come
    from add/add or modify/delete or rename/delete)

It could also be used for the following types of conflicts to help
differentiate between it and other conflict types:
  * add/add
  * rename/add[/delete]
  * rename/rename(2to1)[/delete[/delete]]
  * rename/rename(1to2)/add[/add]

However, any of the types above would be inappropriate if the regular
file(s) in question were binary; in those cases, they'd actually fall
into category two:


2) Representative example: A binary edit/edit conflict.  In this case,
it would be inappropriate to put the conflict markers inside the
binary file.  Instead, we create another file (e.g. path~CONFLICTS)
and put conflict markers in it:

    <<<<<<<< HEAD
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      BINARY conflict: path foo modified in both branches
    ========
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      BINARY conflict: path foo modified in both branches
    >>>>>>>> BRANCH

This file would also be added to the index at stage 1 (so that 'git merge
--abort' would clean this file out instead of leaving it around untracked,
and also because 'git status' would report "deleted in both" which seems
reasonable).

This type of example could apply for each of the following types of
conflicts:
  * binary edit/edit
  * any of the conflicts from type 1 when binary files are involved
  * symlink edit/edit (or add/add)
  * symlink/submodule
  * symlink/directory
  * directory/submodule
  * submodule/submodule

It could also apply to the following new corner case conflict types from
directory rename detection:
  * N-way colliding paths (N>=2) due to directory renames
  * directory rename split; half renamed to one directory and half to another


== Motivation, part 1: Problem statement ==

When conflicts arise we need ways to inform the user of the existence of
the conflicts and their nature.  For textual conflicts with regular files,
we have a simple way of doing this: inserting conflict markers into the
relevant region of the file with both conflicting versions present.
Importantly, this representation of the conflict is present in the working
copy.

For other types of conflicts (path-based or non-regular files), we often
provide no hint in the working copy about either the existence or the
nature of the conflict.  I think this is suboptimal from a users'
point-of-view, and is also limiting some feature development.

== Motivation, part 2: Current non-textual conflict hints ==

For non-textual conflicts, the hints git currently gives the user come in
two forms: messages printed during the merge, and higher order stages in
the index.  Both have some downsides.

For large repos, conflict messages ("e.g. CONFLICT(modify/delete): ...")
printed during the merge can easily be "lost in the noise" and might even
be inaccessible depending on the terminal scrollback buffer size.
Further, as the user begins resolving conflicts in that terminal, it
becomes harder and harder to find the original conflict messages for the
remaining paths.

While higher order stages in the index can be helpful, there are many more
conflict types than there are permutations of higher order stages.  To
name just one example, if all three higher order stages exist, what type
of conflict is it?  It could be an edit/edit conflict, or a
rename/add/delete conflict, or even a file from a directory/file conflict
if that file was involved in a rename.

== Motivation, part 3: Disappearing conflict hints ==

I want to revive Thomas Rast' remerge-diff feature proposal.  To implement
that feature, he essentially does an auto-merge of the parent commits and
records a resulting tree.  That tree includes conflict information, namely
in the form of files that have conflict markers in them.  He then diffs
this auto-merged tree to the actual tree of the merge commit.

I like the idea of an auto-merge tree with conflict information, but note
that this means printed conflict messages and higher order index entries
will be _completely_ lost, making it important that there be a way of
storing hints about conflicts in the working tree.

(Side note: Thomas' old proposal partially address this; he takes paths
that only had either a stage 2 or 3 entry and does a two-way diff with an
empty file.  That is a very reasonable first cut, but it misses lots of
information.  For example, binary conflicts and mode conflicts would
essentially be ignored.  Differentation between conflict types -- which
may be important or helpful to users trying to understand what happened --
would be lost.)


Elijah Newren (3):
  rerere: avoid buffer overrun
  merge-recursive: fix handling of submodules in modify/delete conflicts
  merge-recursive: provide more conflict hints for non-textual conflicts

 merge-recursive.c                   | 135 +++++++++++++++++++++++++++-
 rerere.c                            |   2 +-
 t/t3031-merge-criscross.sh          |   2 +
 t/t6022-merge-rename.sh             |  39 ++------
 t/t6043-merge-rename-directories.sh |   4 +-
 t/t7610-mergetool.sh                |   4 +
 6 files changed, 146 insertions(+), 40 deletions(-)

-- 
2.18.0.550.g44d6daf40a.dirty


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC/WIP PATCH 1/3] rerere: avoid buffer overrun
  2018-08-06 22:47 ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Elijah Newren
@ 2018-08-06 22:47   ` Elijah Newren
  2018-08-06 22:47   ` [RFC/WIP PATCH 2/3] merge-recursive: fix handling of submodules in modify/delete conflicts Elijah Newren
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2018-08-06 22:47 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

check_one_conflict() compares `i` to `active_nr` in two places to avoid
buffer overruns, but left out an important third location.  This has
not previously been a problem, because existing merge strategies have
tended to not create entries at stage #1 that do not have a
corresponding entry at either stage #2 or stage #3.  However, this is
not guaranteed, so add a check to avoid segfaults.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 rerere.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/rerere.c b/rerere.c
index 16c8aac621..7d22fb08c7 100644
--- a/rerere.c
+++ b/rerere.c
@@ -533,7 +533,7 @@ static int check_one_conflict(int i, int *type)
 	}
 
 	*type = PUNTED;
-	while (ce_stage(active_cache[i]) == 1)
+	while (i < active_nr && ce_stage(active_cache[i]) == 1)
 		i++;
 
 	/* Only handle regular files with both stages #2 and #3 */
-- 
2.18.0.550.g44d6daf40a.dirty

^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC/WIP PATCH 2/3] merge-recursive: fix handling of submodules in modify/delete conflicts
  2018-08-06 22:47 ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Elijah Newren
  2018-08-06 22:47   ` [RFC/WIP PATCH 1/3] rerere: avoid buffer overrun Elijah Newren
@ 2018-08-06 22:47   ` Elijah Newren
  2018-08-06 22:47   ` [RFC/WIP PATCH 3/3] merge-recursive: provide more conflict hints for non-textual conflicts Elijah Newren
  2018-08-09 17:52   ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Junio C Hamano
  3 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2018-08-06 22:47 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

Similar to commit c641ca670729 ("merge-recursive: handle addition of
submodule on our side of history", 2017-11-14) a submodule can be
confused for a D/F conflict for modify/delete and rename/delete
conflicts.  (To the code, it appears there is a directory in the way of
us writing our expected path to the working tree, because our path is a
submodule; i.e. the submodule is itself the directory and any directory
is assumed to be a D/F conflict that is in the way.)  So, when we are
dealing with a submodule, avoid checking the working copy for a
directory being in the way.

Signed-off-by: Elijah Newren <newren@gmail.com>
---

It might be better to first check that the submodule existed on the HEAD
side of the merge, because there could be a directory in the way of the
submodule.  But that's starting to get ugly quick, and the real fix to
make this cleaner is the rewrite of merge-recursive that does an index-only
merge first, then updates the working copy later...

 merge-recursive.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1446e92bea..4832234073 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1466,7 +1466,7 @@ static int handle_change_delete(struct merge_options *o,
 	const char *update_path = path;
 	int ret = 0;
 
-	if (dir_in_way(path, !o->call_depth, 0) ||
+	if (dir_in_way(path, !o->call_depth && !S_ISGITLINK(changed_mode), 0) ||
 	    (!o->call_depth && would_lose_untracked(path))) {
 		update_path = alt_path = unique_path(o, path, change_branch);
 	}
-- 
2.18.0.550.g44d6daf40a.dirty


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [RFC/WIP PATCH 3/3] merge-recursive: provide more conflict hints for non-textual conflicts
  2018-08-06 22:47 ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Elijah Newren
  2018-08-06 22:47   ` [RFC/WIP PATCH 1/3] rerere: avoid buffer overrun Elijah Newren
  2018-08-06 22:47   ` [RFC/WIP PATCH 2/3] merge-recursive: fix handling of submodules in modify/delete conflicts Elijah Newren
@ 2018-08-06 22:47   ` Elijah Newren
  2018-08-09 17:52   ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Junio C Hamano
  3 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2018-08-06 22:47 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren

For non-textual conflicts, provide additional information in the working
copy in the form of additional conflict markers and explanatory text
stating what type of non-textual conflict was involved.  In particular,
regular files involved in path-based conflicts would have something like
the following at the beginning of the file:

    <<<<<<<< HEAD
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
    ========
    Conflict hint: This block of text was not part of the original
    branch; it serves instead to hint about non-textual conflicts:
      MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
    >>>>>>>> BRANCH

When non regular files (binaries, symlinks, etc.) are involved in
conflicts, we instead put this information in a separate file that only
contains this conflict information.

The goals of providing this extra information are:
  * Make it clearer to users what conflicts they are dealing with and why
  * Enable new features like Thomas Rast' old remerge-diff proposal[1]

[1] https://public-inbox.org/git/cover.1409860234.git.tr@thomasrast.ch/

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 133 +++++++++++++++++++++++++++-
 t/t3031-merge-criscross.sh          |   2 +
 t/t6022-merge-rename.sh             |  39 ++------
 t/t6043-merge-rename-directories.sh |   4 +-
 t/t7610-mergetool.sh                |   4 +
 5 files changed, 144 insertions(+), 38 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 4832234073..cdfe9824d2 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -313,6 +313,96 @@ static void output_commit_title(struct merge_options *o, struct commit *commit)
 	flush_output(o);
 }
 
+static void write_conflict_notice(struct strbuf *buf,
+				  struct strbuf *notice,
+				  int use_crlf)
+{
+	int marker_size = 8;
+
+	strbuf_addchars(buf, '<', marker_size);
+	if (use_crlf)
+		strbuf_addch(buf, '\r');
+	strbuf_addch(buf, '\n');
+
+	strbuf_addbuf(buf, notice);
+	if (use_crlf)
+		strbuf_addch(buf, '\r');
+	strbuf_addch(buf, '\n');
+
+	strbuf_addchars(buf, '=', marker_size);
+	if (use_crlf)
+		strbuf_addch(buf, '\r');
+	strbuf_addch(buf, '\n');
+
+	strbuf_addbuf(buf, notice);
+	if (use_crlf)
+		strbuf_addch(buf, '\r');
+	strbuf_addch(buf, '\n');
+
+	strbuf_addchars(buf, '>', marker_size);
+	if (use_crlf)
+		strbuf_addch(buf, '\r');
+	strbuf_addch(buf, '\n');
+}
+
+static int create_non_textual_conflict_file(struct merge_options *o,
+					    struct strbuf *notice,
+					    const char *path,
+					    struct object_id *oid)
+{
+	struct strbuf contents = STRBUF_INIT;
+	int ret = 0;
+	int use_crlf = 0; /* FIXME: Determine platform default?? */
+
+	write_conflict_notice(&contents, notice, use_crlf);
+
+	if (write_object_file(contents.buf, contents.len, "blob", oid))
+		ret = err(o, _("Unable to add %s to database"), path);
+
+	strbuf_release(&contents);
+	return ret;
+}
+
+static int insert_non_textual_conflict_header(struct merge_options *o,
+					      struct strbuf *notice,
+					      const char *path,
+					      struct object_id *oid,
+					      unsigned int mode)
+{
+	struct strbuf header = STRBUF_INIT;
+	struct strbuf dst_buf = STRBUF_INIT;
+	enum object_type type;
+	char *buf;
+	unsigned long size;
+	char *end;
+	int use_crlf;
+	int ret = 0;
+
+	if (!S_ISREG(mode))
+		BUG("insert_non_textual_conflict_header called on file with wrong mode: %0d", mode);
+
+	buf = read_object_file(oid, &type, &size);
+	if (!buf)
+		return err(o, _("cannot read object %s '%s'"), oid_to_hex(oid), path);
+	if (type != OBJ_BLOB) {
+		return err(o, _("blob expected for %s '%s'"), oid_to_hex(oid), path);
+	}
+
+	end = strchrnul(buf, '\n');
+	use_crlf = (end > buf && end[-1] == '\r');
+	write_conflict_notice(&header, notice, use_crlf);
+
+	strbuf_addbuf(&dst_buf, &header);
+	strbuf_add(&dst_buf, buf, size);
+
+	if (write_object_file(dst_buf.buf, dst_buf.len, type_name(type), oid))
+		ret = err(o, _("Unable to add %s to database"), path);
+
+	strbuf_release(&header);
+	strbuf_release(&dst_buf);
+	return ret;
+}
+
 static int add_cacheinfo(struct merge_options *o,
 			 unsigned int mode, const struct object_id *oid,
 			 const char *path, int stage, int refresh, int options)
@@ -1464,6 +1554,9 @@ static int handle_change_delete(struct merge_options *o,
 {
 	char *alt_path = NULL;
 	const char *update_path = path;
+	struct object_id new_oid;
+	struct strbuf sb = STRBUF_INIT;
+	int is_binary;
 	int ret = 0;
 
 	if (dir_in_way(path, !o->call_depth && !S_ISGITLINK(changed_mode), 0) ||
@@ -1527,8 +1620,44 @@ static int handle_change_delete(struct merge_options *o,
 		 * path.  We could call update_file_flags() with update_cache=0
 		 * and update_wd=0, but that's a no-op.
 		 */
-		if (change_branch != o->branch1 || alt_path)
-			ret = update_file(o, 0, changed_oid, changed_mode, update_path);
+		oidcpy(&new_oid, changed_oid);
+		strbuf_addf(&sb, _("CONFLICT (%s/delete): %s deleted in %s and %s in %s."),
+			    change, path, delete_branch, change_past, change_branch);
+		 /*
+		  * FIXME: figure out if update_path's contents are binary;
+		  * buffer_is_binary() may help, though in the case of e.g.
+		  * add/add conflicts it'd be nice to avoid calling that
+		  * multiple times per buffer.
+		  */
+		is_binary = 0;
+		if (S_ISREG(changed_mode) && !is_binary) {
+			insert_non_textual_conflict_header(
+			    o,
+			    &sb,
+			    update_path,
+			    &new_oid,
+			    changed_mode);
+			ret = update_file_flags(o, &new_oid, changed_mode,
+						update_path,
+						/* update_cache */ 0,
+						/* update_wd */    1);
+		} else {
+			struct diff_filespec conflict_file;
+			char *conflict_path;
+			conflict_path = unique_path(o, update_path, "conflicts");
+
+			conflict_file.mode = S_IFREG | 0644;
+			create_non_textual_conflict_file(o, &sb, conflict_path,
+							 &conflict_file.oid);
+			ret = update_file_flags(o, &conflict_file.oid,
+						conflict_file.mode,
+						conflict_path,
+						/* update_cache */ 0,
+						/* update_wd */    1);
+			ret |= update_stages(o, conflict_path, &conflict_file,
+					     NULL, NULL);
+			free(conflict_path);
+		}
 	}
 	free(alt_path);
 
diff --git a/t/t3031-merge-criscross.sh b/t/t3031-merge-criscross.sh
index e59b0a32d6..5c39339809 100755
--- a/t/t3031-merge-criscross.sh
+++ b/t/t3031-merge-criscross.sh
@@ -75,6 +75,7 @@ test_expect_success 'setup repo with criss-cross history' '
 	git checkout B &&
 	test_must_fail git merge E &&
 	# force-resolve
+	echo 9 >data/new-9 &&
 	git add data &&
 	git commit -m F &&
 	git branch F &&
@@ -83,6 +84,7 @@ test_expect_success 'setup repo with criss-cross history' '
 	git checkout C &&
 	test_must_fail git merge D &&
 	# force-resolve
+	echo 9 >data/new-9 &&
 	git add data &&
 	git commit -m G &&
 	git branch G
diff --git a/t/t6022-merge-rename.sh b/t/t6022-merge-rename.sh
index b760c223c6..a065d79049 100755
--- a/t/t6022-merge-rename.sh
+++ b/t/t6022-merge-rename.sh
@@ -467,7 +467,7 @@ test_expect_success 'both rename source and destination involved in D/F conflict
 	test -f destdir/foo &&
 	test -f one &&
 	test -f destdir~HEAD &&
-	test "stuff" = "$(cat destdir~HEAD)"
+	grep "stuff" destdir~HEAD
 '
 
 test_expect_success 'setup pair rename to parent of other (D/F conflicts)' '
@@ -510,8 +510,8 @@ test_expect_success 'pair rename to parent of other (D/F conflicts) w/ untracked
 	test -d one &&
 	test -f one~rename-two &&
 	test -f two &&
-	test "other" = $(cat one~rename-two) &&
-	test "stuff" = $(cat two)
+	grep "other" one~rename-two &&
+	grep "stuff" two
 '
 
 test_expect_success 'pair rename to parent of other (D/F conflicts) w/ clean start' '
@@ -529,8 +529,8 @@ test_expect_success 'pair rename to parent of other (D/F conflicts) w/ clean sta
 
 	test -f one &&
 	test -f two &&
-	test "other" = $(cat one) &&
-	test "stuff" = $(cat two)
+	grep "other" one &&
+	grep "stuff" two
 '
 
 test_expect_success 'setup rename of one file to two, with directories in the way' '
@@ -704,35 +704,6 @@ test_expect_success 'avoid unnecessary update, dir->(file,nothing)' '
 	test_cmp expect actual # "df" should have stayed intact
 '
 
-test_expect_success 'setup avoid unnecessary update, modify/delete' '
-	git rm -rf . &&
-	git clean -fdqx &&
-	rm -rf .git &&
-	git init &&
-
-	>irrelevant &&
-	>file &&
-	git add -A &&
-	git commit -mA &&
-
-	git checkout -b side &&
-	git rm -f file &&
-	git commit -m "Delete file" &&
-
-	git checkout master &&
-	echo bla >file &&
-	git add -A &&
-	git commit -m "Modify file"
-'
-
-test_expect_success 'avoid unnecessary update, modify/delete' '
-	git checkout -q master^0 &&
-	test-tool chmtime --get =1000000000 file >expect &&
-	test_must_fail git merge side &&
-	test-tool chmtime --get file >actual &&
-	test_cmp expect actual # "file" should have stayed intact
-'
-
 test_expect_success 'setup avoid unnecessary update, rename/add-dest' '
 	git rm -rf . &&
 	git clean -fdqx &&
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 4a71f17edd..c5e1874df1 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1908,8 +1908,8 @@ test_expect_success '7e-check: transitive rename in rename/delete AND dirs in th
 			 A:x/d/f  A:y/d/g  O:z/b  O:z/c  O:x/d &&
 		test_cmp expect actual &&
 
-		git hash-object y/d~B^0 >actual &&
-		git rev-parse O:x/d >expect &&
+		tail -n +6 y/d~B^0 >actual &&
+		git cat-file -p O:x/d >expect &&
 		test_cmp expect actual
 	)
 '
diff --git a/t/t7610-mergetool.sh b/t/t7610-mergetool.sh
index 047156e9d5..10c0c903c6 100755
--- a/t/t7610-mergetool.sh
+++ b/t/t7610-mergetool.sh
@@ -374,6 +374,7 @@ test_expect_success 'deleted vs modified submodule' '
 	( yes "" | git mergetool both >/dev/null 2>&1 ) &&
 	( yes "d" | git mergetool file11 file12 >/dev/null 2>&1 ) &&
 	( yes "r" | git mergetool submod ) &&
+	( yes "d" | git mergetool submod~conflicts ) &&
 	rmdir submod && mv submod-movedaside submod &&
 	test "$(cat submod/bar)" = "branch1 submodule" &&
 	git submodule update -N &&
@@ -391,6 +392,7 @@ test_expect_success 'deleted vs modified submodule' '
 	( yes "" | git mergetool both >/dev/null 2>&1 ) &&
 	( yes "d" | git mergetool file11 file12 >/dev/null 2>&1 ) &&
 	( yes "l" | git mergetool submod ) &&
+	( yes "d" | git mergetool submod~conflicts ) &&
 	test ! -e submod &&
 	output="$(git mergetool --no-prompt)" &&
 	test "$output" = "No files need merging" &&
@@ -405,6 +407,7 @@ test_expect_success 'deleted vs modified submodule' '
 	( yes "" | git mergetool both >/dev/null 2>&1 ) &&
 	( yes "d" | git mergetool file11 file12 >/dev/null 2>&1 ) &&
 	( yes "r" | git mergetool submod ) &&
+	( yes "d" | git mergetool submod~conflicts ) &&
 	test ! -e submod &&
 	test -d submod.orig &&
 	git submodule update -N &&
@@ -421,6 +424,7 @@ test_expect_success 'deleted vs modified submodule' '
 	( yes "" | git mergetool both >/dev/null 2>&1 ) &&
 	( yes "d" | git mergetool file11 file12 >/dev/null 2>&1 ) &&
 	( yes "l" | git mergetool submod ) &&
+	( yes "d" | git mergetool submod~conflicts ) &&
 	test "$(cat submod/bar)" = "master submodule" &&
 	git submodule update -N &&
 	test "$(cat submod/bar)" = "master submodule" &&
-- 
2.18.0.550.g44d6daf40a.dirty


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve
  2018-08-06 22:45   ` [RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve Elijah Newren
@ 2018-08-09 17:36     ` Junio C Hamano
  2018-08-09 19:26       ` Elijah Newren
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2018-08-09 17:36 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Elijah Newren <newren@gmail.com> writes:

> File/directory conflicts are somewhat difficult to resolve; by way of
> contrast, renames are much easier to handle.  For file/directory
> conflicts, we already have to record the file under a different name in
> the working copy due to the directory being in the way; if we also record
> the file under a different name in the index then it simplifies matters
> for the user, and ensures that 'git reset --hard' and 'git merge --abort'
> will clean up files created by the merge operation.

Yeah, and then our file "path" renamed to "path~2" to make room for
directory "path" they introduced, can be relocated to its final
place in the merge resolution, e.g. "git mv path~2 path/ours" or
"git mv path path.theirs && git mv path~2 path".

Of course, "git rm" and "git mv" must work sensibly, if we want this
change to be truly helpful--but if not, they need to be fixed ;-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts
  2018-08-06 22:47 ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Elijah Newren
                     ` (2 preceding siblings ...)
  2018-08-06 22:47   ` [RFC/WIP PATCH 3/3] merge-recursive: provide more conflict hints for non-textual conflicts Elijah Newren
@ 2018-08-09 17:52   ` Junio C Hamano
  2018-08-09 18:51     ` Elijah Newren
  3 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2018-08-09 17:52 UTC (permalink / raw)
  To: Elijah Newren; +Cc: git

Elijah Newren <newren@gmail.com> writes:

> 1) Representative example: A modify/delete conflict; the path in question
> in the working tree would have conflict information at the top of the file
> followed by the normal file contents; thus it could be of the form:
>
>     <<<<<<<< HEAD
>     Conflict hint: This block of text was not part of the original
>     branch; it serves instead to hint about non-textual conflicts:
>       MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
>     ========
>     Conflict hint: This block of text was not part of the original
>     branch; it serves instead to hint about non-textual conflicts:
>       MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
>     >>>>>>>> BRANCH
>     Lorem ipsum dolor sit amet, consectetuer sadipscing elitr,
>     sed diam nonumy eirmod tempor invidunt ut labore et dolore
>     magna aliquyam erat, sed diam voluptua. At vero eos et
>     accusam et justo duo dolores et ea rebum. Stet clita kasd
>     gubergren, no sea takimata sanctus est Lorem ipsum dolor
>     sit amet.
>
> Alternative ideas for handling the explanatory text here are welcome.

In a modify/delete conflict, we currently do not leave any in-file
clue, so smudging the modified side like this might be a change that
helps those who "grep e '<<<<<<<'" to find the set of paths that
need to be examined.  I personally do not feel it would be all that
useful, as "ls-files -u" is how I'd learn about these paths.

What I would want to see when faced to a modify/delete conflict is
how the modification side changed the contents, as the change, or
its moral equivalent, would need to be ported to other locations in
the context of the deleting side.  But I am not sure if it makes
sense to attempt to somehow include "diff HEAD...MERGE_HEAD" (or the
other way around) in the file to show the contents change on the
modification side.

> 2) Representative example: A binary edit/edit conflict.  In this case,
> it would be inappropriate to put the conflict markers inside the
> binary file.  Instead, we create another file (e.g. path~CONFLICTS)
> and put conflict markers in it:
>
>     <<<<<<<< HEAD
>     Conflict hint: This block of text was not part of the original
>     branch; it serves instead to hint about non-textual conflicts:
>       BINARY conflict: path foo modified in both branches
>     ========
>     Conflict hint: This block of text was not part of the original
>     branch; it serves instead to hint about non-textual conflicts:
>       BINARY conflict: path foo modified in both branches
>     >>>>>>>> BRANCH
>
> This file would also be added to the index at stage 1 (so that 'git merge
> --abort' would clean this file out instead of leaving it around untracked,
> and also because 'git status' would report "deleted in both" which seems
> reasonable).
>
> This type of example could apply for each of the following types of
> conflicts:
>   * binary edit/edit
>   * any of the conflicts from type 1 when binary files are involved
>   * symlink edit/edit (or add/add)
>   * symlink/submodule
>   * symlink/directory
>   * directory/submodule
>   * submodule/submodule
>
> It could also apply to the following new corner case conflict types from
> directory rename detection:
>   * N-way colliding paths (N>=2) due to directory renames
>   * directory rename split; half renamed to one directory and half to another

Hmph, I am starting to wonder if it may be easier to access if
instead you did not touch any working tree file to do any of the
above, and instead write a single file in $GIT_DIR/ to explain what
kind of conflicts these paths are involved in.  That would probably
give a better and easier-to-read summary than "ls-files -u" output.

Or do we have _enough_ information in the "ls-files -u" already to
infer "Ah, we are in symlink edit/edit conflict.", etc.?  If so,
perhaps "git status" can be extended to show what kind of conflict
these paths are in by reading the higher-stage index entries (and
lack of stages thereof, when dealing with a conflict with deletion
involved)?

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts
  2018-08-09 17:52   ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Junio C Hamano
@ 2018-08-09 18:51     ` Elijah Newren
  0 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2018-08-09 18:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Thu, Aug 9, 2018 at 10:52 AM Junio C Hamano <gitster@pobox.com> wrote:
> Elijah Newren <newren@gmail.com> writes:
>
> > 1) Representative example: A modify/delete conflict; the path in question
> > in the working tree would have conflict information at the top of the file
> > followed by the normal file contents; thus it could be of the form:
> >
> >     <<<<<<<< HEAD
> >     Conflict hint: This block of text was not part of the original
> >     branch; it serves instead to hint about non-textual conflicts:
> >       MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
> >     ========
> >     Conflict hint: This block of text was not part of the original
> >     branch; it serves instead to hint about non-textual conflicts:
> >       MODIFY/DELETE: path foo modified in HEAD and deleted in BRANCH
> >     >>>>>>>> BRANCH
> >     Lorem ipsum dolor sit amet, consectetuer sadipscing elitr,
> >     sed diam nonumy eirmod tempor invidunt ut labore et dolore
> >     magna aliquyam erat, sed diam voluptua. At vero eos et
> >     accusam et justo duo dolores et ea rebum. Stet clita kasd
> >     gubergren, no sea takimata sanctus est Lorem ipsum dolor
> >     sit amet.
> >
> > Alternative ideas for handling the explanatory text here are welcome.
>
> In a modify/delete conflict, we currently do not leave any in-file
> clue, so smudging the modified side like this might be a change that
> helps those who "grep e '<<<<<<<'" to find the set of paths that
> need to be examined.

Yes, that's one of the things I'm hoping for with this change.

>  I personally do not feel it would be all that
> useful, as "ls-files -u" is how I'd learn about these paths.

"ls-files -u" is really nice; I love it and try to show it to others,
but keep getting surprised by how many people are surprised to learn
of its existence.

Further, "ls-files -u" may be insufficient even for those who know
about it, for two reasons (and admittedly, I care a lot more about the
second than the first):

* There are more conflict types than the number of permutations of
stages.  For example, how do you know if a file was from a
rename/delete or a modify/delete conflict?  And if from a
rename/delete, how do you determine the original filename?  The index
doesn't store this information.  Granted those two conflict types are
at least similar, but other stage combinations might be more
confusing.  For example, if all three stages are present, is the
conflict in question an edit/edit, a rename/add/delete, or a D/F
conflict where the file came from a rename?

* Future feature: Thomas Rast' proposed remerge-diff[1] (which I want
to resurrect and remove the edge/corner cases from).  This worked by
creating what I call an auto-merged tree, where you just accept all
conflicts and commit.  Then you diff the merge commit to what the
auto-merge tree was to see how the user edited conflicts to create the
merge commit.  Problem is, for the auto-merge tree we won't have an
index anymore so how do we represent conflicts that aren't edit/edit
of normal text files?  My proposal here is an answer for that; I'm
open to others, but it's the best I came up with.

[1] https://public-inbox.org/git/cover.1409860234.git.tr@thomasrast.ch/

(I'll also note here that others have requested the ability to create
an as-merged-as-possibly tree including conflicts for other purposes;
see "API suggestion" of
https://public-inbox.org/git/CAFAcib-2fxiVxtVWcbvafY3-Br7Y70HMiHFZoT0VfK6JU0Dp9A@mail.gmail.com/
and my response noting the various difficulties with non-textual
conflicts.  So it may not be just the remerge-diff ability here.)

> What I would want to see when faced to a modify/delete conflict is
> how the modification side changed the contents, as the change, or
> its moral equivalent, would need to be ported to other locations in
> the context of the deleting side.  But I am not sure if it makes
> sense to attempt to somehow include "diff HEAD...MERGE_HEAD" (or the
> other way around) in the file to show the contents change on the
> modification side.

That's a good point; I'm not sure how to include that either.
Something to think about.

> > 2) Representative example: A binary edit/edit conflict.  In this case,
> > it would be inappropriate to put the conflict markers inside the
> > binary file.  Instead, we create another file (e.g. path~CONFLICTS)
> > and put conflict markers in it:
> >
> >     <<<<<<<< HEAD
> >     Conflict hint: This block of text was not part of the original
> >     branch; it serves instead to hint about non-textual conflicts:
> >       BINARY conflict: path foo modified in both branches
> >     ========
> >     Conflict hint: This block of text was not part of the original
> >     branch; it serves instead to hint about non-textual conflicts:
> >       BINARY conflict: path foo modified in both branches
> >     >>>>>>>> BRANCH
> >
> > This file would also be added to the index at stage 1 (so that 'git merge
> > --abort' would clean this file out instead of leaving it around untracked,
> > and also because 'git status' would report "deleted in both" which seems
> > reasonable).
> >
> > This type of example could apply for each of the following types of
> > conflicts:
> >   * binary edit/edit
> >   * any of the conflicts from type 1 when binary files are involved
> >   * symlink edit/edit (or add/add)
> >   * symlink/submodule
> >   * symlink/directory
> >   * directory/submodule
> >   * submodule/submodule
> >
> > It could also apply to the following new corner case conflict types from
> > directory rename detection:
> >   * N-way colliding paths (N>=2) due to directory renames
> >   * directory rename split; half renamed to one directory and half to another
>
> Hmph, I am starting to wonder if it may be easier to access if
> instead you did not touch any working tree file to do any of the
> above, and instead write a single file in $GIT_DIR/ to explain what
> kind of conflicts these paths are involved in.  That would probably
> give a better and easier-to-read summary than "ls-files -u" output.

That's interesting, and would be an alternate way to help some users.
I think it's less discoverable than sticking the info in files in the
working tree for the user to see (particularly for the users who are
allergic to git commands and just stay in their IDE as much as
possible), but there might be something interesting here.

However, I don't see how this would cover the remerge-diff usecase (or
more general create-as-merged-as-possible-tree-with-conflicts-in-it
usecases) at all, which was my biggest motivation for these changes.
Is there a good solution there?

> Or do we have _enough_ information in the "ls-files -u" already to
> infer "Ah, we are in symlink edit/edit conflict.", etc.?  If so,
> perhaps "git status" can be extended to show what kind of conflict
> these paths are in by reading the higher-stage index entries (and
> lack of stages thereof, when dealing with a conflict with deletion
> involved)?

We have enough information in _some_ cases to determine which type of
conflict it is from the index (e.g. because we don't detect renames
for symlinks or submodules and thus have fewer conflict types possible
involving those).  However, we again run into problems (at least for
binaries) about there being more conflict types than permutations of
stage entries, and for all conflicts, we run into the same problem of
not having an index at all for the remerge-diff capability.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve
  2018-08-09 17:36     ` Junio C Hamano
@ 2018-08-09 19:26       ` Elijah Newren
  2018-08-09 20:54         ` Junio C Hamano
  0 siblings, 1 reply; 13+ messages in thread
From: Elijah Newren @ 2018-08-09 19:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Thu, Aug 9, 2018 at 10:36 AM Junio C Hamano <gitster@pobox.com> wrote:
> Elijah Newren <newren@gmail.com> writes:
>
> > File/directory conflicts are somewhat difficult to resolve; by way of
> > contrast, renames are much easier to handle.  For file/directory
> > conflicts, we already have to record the file under a different name in
> > the working copy due to the directory being in the way; if we also record
> > the file under a different name in the index then it simplifies matters
> > for the user, and ensures that 'git reset --hard' and 'git merge --abort'
> > will clean up files created by the merge operation.
>
> Yeah, and then our file "path" renamed to "path~2" to make room for
> directory "path" they introduced, can be relocated to its final
> place in the merge resolution, e.g. "git mv path~2 path/ours" or
> "git mv path path.theirs && git mv path~2 path".

Yes.  :-)

> Of course, "git rm" and "git mv" must work sensibly, if we want this
> change to be truly helpful--but if not, they need to be fixed ;-)

That actually brings up an interesting question.  Right now, if given
a path that appears in the index but at a stage greater than 0, git mv
will fail with "not under version control".  Obviously, that error
message is a lie in such a case, but what should it do?  Print a more
accurate error message ("Refusing to rename file until you remove
conflicts and git add it"?)  Or, what I'd prefer, rename the entry (or
entries) and keep the higher stage number(s) -- is there an unseen
danger with doing this?

(Alternatively, if there is only one entry with stage greater than 0
and it has no other conflicts, one could envision git mv doing the
rename and dropping to stage 0 at the same time, but that sounds a bit
dangerous to me.)

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve
  2018-08-09 19:26       ` Elijah Newren
@ 2018-08-09 20:54         ` Junio C Hamano
  2018-08-09 21:27           ` Elijah Newren
  0 siblings, 1 reply; 13+ messages in thread
From: Junio C Hamano @ 2018-08-09 20:54 UTC (permalink / raw)
  To: Elijah Newren; +Cc: Git Mailing List

Elijah Newren <newren@gmail.com> writes:

>> Of course, "git rm" and "git mv" must work sensibly, if we want this
>> change to be truly helpful--but if not, they need to be fixed ;-)
>
> That actually brings up an interesting question.  Right now, if given
> a path that appears in the index but at a stage greater than 0, git mv
> will fail with "not under version control".  Obviously, that error
> message is a lie in such a case, but what should it do?
>
> (Alternatively, if there is only one entry with stage greater than 0
> and it has no other conflicts, one could envision git mv doing the
> rename and dropping to stage 0 at the same time, but that sounds a bit
> dangerous to me.)

I do not particularly think it is "dangerous".  In fact, that sort
of behaviour was what I had in mind when I said "work sensibly".

When resolving a conflict that they added a new path at stage #3 to
remove that path, I can say "git rm $that_path", which removes all
stages of that path and make the index closer to the next commit.
Or I may decide to keep that path by "git add $that_path", which
adds that path at stage #0.  I think "git mv $that_path $over_there"
that drops that path at stage #3 to stage #0 of another path would
be in line with these two.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve
  2018-08-09 20:54         ` Junio C Hamano
@ 2018-08-09 21:27           ` Elijah Newren
  0 siblings, 0 replies; 13+ messages in thread
From: Elijah Newren @ 2018-08-09 21:27 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Thu, Aug 9, 2018 at 1:54 PM Junio C Hamano <gitster@pobox.com> wrote:
> Elijah Newren <newren@gmail.com> writes:
>
> >> Of course, "git rm" and "git mv" must work sensibly, if we want this
> >> change to be truly helpful--but if not, they need to be fixed ;-)
> >
> > That actually brings up an interesting question.  Right now, if given
> > a path that appears in the index but at a stage greater than 0, git mv
> > will fail with "not under version control".  Obviously, that error
> > message is a lie in such a case, but what should it do?
> >
> > (Alternatively, if there is only one entry with stage greater than 0
> > and it has no other conflicts, one could envision git mv doing the
> > rename and dropping to stage 0 at the same time, but that sounds a bit
> > dangerous to me.)
>
> I do not particularly think it is "dangerous".  In fact, that sort
> of behaviour was what I had in mind when I said "work sensibly".
>
> When resolving a conflict that they added a new path at stage #3 to
> remove that path, I can say "git rm $that_path", which removes all
> stages of that path and make the index closer to the next commit.
> Or I may decide to keep that path by "git add $that_path", which
> adds that path at stage #0.  I think "git mv $that_path $over_there"
> that drops that path at stage #3 to stage #0 of another path would
> be in line with these two.

This argument makes sense to me *IF* there's no possibility for
internal textual conflicts.  But if there are textual conflicts, I
don't see how it works.  So either I'm misunderstanding part of what
you're suggesting or you may have overlooked that case.

Let's say we did want to drop to stage #0 when the user runs git mv.
I'm assuming you agree that'd be bad to do if there were still
conflict markers left in that file (which can happen when the file of
a D/F conflict came from a renamed file that had edits on both sides
of history, for example).  That suggests we have to open and parse the
file and look for conflict markers before dropping to stage #0 and
only proceeding when none are found.  That feels a bit magic; this
auto-resolving-upon-mv seems to risk surprising someone to me.  In
particular, I'm imagining a scenario where someone edits some file
enough to remove conflict markers but isn't satisfied that everything
is semantically resolved yet, then runs git mv on the file, then
starts working on other files, and then tries to come back to the
original file only to discover that they can't find it in the list of
unmerged files because we marked it as resolved for them.

Am I missing something here?

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2018-08-09 21:27 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-08-06 22:44 Two RFC/WIP series for facilitating merge conflict resolution Elijah Newren
2018-08-06 22:45 ` [RFC/WIP PATCH 0/1] Simplify handling of directory/file conflicts Elijah Newren
2018-08-06 22:45   ` [RFC/WIP PATCH 1/1] merge-recursive: make file/directory conflicts easier to resolve Elijah Newren
2018-08-09 17:36     ` Junio C Hamano
2018-08-09 19:26       ` Elijah Newren
2018-08-09 20:54         ` Junio C Hamano
2018-08-09 21:27           ` Elijah Newren
2018-08-06 22:47 ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Elijah Newren
2018-08-06 22:47   ` [RFC/WIP PATCH 1/3] rerere: avoid buffer overrun Elijah Newren
2018-08-06 22:47   ` [RFC/WIP PATCH 2/3] merge-recursive: fix handling of submodules in modify/delete conflicts Elijah Newren
2018-08-06 22:47   ` [RFC/WIP PATCH 3/3] merge-recursive: provide more conflict hints for non-textual conflicts Elijah Newren
2018-08-09 17:52   ` [RFC/WIP PATCH 0/3] Modifications to handling of non-textual file merge conflicts Junio C Hamano
2018-08-09 18:51     ` Elijah Newren

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).