git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v7 00/31] Add directory rename detection to git
@ 2018-01-30 23:25 Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 01/31] directory rename detection: basic testcases Elijah Newren
                   ` (31 more replies)
  0 siblings, 32 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset=bogus too, Size: 26195 bytes --]

This patchset introduces directory rename detection to merge-recursive.  See
  https://public-inbox.org/git/20171110190550.27059-1-newren@gmail.com/
for the first series (including design considerations, etc.), and follow-up
series can be found at
  https://public-inbox.org/git/20171120220209.15111-1-newren@gmail.com/
  https://public-inbox.org/git/20171121080059.32304-1-newren@gmail.com/
  https://public-inbox.org/git/20171129014237.32570-1-newren@gmail.com/
  https://public-inbox.org/git/20171228041352.27880-1-newren@gmail.com/
  https://public-inbox.org/git/20180105202711.24311-1-newren@gmail.com/

Changes since v6 (full tbdiff follows below):
  * Fix missing file argument in a testcase, pointed out by SZEDER Gábor.
  * Add whitespace in some testcases in separate git-rev-parse invocations
    to make it easier to visually see what is being compared.
  * Rebased on latest origin/master (not that there were any conflicts)

Note: This series continues to depend upon en/merge-recursive-fixes, at
  least contextually.

Elijah Newren (31):
  directory rename detection: basic testcases
  directory rename detection: directory splitting testcases
  directory rename detection: testcases to avoid taking detection too
    far
  directory rename detection: partially renamed directory
    testcase/discussion
  directory rename detection: files/directories in the way of some
    renames
  directory rename detection: testcases checking which side did the
    rename
  directory rename detection: more involved edge/corner testcases
  directory rename detection: testcases exploring possibly suboptimal
    merges
  directory rename detection: miscellaneous testcases to complete
    coverage
  directory rename detection: tests for handling overwriting untracked
    files
  directory rename detection: tests for handling overwriting dirty files
  merge-recursive: move the get_renames() function
  merge-recursive: introduce new functions to handle rename logic
  merge-recursive: fix leaks of allocated renames and diff_filepairs
  merge-recursive: make !o->detect_rename codepath more obvious
  merge-recursive: split out code for determining diff_filepairs
  merge-recursive: add a new hashmap for storing directory renames
  merge-recursive: make a helper function for cleanup for handle_renames
  merge-recursive: add get_directory_renames()
  merge-recursive: check for directory level conflicts
  merge-recursive: add a new hashmap for storing file collisions
  merge-recursive: add computation of collisions due to dir rename &
    merging
  merge-recursive: check for file level conflicts then get new name
  merge-recursive: when comparing files, don't include trees
  merge-recursive: apply necessary modifications for directory renames
  merge-recursive: avoid clobbering untracked files with directory
    renames
  merge-recursive: fix overwriting dirty files involved in renames
  merge-recursive: fix remaining directory rename + dirty overwrite
    cases
  directory rename detection: new testcases showcasing a pair of bugs
  merge-recursive: avoid spurious rename/rename conflict from dir
    renames
  merge-recursive: ensure we write updates for directory-renamed file

 merge-recursive.c                   | 1212 ++++++++++-
 merge-recursive.h                   |   17 +
 strbuf.c                            |   16 +
 strbuf.h                            |   16 +
 t/t3501-revert-cherry-pick.sh       |    2 +-
 t/t6043-merge-rename-directories.sh | 3990 +++++++++++++++++++++++++++++++++++
 t/t7607-merge-overwrite.sh          |    2 +-
 unpack-trees.c                      |    4 +-
 unpack-trees.h                      |    4 +
 9 files changed, 5148 insertions(+), 115 deletions(-)
 create mode 100755 t/t6043-merge-rename-directories.sh

 1: 2fd0812b7a !  1: 5ba69c9c7b directory rename detection: basic testcases
    @@ -94,7 +94,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/d B:z/e/f &&
    ++			O:z/b    O:z/c    B:z/d    B:z/e/f &&
     +		test_cmp expect actual &&
     +
     +		git hash-object y/d >actual &&
    @@ -161,7 +161,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:y/d A:z/e &&
    ++			O:z/b    O:z/c    O:y/d    A:z/e &&
     +		test_cmp expect actual &&
     +		test_must_fail git rev-parse HEAD:z/e
     +	)
    @@ -219,7 +219,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:x/d &&
    ++			O:z/b    O:z/c    O:x/d &&
     +		test_cmp expect actual &&
     +		test_must_fail git rev-parse HEAD:x/d &&
     +		test_must_fail git rev-parse HEAD:z/d &&
    @@ -293,14 +293,14 @@
     +		git rev-parse >actual \
     +			:0:x/b :0:x/c :0:x/d :0:x/e :0:x/m :0:x/n &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:y/d O:y/e A:y/m B:z/n &&
    ++			 O:z/b  O:z/c  O:y/d  O:y/e  A:y/m  B:z/n &&
     +		test_cmp expect actual &&
     +
     +		test_must_fail git rev-parse :0:x/wham &&
     +		git rev-parse >actual \
     +			:2:x/wham :3:x/wham &&
     +		git rev-parse >expect \
    -+			A:y/wham B:z/wham &&
    ++			 A:y/wham  B:z/wham &&
     +		test_cmp expect actual &&
     +
     +		test_path_is_missing x/wham &&
    @@ -310,7 +310,7 @@
     +		git hash-object >actual \
     +			x/wham~HEAD x/wham~B^0 &&
     +		git rev-parse >expect \
    -+			A:y/wham B:z/wham &&
    ++			A:y/wham    B:z/wham &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -366,7 +366,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/newb HEAD:y/newc HEAD:y/d &&
     +		git rev-parse >expect \
    -+			O:z/oldb O:z/oldc B:z/d &&
    ++			O:z/oldb    O:z/oldc    B:z/d &&
     +		test_cmp expect actual &&
     +		test_must_fail git rev-parse HEAD:z/d
     +	)
    @@ -432,7 +432,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:x/d HEAD:x/e HEAD:x/f HEAD:x/g &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:z/d O:z/e O:z/f A:z/g &&
    ++			O:z/b    O:z/c    O:z/d    O:z/e    O:z/f    A:z/g &&
     +		test_cmp expect actual &&
     +		test_path_is_missing z/g &&
     +		test_must_fail git rev-parse HEAD:z/g
 2: 5c697afd4b !  2: e1d23f7f95 directory rename detection: directory splitting testcases
    @@ -79,7 +79,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:w/c :0:z/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/d &&
    ++			 O:z/b  O:z/c  B:z/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -140,7 +140,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:w/c :0:x/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:x/d &&
    ++			 O:z/b  O:z/c  B:x/d &&
     +		test_cmp expect actual &&
     +		test_i18ngrep ! "CONFLICT.*directory rename split" out
     +	)
 3: 2b9346df1b !  3: b10cb49cf9 directory rename detection: testcases to avoid taking detection too far
    @@ -74,7 +74,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:x/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:z/d &&
    ++			O:z/b    O:z/c    O:z/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -133,7 +133,7 @@
     +
     +		test_must_fail git merge -s recursive B^0 >out &&
     +		test_i18ngrep CONFLICT.*rename/rename.*z/d.*x/d.*w/d out &&
    -+		test_i18ngrep ! CONFLICT.*rename/rename.*y/d &&
    ++		test_i18ngrep ! CONFLICT.*rename/rename.*y/d out &&
     +
     +		git ls-files -s >out &&
     +		test_line_count = 5 out &&
    @@ -145,12 +145,12 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :1:z/d :2:x/d :3:w/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:z/d O:z/d O:z/d &&
    ++			 O:z/b  O:z/c  O:z/d  O:z/d  O:z/d &&
     +		test_cmp expect actual &&
     +
     +		test_path_is_missing z/d &&
     +		git hash-object >actual \
    -+			x/d w/d &&
    ++			x/d   w/d &&
     +		git rev-parse >expect \
     +			O:z/d O:z/d &&
     +		test_cmp expect actual
 4: 791201822d !  4: ec3ccf0a95 directory rename detection: partially renamed directory testcase/discussion
    @@ -103,7 +103,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:z/e HEAD:z/f &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:z/d O:z/e B:z/f &&
    ++			O:z/b    O:z/c    O:z/d    O:z/e    B:z/f &&
     +		test_cmp expect actual
     +	)
     +'
 5: 6fc6028a78 !  5: da018f1adb directory rename detection: files/directories in the way of some renames
    @@ -86,7 +86,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :0:y/d :0:y/e :0:z/e :0:y/f &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:y/d A:y/e A:z/e A:z/f &&
    ++			 O:z/b  O:z/c  O:y/d  A:y/e  A:z/e  A:z/f &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -160,7 +160,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :0:y/e :2:y/d :3:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/e A:y/d B:y/d &&
    ++			 O:z/b  O:z/c  B:z/e  A:y/d  B:y/d &&
     +		test_cmp expect actual &&
     +
     +		test_must_fail git rev-parse :1:y/d &&
    @@ -241,20 +241,20 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :0:y/e &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/e &&
    ++			 O:z/b  O:z/c  B:z/e &&
     +		test_cmp expect actual &&
     +
     +		test_must_fail git rev-parse :1:y/d &&
     +		git rev-parse >actual \
     +			:2:w/d :3:w/d :1:x/d :2:y/d :3:y/d :3:z/d &&
     +		git rev-parse >expect \
    -+			O:x/d B:w/d O:x/d A:y/d B:y/d O:x/d &&
    ++			 O:x/d  B:w/d  O:x/d  A:y/d  B:y/d  O:x/d &&
     +		test_cmp expect actual &&
     +
     +		git hash-object >actual \
     +			w/d~HEAD w/d~B^0 z/d &&
     +		git rev-parse >expect \
    -+			O:x/d B:w/d O:x/d &&
    ++			O:x/d    B:w/d   O:x/d &&
     +		test_cmp expect actual &&
     +		test_path_is_missing x/d &&
     +		test_path_is_file y/d &&
    @@ -324,7 +324,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :0:z/d :0:y/f :2:y/d :0:y/d/e &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/d B:z/f A:y/d B:y/d/e &&
    ++			 O:z/b  O:z/c  B:z/d  B:z/f  A:y/d  B:y/d/e &&
     +		test_cmp expect actual &&
     +
     +		git hash-object y/d~HEAD >actual &&
 6: 317d54cfc0 !  6: f7ca54e7f2 directory rename detection: testcases checking which side did the rename
    @@ -84,7 +84,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :3:y/c &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c &&
    ++			 O:z/b  O:z/c &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -148,7 +148,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:z/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/d &&
    ++			O:z/b    O:z/c    B:z/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -210,7 +210,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:z/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/d &&
    ++			O:z/b    O:z/c    B:z/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -272,7 +272,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:z/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:x/d &&
    ++			O:z/b    O:z/c    O:x/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -335,7 +335,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:z/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:y/d B:z/d &&
    ++			O:z/b    O:z/c    B:y/d    B:z/d &&
     +		test_cmp expect actual
     +	)
     +'
 7: afb942c5d7 !  7: 8ae28f45fe directory rename detection: more involved edge/corner testcases
    @@ -97,11 +97,11 @@
     +		git rev-parse >actual \
     +			:1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:x/c :0:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/b O:z/b O:z/c O:z/c O:z/c B:z/d &&
    ++			 O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
     +		test_cmp expect actual &&
     +
     +		git hash-object >actual \
    -+			y/b w/b y/c x/c &&
    ++			y/b   w/b   y/c   x/c &&
     +		git rev-parse >expect \
     +			O:z/b O:z/b O:z/c O:z/c &&
     +		test_cmp expect actual
    @@ -168,7 +168,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :2:y/d :3:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:w/d O:x/d &&
    ++			 O:z/b  O:z/c  O:w/d  O:x/d &&
     +		test_cmp expect actual &&
     +
     +		test_path_is_missing y/d &&
    @@ -178,7 +178,7 @@
     +		git hash-object >actual \
     +			y/d~HEAD y/d~B^0 &&
     +		git rev-parse >expect \
    -+			O:w/d O:x/d &&
    ++			O:w/d    O:x/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -244,7 +244,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :1:x/d :2:w/d :3:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:x/d O:x/d O:x/d &&
    ++			 O:z/b  O:z/c  O:x/d  O:x/d  O:x/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -308,7 +308,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :3:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:x/d &&
    ++			 O:z/b  O:z/c  O:x/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -398,7 +398,7 @@
     +		git rev-parse >actual \
     +			:0:x/d/f :0:y/d/g :0:y/b :0:y/c :3:y/d &&
     +		git rev-parse >expect \
    -+			A:x/d/f A:y/d/g O:z/b O:z/c O:x/d &&
    ++			 A:x/d/f  A:y/d/g  O:z/b  O:z/c  O:x/d &&
     +		test_cmp expect actual &&
     +
     +		git hash-object y/d~B^0 >actual &&
 8: 157559ceac !  8: 8d05b8dc10 directory rename detection: testcases exploring possibly suboptimal merges
    @@ -92,7 +92,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/a HEAD:y/b HEAD:y/e HEAD:y/f HEAD:z/c HEAD:z/d &&
     +		git rev-parse >expect \
    -+			O:x/a O:x/b A:x/e A:y/f O:y/c O:y/d &&
    ++			O:x/a    O:x/b    A:x/e    A:y/f    O:y/c    O:y/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -170,7 +170,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/a HEAD:y/b HEAD:z/a HEAD:z/b HEAD:x/e HEAD:y/e &&
     +		git rev-parse >expect \
    -+			O:x/a O:x/b O:y/a O:y/b A:x/e A:y/e &&
    ++			O:x/a    O:x/b    O:y/a    O:y/b    A:x/e    A:y/e &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -238,7 +238,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :0:y/c :0:y/e :3:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/e B:z/d &&
    ++			 O:z/b  O:z/c  B:z/e  B:z/d &&
     +		test_cmp expect actual &&
     +
     +		test_must_fail git rev-parse :1:y/d &&
    @@ -320,7 +320,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/e &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/e &&
    ++			O:z/b    O:z/c    B:z/e &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -401,11 +401,11 @@
     +		git rev-parse >actual \
     +			:1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:w/c :0:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/b O:z/b O:z/c O:z/c O:z/c B:z/d &&
    ++			 O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
     +		test_cmp expect actual &&
     +
     +		git hash-object >actual \
    -+			y/b w/b y/c w/c &&
    ++			y/b   w/b   y/c   w/c &&
     +		git rev-parse >expect \
     +			O:z/b O:z/b O:z/c O:z/c &&
     +		test_cmp expect actual &&
 9: e9f563a34e !  9: 47ffccc86d directory rename detection: miscellaneous testcases to complete coverage
    @@ -105,13 +105,13 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/i &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/i &&
    ++			O:z/b    O:z/c    B:z/i &&
     +		test_cmp expect actual &&
     +
     +		git rev-parse >actual \
     +			HEAD:x/w/e HEAD:x/w/f HEAD:x/w/g HEAD:x/w/h &&
     +		git rev-parse >expect \
    -+			O:z/d/e O:z/d/f O:z/d/g B:z/d/h &&
    ++			O:z/d/e    O:z/d/f    O:z/d/g    B:z/d/h &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -174,7 +174,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c :0:expected &&
    ++			O:z/b    O:z/c    :0:expected &&
     +		test_cmp expect actual &&
     +		test_must_fail git rev-parse HEAD:x/d &&
     +		test_must_fail git rev-parse HEAD:z/d &&
    @@ -264,7 +264,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e HEAD:x/f HEAD:x/g &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c O:x/d O:x/e O:w/f A:x/g &&
    ++			O:z/b    O:z/c    O:x/d    O:x/e    O:w/f    A:x/g &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -353,9 +353,13 @@
     +		test_line_count = 1 out &&
     +
     +		git rev-parse >actual \
    -+			HEAD:z/t HEAD:y/a HEAD:x/b HEAD:w/c HEAD:u/d HEAD:u/e HEAD:u/f &&
    -+		git rev-parse >expect \
    -+			B:z/t O:z/a O:y/b O:x/c O:w/d O:v/e A:u/f &&
    ++			HEAD:z/t \
    ++			HEAD:y/a HEAD:x/b HEAD:w/c \
    ++			HEAD:u/d HEAD:u/e HEAD:u/f &&
    ++		git rev-parse >expect \
    ++			B:z/t    \
    ++			O:z/a    O:y/b    O:x/c    \
    ++			O:w/d    O:v/e    A:u/f &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -439,16 +443,16 @@
     +			:0:combined/g :0:combined/h :0:combined/i \
     +			:0:combined/j :0:combined/k :0:combined/l &&
     +		git rev-parse >expect \
    -+			O:dir1/a O:dir1/b A:dir1/c \
    -+			O:dir2/d O:dir2/e A:dir2/f \
    -+			O:dir3/g O:dir3/h A:dir3/i \
    -+			O:dirN/j O:dirN/k A:dirN/l &&
    ++			 O:dir1/a      O:dir1/b      A:dir1/c \
    ++			 O:dir2/d      O:dir2/e      A:dir2/f \
    ++			 O:dir3/g      O:dir3/h      A:dir3/i \
    ++			 O:dirN/j      O:dirN/k      A:dirN/l &&
     +		test_cmp expect actual &&
     +
     +		git rev-parse >actual \
     +			:0:dir1/yo :0:dir2/yo :0:dir3/yo :0:dirN/yo &&
     +		git rev-parse >expect \
    -+			A:dir1/yo A:dir2/yo A:dir3/yo A:dirN/yo &&
    ++			 A:dir1/yo  A:dir2/yo  A:dir3/yo  A:dirN/yo &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -503,9 +507,15 @@
     +		test_line_count = 4 out &&
     +
     +		git rev-parse >actual \
    -+			HEAD:priority/a/foo HEAD:priority/b/bar HEAD:priority/b/baz HEAD:priority/c &&
    -+		git rev-parse >expect \
    -+			O:goal/a/foo O:goal/b/bar O:goal/b/baz B:goal/c &&
    ++			HEAD:priority/a/foo \
    ++			HEAD:priority/b/bar \
    ++			HEAD:priority/b/baz \
    ++			HEAD:priority/c &&
    ++		git rev-parse >expect \
    ++			O:goal/a/foo \
    ++			O:goal/b/bar \
    ++			O:goal/b/baz \
    ++			B:goal/c &&
     +		test_cmp expect actual &&
     +		test_must_fail git rev-parse HEAD:goal/c
     +	)
    @@ -564,9 +574,15 @@
     +		test_line_count = 4 out &&
     +
     +		git rev-parse >actual \
    -+			HEAD:priority/alpha/foo HEAD:priority/beta/bar HEAD:priority/beta/baz HEAD:priority/c &&
    -+		git rev-parse >expect \
    -+			O:goal/a/foo O:goal/b/bar O:goal/b/baz B:goal/c &&
    ++			HEAD:priority/alpha/foo \
    ++			HEAD:priority/beta/bar  \
    ++			HEAD:priority/beta/baz  \
    ++			HEAD:priority/c &&
    ++		git rev-parse >expect \
    ++			O:goal/a/foo \
    ++			O:goal/b/bar \
    ++			O:goal/b/baz \
    ++			B:goal/c &&
     +		test_cmp expect actual &&
     +		test_must_fail git rev-parse HEAD:goal/c
     +	)
10: 0c6630551e ! 10: db7d9850c2 directory rename detection: tests for handling overwriting untracked files
    @@ -147,7 +147,7 @@
     +		git rev-parse >actual \
     +			:0:y/b :3:y/d :3:y/e &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c B:z/e &&
    ++			O:z/b  O:z/c  B:z/e &&
     +		test_cmp expect actual &&
     +
     +		echo very >expect &&
    @@ -223,7 +223,7 @@
     +		git rev-parse >actual \
     +			:0:y/a :0:y/b :0:x/d :1:x/c :2:w/c :3:y/c &&
     +		git rev-parse >expect \
    -+			O:z/a O:z/b O:x/d O:x/c O:x/c O:x/c &&
    ++			 O:z/a  O:z/b  O:x/d  O:x/c  O:x/c  O:x/c &&
     +		test_cmp expect actual &&
     +
     +		git hash-object y/c~B^0 >actual &&
    @@ -298,7 +298,7 @@
     +		git rev-parse >actual \
     +			:0:y/a :0:y/b :0:y/d :0:y/e :2:y/wham :3:y/wham &&
     +		git rev-parse >expect \
    -+			O:z/a O:z/b O:x/d O:x/e O:z/c O:x/f &&
    ++			 O:z/a  O:z/b  O:x/d  O:x/e  O:z/c     O:x/f &&
     +		test_cmp expect actual &&
     +
     +		test_must_fail git rev-parse :1:y/wham &&
    @@ -309,7 +309,7 @@
     +		git hash-object >actual \
     +			y/wham~B^0 y/wham~HEAD &&
     +		git rev-parse >expect \
    -+			O:x/f O:z/c &&
    ++			O:x/f      O:z/c &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -370,7 +370,7 @@
     +		git rev-parse >actual \
     +			:0:y/a :0:y/b :0:y/c &&
     +		git rev-parse >expect \
    -+			O:z/a O:z/b B:z/c &&
    ++			 O:z/a  O:z/b  B:z/c &&
     +		test_cmp expect actual &&
     +
     +		echo random >expect &&
11: 16cd375e15 ! 11: 0de0a9dfa0 directory rename detection: tests for handling overwriting dirty files
    @@ -87,7 +87,7 @@
     +		git rev-parse >actual \
     +			:0:z/a :2:z/c &&
     +		git rev-parse >expect \
    -+			O:z/a B:z/b &&
    ++			 O:z/a  B:z/b &&
     +		test_cmp expect actual &&
     +
     +		git hash-object z/c~HEAD >actual &&
    @@ -162,7 +162,7 @@
     +		git rev-parse >actual \
     +			:0:x/b :0:y/a :0:y/c &&
     +		git rev-parse >expect \
    -+			O:x/b O:z/a B:x/c &&
    ++			 O:x/b  O:z/a  B:x/c &&
     +		test_cmp expect actual &&
     +
     +		git hash-object y/c >actual &&
    @@ -302,7 +302,7 @@
     +		git rev-parse >actual \
     +			:0:x/b :0:y/a :0:y/c/d :3:y/c &&
     +		git rev-parse >expect \
    -+			O:x/b O:z/a B:y/c/d B:x/c &&
    ++			 O:x/b  O:z/a  B:y/c/d  B:x/c &&
     +		test_cmp expect actual &&
     +
     +		git hash-object y/c~HEAD >actual &&
    @@ -381,13 +381,13 @@
     +		git rev-parse >actual \
     +			:0:y/a :0:y/b :0:x/d :1:x/c :2:w/c :2:y/c :3:y/c &&
     +		git rev-parse >expect \
    -+			O:z/a O:z/b O:x/d O:x/c O:x/c A:y/c O:x/c &&
    ++			 O:z/a  O:z/b  O:x/d  O:x/c  O:x/c  A:y/c  O:x/c &&
     +		test_cmp expect actual &&
     +
     +		git hash-object >actual \
     +			y/c~B^0 y/c~HEAD &&
     +		git rev-parse >expect \
    -+			O:x/c A:y/c &&
    ++			O:x/c   A:y/c &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -458,13 +458,13 @@
     +		git hash-object >actual \
     +			y/wham~B^0 y/wham~HEAD &&
     +		git rev-parse >expect \
    -+			O:x/d O:x/c &&
    ++			O:x/d      O:x/c &&
     +		test_cmp expect actual &&
     +
     +		git rev-parse >actual \
     +			:0:y/a :0:y/b :2:y/wham :3:y/wham &&
     +		git rev-parse >expect \
    -+			O:z/a O:z/b O:x/c O:x/d &&
    ++			 O:z/a  O:z/b  O:x/c     O:x/d &&
     +		test_cmp expect actual
     +	)
     +'
12: b7f4530bcd = 12: 9a6777f577 merge-recursive: move the get_renames() function
13: 45c39bb872 = 13: ac6a95c7b8 merge-recursive: introduce new functions to handle rename logic
14: 21b30fe2da = 14: 76b09d49cd merge-recursive: fix leaks of allocated renames and diff_filepairs
15: bad89d2178 = 15: e4189f3da2 merge-recursive: make !o->detect_rename codepath more obvious
16: 67eec195a4 = 16: 6bc800e369 merge-recursive: split out code for determining diff_filepairs
17: 288e3987a7 = 17: 0757c92ca1 merge-recursive: add a new hashmap for storing directory renames
18: c2d58b3a29 = 18: f17343fc2c merge-recursive: make a helper function for cleanup for handle_renames
19: 1dcf482278 = 19: 9b63e257c8 merge-recursive: add get_directory_renames()
20: 968effeaf3 = 20: 6730d8e7b7 merge-recursive: check for directory level conflicts
21: 865b98f6b4 = 21: 178ec9e079 merge-recursive: add a new hashmap for storing file collisions
22: f74e41c564 = 22: 1f3ff65e82 merge-recursive: add computation of collisions due to dir rename & merging
23: 6678bf2ca4 = 23: d28651aeb0 merge-recursive: check for file level conflicts then get new name
24: 8dd2a045c8 = 24: d6f3d47304 merge-recursive: when comparing files, don't include trees
25: 30341f6551 = 25: f91509f9df merge-recursive: apply necessary modifications for directory renames
26: 498cb775f7 = 26: 9d903c98de merge-recursive: avoid clobbering untracked files with directory renames
27: 81d5073abc = 27: 2ab61d26a3 merge-recursive: fix overwriting dirty files involved in renames
28: 3a2e99fe75 = 28: d510c260b7 merge-recursive: fix remaining directory rename + dirty overwrite cases
29: e93a1b89e3 ! 29: b59a612e68 directory rename detection: new testcases showcasing a pair of bugs
    @@ -86,7 +86,7 @@
     +		git rev-parse >actual \
     +			HEAD:y/b HEAD:y/c HEAD:x/d &&
     +		git rev-parse >expect \
    -+			O:z/b O:z/c A:z/d &&
    ++			O:z/b    O:z/c    A:z/d &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -156,9 +156,15 @@
     +		test_line_count = 6 out &&
     +
     +		git rev-parse >actual \
    -+			HEAD:node1/leaf1 HEAD:node1/leaf2 HEAD:node1/leaf5 HEAD:node1/node2/leaf3 HEAD:node1/node2/leaf4 HEAD:node1/node2/leaf6 &&
    ++			HEAD:node1/leaf1 HEAD:node1/leaf2 HEAD:node1/leaf5 \
    ++			HEAD:node1/node2/leaf3 \
    ++			HEAD:node1/node2/leaf4 \
    ++			HEAD:node1/node2/leaf6 &&
     +		git rev-parse >expect \
    -+			O:node1/leaf1 O:node1/leaf2 B:node1/leaf5 O:node2/leaf3 O:node2/leaf4 B:node2/leaf6 &&
    ++			O:node1/leaf1    O:node1/leaf2    B:node1/leaf5 \
    ++			O:node2/leaf3 \
    ++			O:node2/leaf4 \
    ++			B:node2/leaf6 &&
     +		test_cmp expect actual
     +	)
     +'
    @@ -231,8 +237,10 @@
     +			HEAD:node2/node1/node2/leaf3 \
     +			HEAD:node2/node1/node2/leaf4 &&
     +		git rev-parse >expect \
    -+			O:node1/leaf1 O:node1/leaf2 \
    -+			O:node2/leaf3 O:node2/leaf4 &&
    ++			O:node1/leaf1 \
    ++			O:node1/leaf2 \
    ++			O:node2/leaf3 \
    ++			O:node2/leaf4 &&
     +		test_cmp expect actual
     +	)
     +'
30: d77d0d85b0 ! 30: d20b759b63 merge-recursive: avoid spurious rename/rename conflict from dir renames
    @@ -98,8 +98,8 @@
     -			:0:y/b :0:y/c :0:y/e :3:y/d &&
     +			:0:y/b :0:y/c :0:y/e :1:z/d :3:z/d &&
      		git rev-parse >expect \
    --			O:z/b O:z/c B:z/e B:z/d &&
    -+			O:z/b O:z/c B:z/e O:z/d B:z/d &&
    +-			 O:z/b  O:z/c  B:z/e  B:z/d &&
    ++			 O:z/b  O:z/c  B:z/e  O:z/d  B:z/d &&
      		test_cmp expect actual &&
      
     -		test_must_fail git rev-parse :1:y/d &&
31: c7ed5a2134 = 31: f69932adfe merge-recursive: ensure we write updates for directory-renamed file

-- 
2.16.1.106.gf69932adfe

^ permalink raw reply	[flat|nested] 63+ messages in thread

* [PATCH v7 01/31] directory rename detection: basic testcases
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 02/31] directory rename detection: directory splitting testcases Elijah Newren
                   ` (30 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 442 ++++++++++++++++++++++++++++++++++++
 1 file changed, 442 insertions(+)
 create mode 100755 t/t6043-merge-rename-directories.sh

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
new file mode 100755
index 0000000000..d045f0e31e
--- /dev/null
+++ b/t/t6043-merge-rename-directories.sh
@@ -0,0 +1,442 @@
+#!/bin/sh
+
+test_description="recursive merge with directory renames"
+# includes checking of many corner cases, with a similar methodology to:
+#   t6042: corner cases with renames but not criss-cross merges
+#   t6036: corner cases with both renames and criss-cross merges
+#
+# The setup for all of them, pictorially, is:
+#
+#      A
+#      o
+#     / \
+#  O o   ?
+#     \ /
+#      o
+#      B
+#
+# To help make it easier to follow the flow of tests, they have been
+# divided into sections and each test will start with a quick explanation
+# of what commits O, A, and B contain.
+#
+# Notation:
+#    z/{b,c}   means  files z/b and z/c both exist
+#    x/d_1     means  file x/d exists with content d1.  (Purpose of the
+#                     underscore notation is to differentiate different
+#                     files that might be renamed into each other's paths.)
+
+. ./test-lib.sh
+
+
+###########################################################################
+# SECTION 1: Basic cases we should be able to handle
+###########################################################################
+
+# Testcase 1a, Basic directory rename.
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d,e/f}
+#   Expected: y/{b,c,d,e/f}
+
+test_expect_success '1a-setup: Simple directory rename detection' '
+	test_create_repo 1a &&
+	(
+		cd 1a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo d >z/d &&
+		mkdir z/e &&
+		echo f >z/e/f &&
+		git add z/d z/e/f &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1a-check: Simple directory rename detection' '
+	(
+		cd 1a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e/f &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/d    B:z/e/f &&
+		test_cmp expect actual &&
+
+		git hash-object y/d >actual &&
+		git rev-parse B:z/d >expect &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse HEAD:z/d &&
+		test_must_fail git rev-parse HEAD:z/e/f &&
+		test_path_is_missing z/d &&
+		test_path_is_missing z/e/f
+	)
+'
+
+# Testcase 1b, Merge a directory with another
+#   Commit O: z/{b,c},   y/d
+#   Commit A: z/{b,c,e}, y/d
+#   Commit B: y/{b,c,d}
+#   Expected: y/{b,c,d,e}
+
+test_expect_success '1b-setup: Merge a directory with another' '
+	test_create_repo 1b &&
+	(
+		cd 1b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir y &&
+		echo d >y/d &&
+		git add z y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo e >z/e &&
+		git add z/e &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z/b y &&
+		git mv z/c y &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1b-check: Merge a directory with another' '
+	(
+		cd 1b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:y/d    A:z/e &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:z/e
+	)
+'
+
+# Testcase 1c, Transitive renaming
+#   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
+#   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   Commit O: z/{b,c},   x/d
+#   Commit A: y/{b,c},   x/d
+#   Commit B: z/{b,c,d}
+#   Expected: y/{b,c,d}  (because x/d -> z/d -> y/d)
+
+test_expect_success '1c-setup: Transitive renaming' '
+	test_create_repo 1c &&
+	(
+		cd 1c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1c-check: Transitive renaming' '
+	(
+		cd 1c &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:x/d &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:x/d &&
+		test_must_fail git rev-parse HEAD:z/d &&
+		test_path_is_missing z/d
+	)
+'
+
+# Testcase 1d, Directory renames (merging two directories into one new one)
+#              cause a rename/rename(2to1) conflict
+#   (Related to testcases 1c and 7b)
+#   Commit O. z/{b,c},        y/{d,e}
+#   Commit A. x/{b,c},        y/{d,e,m,wham_1}
+#   Commit B. z/{b,c,n,wham_2}, x/{d,e}
+#   Expected: x/{b,c,d,e,m,n}, CONFLICT:(y/wham_1 & z/wham_2 -> x/wham)
+#   Note: y/m & z/n should definitely move into x.  By the same token, both
+#         y/wham_1 & z/wham_2 should too...giving us a conflict.
+
+test_expect_success '1d-setup: Directory renames cause a rename/rename(2to1) conflict' '
+	test_create_repo 1d &&
+	(
+		cd 1d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir y &&
+		echo d >y/d &&
+		echo e >y/e &&
+		git add z y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z x &&
+		echo m >y/m &&
+		echo wham1 >y/wham &&
+		git add y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv y x &&
+		echo n >z/n &&
+		echo wham2 >z/wham &&
+		git add z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
+	(
+		cd 1d &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 8 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:x/b :0:x/c :0:x/d :0:x/e :0:x/m :0:x/n &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:y/d  O:y/e  A:y/m  B:z/n &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :0:x/wham &&
+		git rev-parse >actual \
+			:2:x/wham :3:x/wham &&
+		git rev-parse >expect \
+			 A:y/wham  B:z/wham &&
+		test_cmp expect actual &&
+
+		test_path_is_missing x/wham &&
+		test_path_is_file x/wham~HEAD &&
+		test_path_is_file x/wham~B^0 &&
+
+		git hash-object >actual \
+			x/wham~HEAD x/wham~B^0 &&
+		git rev-parse >expect \
+			A:y/wham    B:z/wham &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 1e, Renamed directory, with all filenames being renamed too
+#   Commit O: z/{oldb,oldc}
+#   Commit A: y/{newb,newc}
+#   Commit B: z/{oldb,oldc,d}
+#   Expected: y/{newb,newc,d}
+
+test_expect_success '1e-setup: Renamed directory, with all files being renamed too' '
+	test_create_repo 1e &&
+	(
+		cd 1e &&
+
+		mkdir z &&
+		echo b >z/oldb &&
+		echo c >z/oldc &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		git mv z/oldb y/newb &&
+		git mv z/oldc y/newc &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1e-check: Renamed directory, with all files being renamed too' '
+	(
+		cd 1e &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/newb HEAD:y/newc HEAD:y/d &&
+		git rev-parse >expect \
+			O:z/oldb    O:z/oldc    B:z/d &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:z/d
+	)
+'
+
+# Testcase 1f, Split a directory into two other directories
+#   (Related to testcases 3a, all of section 2, and all of section 4)
+#   Commit O: z/{b,c,d,e,f}
+#   Commit A: z/{b,c,d,e,f,g}
+#   Commit B: y/{b,c}, x/{d,e,f}
+#   Expected: y/{b,c}, x/{d,e,f,g}
+
+test_expect_success '1f-setup: Split a directory into two other directories' '
+	test_create_repo 1f &&
+	(
+		cd 1f &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		echo e >z/e &&
+		echo f >z/f &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo g >z/g &&
+		git add z/g &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		mkdir x &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d x/ &&
+		git mv z/e x/ &&
+		git mv z/f x/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '1f-check: Split a directory into two other directories' '
+	(
+		cd 1f &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:x/d HEAD:x/e HEAD:x/f HEAD:x/g &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:z/d    O:z/e    O:z/f    A:z/g &&
+		test_cmp expect actual &&
+		test_path_is_missing z/g &&
+		test_must_fail git rev-parse HEAD:z/g
+	)
+'
+
+###########################################################################
+# Rules suggested by testcases in section 1:
+#
+#   We should still detect the directory rename even if it wasn't just
+#   the directory renamed, but the files within it. (see 1b)
+#
+#   If renames split a directory into two or more others, the directory
+#   with the most renames, "wins" (see 1c).  However, see the testcases
+#   in section 2, plus testcases 3a and 4a.
+###########################################################################
+
+test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 02/31] directory rename detection: directory splitting testcases
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 01/31] directory rename detection: basic testcases Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 03/31] directory rename detection: testcases to avoid taking detection too far Elijah Newren
                   ` (29 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 143 ++++++++++++++++++++++++++++++++++++
 1 file changed, 143 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index d045f0e31e..b22a9052b3 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -439,4 +439,147 @@ test_expect_failure '1f-check: Split a directory into two other directories' '
 #   in section 2, plus testcases 3a and 4a.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 2: Split into multiple directories, with equal number of paths
+#
+# Explore the splitting-a-directory rules a bit; what happens in the
+# edge cases?
+#
+# Note that there is a closely related case of a directory not being
+# split on either side of history, but being renamed differently on
+# each side.  See testcase 8e for that.
+###########################################################################
+
+# Testcase 2a, Directory split into two on one side, with equal numbers of paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c,d}
+#   Expected: y/b, w/c, z/d, with warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2a-setup: Directory split into two on one side, with equal numbers of paths' '
+	test_create_repo 2a &&
+	(
+		cd 2a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		mkdir w &&
+		git mv z/b y/ &&
+		git mv z/c w/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '2a-check: Directory split into two on one side, with equal numbers of paths' '
+	(
+		cd 2a &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT.*directory rename split" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:w/c :0:z/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 2b, Directory split into two on one side, with equal numbers of paths
+#   Commit O: z/{b,c}
+#   Commit A: y/b, w/c
+#   Commit B: z/{b,c}, x/d
+#   Expected: y/b, w/c, x/d; No warning about z/ -> (y/ vs. w/) conflict
+test_expect_success '2b-setup: Directory split into two on one side, with equal numbers of paths' '
+	test_create_repo 2b &&
+	(
+		cd 2b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		mkdir w &&
+		git mv z/b y/ &&
+		git mv z/c w/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir x &&
+		echo d >x/d &&
+		git add x/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '2b-check: Directory split into two on one side, with equal numbers of paths' '
+	(
+		cd 2b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 >out &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:w/c :0:x/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:x/d &&
+		test_cmp expect actual &&
+		test_i18ngrep ! "CONFLICT.*directory rename split" out
+	)
+'
+
+###########################################################################
+# Rules suggested by section 2:
+#
+#   None; the rule was already covered in section 1.  These testcases are
+#   here just to make sure the conflict resolution and necessary warning
+#   messages are handled correctly.
+###########################################################################
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 03/31] directory rename detection: testcases to avoid taking detection too far
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 01/31] directory rename detection: basic testcases Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 02/31] directory rename detection: directory splitting testcases Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 04/31] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
                   ` (28 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 153 ++++++++++++++++++++++++++++++++++++
 1 file changed, 153 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index b22a9052b3..8049ed5fc9 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -582,4 +582,157 @@ test_expect_success '2b-check: Directory split into two on one side, with equal
 #   messages are handled correctly.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 3: Path in question is the source path for some rename already
+#
+# Combining cases from Section 1 and trying to handle them could lead to
+# directory renaming detection being over-applied.  So, this section
+# provides some good testcases to check that the implementation doesn't go
+# too far.
+###########################################################################
+
+# Testcase 3a, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 1c and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: z/{b,c,d} (no change)
+#   Commit B: y/{b,c}, x/d
+#   Expected: y/{b,c}, x/d
+test_expect_success '3a-setup: Avoid implicit rename if involved as source on other side' '
+	test_create_repo 3a &&
+	(
+		cd 3a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		git commit --allow-empty -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		mkdir x &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d x/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '3a-check: Avoid implicit rename if involved as source on other side' '
+	(
+		cd 3a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:x/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:z/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 3b, Avoid implicit rename if involved as source on other side
+#   (Related to testcases 5c and 7c, also kind of 1e and 1f)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}, x/d
+#   Commit B: z/{b,c}, w/d
+#   Expected: y/{b,c}, CONFLICT:(z/d -> x/d vs. w/d)
+#   NOTE: We're particularly checking that since z/d is already involved as
+#         a source in a file rename on the same side of history, that we don't
+#         get it involved in directory rename detection.  If it were, we might
+#         end up with CONFLICT:(z/d -> y/d vs. x/d vs. w/d), i.e. a
+#         rename/rename/rename(1to3) conflict, which is just weird.
+test_expect_success '3b-setup: Avoid implicit rename if involved as source on current side' '
+	test_create_repo 3b &&
+	(
+		cd 3b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		mkdir x &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d x/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir w &&
+		git mv z/d w/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '3b-check: Avoid implicit rename if involved as source on current side' '
+	(
+		cd 3b &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep CONFLICT.*rename/rename.*z/d.*x/d.*w/d out &&
+		test_i18ngrep ! CONFLICT.*rename/rename.*y/d out &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 3 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :1:z/d :2:x/d :3:w/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:z/d  O:z/d  O:z/d &&
+		test_cmp expect actual &&
+
+		test_path_is_missing z/d &&
+		git hash-object >actual \
+			x/d   w/d &&
+		git rev-parse >expect \
+			O:z/d O:z/d &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 3:
+#
+#   Avoid directory-rename-detection for a path, if that path is the source
+#   of a rename on either side of a merge.
+###########################################################################
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 04/31] directory rename detection: partially renamed directory testcase/discussion
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (2 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 03/31] directory rename detection: testcases to avoid taking detection too far Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 05/31] directory rename detection: files/directories in the way of some renames Elijah Newren
                   ` (27 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 107 ++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 8049ed5fc9..f0213f2bbd 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -735,4 +735,111 @@ test_expect_success '3b-check: Avoid implicit rename if involved as source on cu
 #   of a rename on either side of a merge.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 4: Partially renamed directory; still exists on both sides of merge
+#
+# What if we were to attempt to do directory rename detection when someone
+# "mostly" moved a directory but still left some files around, or,
+# equivalently, fully renamed a directory in one commmit and then recreated
+# that directory in a later commit adding some new files and then tried to
+# merge?
+#
+# It's hard to divine user intent in these cases, because you can make an
+# argument that, depending on the intermediate history of the side being
+# merged, that some users will want files in that directory to
+# automatically be detected and renamed, while users with a different
+# intermediate history wouldn't want that rename to happen.
+#
+# I think that it is best to simply not have directory rename detection
+# apply to such cases.  My reasoning for this is four-fold: (1) it's
+# easiest for users in general to figure out what happened if we don't
+# apply directory rename detection in any such case, (2) it's an easy rule
+# to explain ["We don't do directory rename detection if the directory
+# still exists on both sides of the merge"], (3) we can get some hairy
+# edge/corner cases that would be really confusing and possibly not even
+# representable in the index if we were to even try, and [related to 3] (4)
+# attempting to resolve this issue of divining user intent by examining
+# intermediate history goes against the spirit of three-way merges and is a
+# path towards crazy corner cases that are far more complex than what we're
+# already dealing with.
+#
+# This section contains a test for this partially-renamed-directory case.
+###########################################################################
+
+# Testcase 4a, Directory split, with original directory still present
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d,e}
+#   Commit A: y/{b,c,d}, z/e
+#   Commit B: z/{b,c,d,e,f}
+#   Expected: y/{b,c,d}, z/{e,f}
+#   NOTE: Even though most files from z moved to y, we don't want f to follow.
+
+test_expect_success '4a-setup: Directory split, with original directory still present' '
+	test_create_repo 4a &&
+	(
+		cd 4a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		echo e >z/e &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir y &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo f >z/f &&
+		git add z/f &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '4a-check: Directory split, with original directory still present' '
+	(
+		cd 4a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:z/e HEAD:z/f &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:z/d    O:z/e    B:z/f &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 4:
+#
+#   Directory-rename-detection should be turned off for any directories (as
+#   a source for renames) that exist on both sides of the merge.  (The "as
+#   a source for renames" clarification is due to cases like 1c where
+#   the target directory exists on both sides and we do want the rename
+#   detection.)  But, sadly, see testcase 8b.
+###########################################################################
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 05/31] directory rename detection: files/directories in the way of some renames
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (3 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 04/31] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 06/31] directory rename detection: testcases checking which side did the rename Elijah Newren
                   ` (26 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 330 ++++++++++++++++++++++++++++++++++++
 1 file changed, 330 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index f0213f2bbd..ac9c3e9974 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -842,4 +842,334 @@ test_expect_success '4a-check: Directory split, with original directory still pr
 #   detection.)  But, sadly, see testcase 8b.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 5: Files/directories in the way of subset of to-be-renamed paths
+#
+# Implicitly renaming files due to a detected directory rename could run
+# into problems if there are files or directories in the way of the paths
+# we want to rename.  Explore such cases in this section.
+###########################################################################
+
+# Testcase 5a, Merge directories, other side adds files to original and target
+#   Commit O: z/{b,c},       y/d
+#   Commit A: z/{b,c,e_1,f}, y/{d,e_2}
+#   Commit B: y/{b,c,d}
+#   Expected: z/e_1, y/{b,c,d,e_2,f} + CONFLICT warning
+#   NOTE: While directory rename detection is active here causing z/f to
+#         become y/f, we did not apply this for z/e_1 because that would
+#         give us an add/add conflict for y/e_1 vs y/e_2.  This problem with
+#         this add/add, is that both versions of y/e are from the same side
+#         of history, giving us no way to represent this conflict in the
+#         index.
+
+test_expect_success '5a-setup: Merge directories, other side adds files to original and target' '
+	test_create_repo 5a &&
+	(
+		cd 5a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir y &&
+		echo d >y/d &&
+		git add z y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo e1 >z/e &&
+		echo f >z/f &&
+		echo e2 >y/e &&
+		git add z/e z/f y/e &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '5a-check: Merge directories, other side adds files to original and target' '
+	(
+		cd 5a &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT.*implicit dir rename" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:y/d :0:y/e :0:z/e :0:y/f &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:y/d  A:y/e  A:z/e  A:z/f &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 5b, Rename/delete in order to get add/add/add conflict
+#   (Related to testcase 8d; these may appear slightly inconsistent to users;
+#    Also related to testcases 7d and 7e)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: y/{b,c,d_2}
+#   Commit B: z/{b,c,d_1,e}, y/d_3
+#   Expected: y/{b,c,e}, CONFLICT(add/add: y/d_2 vs. y/d_3)
+#   NOTE: If z/d_1 in commit B were to be involved in dir rename detection, as
+#         we normaly would since z/ is being renamed to y/, then this would be
+#         a rename/delete (z/d_1 -> y/d_1 vs. deleted) AND an add/add/add
+#         conflict of y/d_1 vs. y/d_2 vs. y/d_3.  Add/add/add is not
+#         representable in the index, so the existence of y/d_3 needs to
+#         cause us to bail on directory rename detection for that path, falling
+#         back to git behavior without the directory rename detection.
+
+test_expect_success '5b-setup: Rename/delete in order to get add/add/add conflict' '
+	test_create_repo 5b &&
+	(
+		cd 5b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d1 >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/d &&
+		git mv z y &&
+		echo d2 >y/d &&
+		git add y/d &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		echo d3 >y/d &&
+		echo e >z/e &&
+		git add y/d z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '5b-check: Rename/delete in order to get add/add/add conflict' '
+	(
+		cd 5b &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (add/add).* y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:y/e :2:y/d :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/e  A:y/d  B:y/d &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :1:y/d &&
+		test_path_is_file y/d
+	)
+'
+
+# Testcase 5c, Transitive rename would cause rename/rename/rename/add/add/add
+#   (Directory rename detection would result in transitive rename vs.
+#    rename/rename(1to2) and turn it into a rename/rename(1to3).  Further,
+#    rename paths conflict with separate adds on the other side)
+#   (Related to testcases 3b and 7c)
+#   Commit O: z/{b,c}, x/d_1
+#   Commit A: y/{b,c,d_2}, w/d_1
+#   Commit B: z/{b,c,d_1,e}, w/d_3, y/d_4
+#   Expected: A mess, but only a rename/rename(1to2)/add/add mess.  Use the
+#             presence of y/d_4 in B to avoid doing transitive rename of
+#             x/d_1 -> z/d_1 -> y/d_1, so that the only paths we have at
+#             y/d are y/d_2 and y/d_4.  We still do the move from z/e to y/e,
+#             though, because it doesn't have anything in the way.
+
+test_expect_success '5c-setup: Transitive rename would cause rename/rename/rename/add/add/add' '
+	test_create_repo 5c &&
+	(
+		cd 5c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d1 >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		echo d2 >y/d &&
+		git add y/d &&
+		git mv x w &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		mkdir w &&
+		mkdir y &&
+		echo d3 >w/d &&
+		echo d4 >y/d &&
+		echo e >z/e &&
+		git add w/ y/ z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
+	(
+		cd 5c &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename).*x/d.*w/d.*z/d" out &&
+		test_i18ngrep "CONFLICT (add/add).* y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 9 out &&
+		git ls-files -u >out &&
+		test_line_count = 6 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:y/e &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/e &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :1:y/d &&
+		git rev-parse >actual \
+			:2:w/d :3:w/d :1:x/d :2:y/d :3:y/d :3:z/d &&
+		git rev-parse >expect \
+			 O:x/d  B:w/d  O:x/d  A:y/d  B:y/d  O:x/d &&
+		test_cmp expect actual &&
+
+		git hash-object >actual \
+			w/d~HEAD w/d~B^0 z/d &&
+		git rev-parse >expect \
+			O:x/d    B:w/d   O:x/d &&
+		test_cmp expect actual &&
+		test_path_is_missing x/d &&
+		test_path_is_file y/d &&
+		grep -q "<<<<" y/d  # conflict markers should be present
+	)
+'
+
+# Testcase 5d, Directory/file/file conflict due to directory rename
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c,d_1}
+#   Commit B: z/{b,c,d_2,f}, y/d/e
+#   Expected: y/{b,c,d/e,f}, z/d_2, CONFLICT(file/directory), y/d_1~HEAD
+#   Note: The fact that y/d/ exists in B makes us bail on directory rename
+#         detection for z/d_2, but that doesn't prevent us from applying the
+#         directory rename detection for z/f -> y/f.
+
+test_expect_success '5d-setup: Directory/file/file conflict due to directory rename' '
+	test_create_repo 5d &&
+	(
+		cd 5d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		echo d1 >y/d &&
+		git add y/d &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir -p y/d &&
+		echo e >y/d/e &&
+		echo d2 >z/d &&
+		echo f >z/f &&
+		git add y/d/e z/d z/f &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '5d-check: Directory/file/file conflict due to directory rename' '
+	(
+		cd 5d &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (file/directory).*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 2 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:z/d :0:y/f :2:y/d :0:y/d/e &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/d  B:z/f  A:y/d  B:y/d/e &&
+		test_cmp expect actual &&
+
+		git hash-object y/d~HEAD >actual &&
+		git rev-parse A:y/d >expect &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 5:
+#
+#   If a subset of to-be-renamed files have a file or directory in the way,
+#   "turn off" the directory rename for those specific sub-paths, falling
+#   back to old handling.  But, sadly, see testcases 8a and 8b.
+###########################################################################
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 06/31] directory rename detection: testcases checking which side did the rename
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (4 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 05/31] directory rename detection: files/directories in the way of some renames Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 07/31] directory rename detection: more involved edge/corner testcases Elijah Newren
                   ` (25 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 336 ++++++++++++++++++++++++++++++++++++
 1 file changed, 336 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index ac9c3e9974..fbeb8f4316 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1172,4 +1172,340 @@ test_expect_failure '5d-check: Directory/file/file conflict due to directory ren
 #   back to old handling.  But, sadly, see testcases 8a and 8b.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 6: Same side of the merge was the one that did the rename
+#
+# It may sound obvious that you only want to apply implicit directory
+# renames to directories if the _other_ side of history did the renaming.
+# If you did make an implementation that didn't explicitly enforce this
+# rule, the majority of cases that would fall under this section would
+# also be solved by following the rules from the above sections.  But
+# there are still a few that stick out, so this section covers them just
+# to make sure we also get them right.
+###########################################################################
+
+# Testcase 6a, Tricky rename/delete
+#   Commit O: z/{b,c,d}
+#   Commit A: z/b
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/b, CONFLICT(rename/delete, z/c -> y/c vs. NULL)
+#   Note: We're just checking here that the rename of z/b and z/c to put
+#         them under y/ doesn't accidentally catch z/d and make it look like
+#         it is also involved in a rename/delete conflict.
+
+test_expect_success '6a-setup: Tricky rename/delete' '
+	test_create_repo 6a &&
+	(
+		cd 6a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/c &&
+		git rm z/d &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6a-check: Tricky rename/delete' '
+	(
+		cd 6a &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/delete).*z/c.*y/c" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 2 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :3:y/c &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6b, Same rename done on both sides
+#   (Related to testcases 6c and 8e)
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   Note: If we did directory rename detection here, we'd move z/d into y/,
+#         but B did that rename and still decided to put the file into z/,
+#         so we probably shouldn't apply directory rename detection for it.
+
+test_expect_success '6b-setup: Same rename done on both sides' '
+	test_create_repo 6b &&
+	(
+		cd 6b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		mkdir z &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6b-check: Same rename done on both sides' '
+	(
+		cd 6b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:z/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6c, Rename only done on same side
+#   (Related to testcases 6b and 8e)
+#   Commit O: z/{b,c}
+#   Commit A: z/{b,c} (no change)
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   NOTE: Seems obvious, but just checking that the implementation doesn't
+#         "accidentally detect a rename" and give us y/{b,c,d}.
+
+test_expect_success '6c-setup: Rename only done on same side' '
+	test_create_repo 6c &&
+	(
+		cd 6c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		git commit --allow-empty -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		mkdir z &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6c-check: Rename only done on same side' '
+	(
+		cd 6c &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:z/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6d, We don't always want transitive renaming
+#   (Related to testcase 1c)
+#   Commit O: z/{b,c}, x/d
+#   Commit A: z/{b,c}, x/d (no change)
+#   Commit B: y/{b,c}, z/d
+#   Expected: y/{b,c}, z/d
+#   NOTE: Again, this seems obvious but just checking that the implementation
+#         doesn't "accidentally detect a rename" and give us y/{b,c,d}.
+
+test_expect_success '6d-setup: We do not always want transitive renaming' '
+	test_create_repo 6d &&
+	(
+		cd 6d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		git commit --allow-empty -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		git mv x z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6d-check: We do not always want transitive renaming' '
+	(
+		cd 6d &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:z/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:x/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 6e, Add/add from one-side
+#   Commit O: z/{b,c}
+#   Commit A: z/{b,c} (no change)
+#   Commit B: y/{b,c,d_1}, z/d_2
+#   Expected: y/{b,c,d_1}, z/d_2
+#   NOTE: Again, this seems obvious but just checking that the implementation
+#         doesn't "accidentally detect a rename" and give us y/{b,c} +
+#         add/add conflict on y/d_1 vs y/d_2.
+
+test_expect_success '6e-setup: Add/add from one side' '
+	test_create_repo 6e &&
+	(
+		cd 6e &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		git commit --allow-empty -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		echo d1 > y/d &&
+		mkdir z &&
+		echo d2 > z/d &&
+		git add y/d z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '6e-check: Add/add from one side' '
+	(
+		cd 6e &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:z/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:y/d    B:z/d &&
+		test_cmp expect actual
+	)
+'
+
+###########################################################################
+# Rules suggested by section 6:
+#
+#   Only apply implicit directory renames to directories if the other
+#   side of history is the one doing the renaming.
+###########################################################################
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 07/31] directory rename detection: more involved edge/corner testcases
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (5 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 06/31] directory rename detection: testcases checking which side did the rename Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 08/31] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
                   ` (24 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 396 ++++++++++++++++++++++++++++++++++++
 1 file changed, 396 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index fbeb8f4316..68bd86f555 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1508,4 +1508,400 @@ test_expect_success '6e-check: Add/add from one side' '
 #   side of history is the one doing the renaming.
 ###########################################################################
 
+
+###########################################################################
+# SECTION 7: More involved Edge/Corner cases
+#
+# The ruleset we have generated in the above sections seems to provide
+# well-defined merges.  But can we find edge/corner cases that either (a)
+# are harder for users to understand, or (b) have a resolution that is
+# non-intuitive or suboptimal?
+#
+# The testcases in this section dive into cases that I've tried to craft in
+# a way to find some that might be surprising to users or difficult for
+# them to understand (the next section will look at non-intuitive or
+# suboptimal merge results).  Some of the testcases are similar to ones
+# from past sections, but have been simplified to try to highlight error
+# messages using a "modified" path (due to the directory rename).  Are
+# users okay with these?
+#
+# In my opinion, testcases that are difficult to understand from this
+# section is due to difficulty in the testcase rather than the directory
+# renaming (similar to how t6042 and t6036 have difficult resolutions due
+# to the problem setup itself being complex).  And I don't think the
+# error messages are a problem.
+#
+# On the other hand, the testcases in section 8 worry me slightly more...
+###########################################################################
+
+# Testcase 7a, rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: w/b, x/c, z/d
+#   Expected: y/d, CONFLICT(rename/rename for both z/b and z/c)
+#   NOTE: There's a rename of z/ here, y/ has more renames, so z/d -> y/d.
+
+test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+	test_create_repo 7a &&
+	(
+		cd 7a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir w &&
+		mkdir x &&
+		git mv z/b w/ &&
+		git mv z/c x/ &&
+		echo d > z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+	(
+		cd 7a &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename).*z/b.*y/b.*w/b" out &&
+		test_i18ngrep "CONFLICT (rename/rename).*z/c.*y/c.*x/c" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -u >out &&
+		test_line_count = 6 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:x/c :0:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
+		test_cmp expect actual &&
+
+		git hash-object >actual \
+			y/b   w/b   y/c   x/c &&
+		git rev-parse >expect \
+			O:z/b O:z/b O:z/c O:z/c &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 7b, rename/rename(2to1), but only due to transitive rename
+#   (Related to testcase 1d)
+#   Commit O: z/{b,c},     x/d_1, w/d_2
+#   Commit A: y/{b,c,d_2}, x/d_1
+#   Commit B: z/{b,c,d_1},        w/d_2
+#   Expected: y/{b,c}, CONFLICT(rename/rename(2to1): x/d_1, w/d_2 -> y_d)
+
+test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive rename' '
+	test_create_repo 7b &&
+	(
+		cd 7b &&
+
+		mkdir z &&
+		mkdir x &&
+		mkdir w &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d1 > x/d &&
+		echo d2 > w/d &&
+		git add z x w &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git mv w/d y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		rmdir x &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7b-check: rename/rename(2to1), but only due to transitive rename' '
+	(
+		cd 7b &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :2:y/d :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:w/d  O:x/d &&
+		test_cmp expect actual &&
+
+		test_path_is_missing y/d &&
+		test_path_is_file y/d~HEAD &&
+		test_path_is_file y/d~B^0 &&
+
+		git hash-object >actual \
+			y/d~HEAD y/d~B^0 &&
+		git rev-parse >expect \
+			O:w/d    O:x/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 7c, rename/rename(1to...2or3); transitive rename may add complexity
+#   (Related to testcases 3b and 5c)
+#   Commit O: z/{b,c}, x/d
+#   Commit A: y/{b,c}, w/d
+#   Commit B: z/{b,c,d}
+#   Expected: y/{b,c}, CONFLICT(x/d -> w/d vs. y/d)
+#   NOTE: z/ was renamed to y/ so we do want to report
+#         neither CONFLICT(x/d -> w/d vs. z/d)
+#         nor CONFLiCT x/d -> w/d vs. y/d vs. z/d)
+
+test_expect_success '7c-setup: rename/rename(1to...2or3); transitive rename may add complexity' '
+	test_create_repo 7c &&
+	(
+		cd 7c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git mv x w &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		rmdir x &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
+	(
+		cd 7c &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/rename).*x/d.*w/d.*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 3 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :1:x/d :2:w/d :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:x/d  O:x/d  O:x/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 7d, transitive rename involved in rename/delete; how is it reported?
+#   (Related somewhat to testcases 5b and 8d)
+#   Commit O: z/{b,c}, x/d
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d}
+#   Expected: y/{b,c}, CONFLICT(delete x/d vs rename to y/d)
+#   NOTE: z->y so NOT CONFLICT(delete x/d vs rename to z/d)
+
+test_expect_success '7d-setup: transitive rename involved in rename/delete; how is it reported?' '
+	test_create_repo 7d &&
+	(
+		cd 7d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git rm -rf x &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		rmdir x &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7d-check: transitive rename involved in rename/delete; how is it reported?' '
+	(
+		cd 7d &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/delete).*x/d.*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  O:x/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 7e, transitive rename in rename/delete AND dirs in the way
+#   (Very similar to 'both rename source and destination involved in D/F conflict' from t6022-merge-rename.sh)
+#   (Also related to testcases 9c and 9d)
+#   Commit O: z/{b,c},     x/d_1
+#   Commit A: y/{b,c,d/g}, x/d/f
+#   Commit B: z/{b,c,d_1}
+#   Expected: rename/delete(x/d_1->y/d_1 vs. None) + D/F conflict on y/d
+#             y/{b,c,d/g}, y/d_1~B^0, x/d/f
+
+#   NOTE: The main path of interest here is d_1 and where it ends up, but
+#         this is actually a case that has two potential directory renames
+#         involved and D/F conflict(s), so it makes sense to walk through
+#         each step.
+#
+#         Commit A renames z/ -> y/.  Thus everything that B adds to z/
+#         should be instead moved to y/.  This gives us the D/F conflict on
+#         y/d because x/d_1 -> z/d_1 -> y/d_1 conflicts with y/d/g.
+#
+#         Further, commit B renames x/ -> z/, thus everything A adds to x/
+#         should instead be moved to z/...BUT we removed z/ and renamed it
+#         to y/, so maybe everything should move not from x/ to z/, but
+#         from x/ to z/ to y/.  Doing so might make sense from the logic so
+#         far, but note that commit A had both an x/ and a y/; it did the
+#         renaming of z/ to y/ and created x/d/f and it clearly made these
+#         things separate, so it doesn't make much sense to push these
+#         together.  Doing so is what I'd call a doubly transitive rename;
+#         see testcases 9c and 9d for further discussion of this issue and
+#         how it's resolved.
+
+test_expect_success '7e-setup: transitive rename in rename/delete AND dirs in the way' '
+	test_create_repo 7e &&
+	(
+		cd 7e &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d1 >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git rm x/d &&
+		mkdir -p x/d &&
+		mkdir -p y/d &&
+		echo f >x/d/f &&
+		echo g >y/d/g &&
+		git add x/d/f y/d/g &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/ &&
+		rmdir x &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in the way' '
+	(
+		cd 7e &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/delete).*x/d.*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 5 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 2 out &&
+
+		git rev-parse >actual \
+			:0:x/d/f :0:y/d/g :0:y/b :0:y/c :3:y/d &&
+		git rev-parse >expect \
+			 A:x/d/f  A:y/d/g  O:z/b  O:z/c  O:x/d &&
+		test_cmp expect actual &&
+
+		git hash-object y/d~B^0 >actual &&
+		git rev-parse O:x/d >expect &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 08/31] directory rename detection: testcases exploring possibly suboptimal merges
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (6 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 07/31] directory rename detection: more involved edge/corner testcases Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 09/31] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
                   ` (23 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 404 ++++++++++++++++++++++++++++++++++++
 1 file changed, 404 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 68bd86f555..e2db5d0ac1 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -1904,4 +1904,408 @@ test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in th
 	)
 '
 
+###########################################################################
+# SECTION 8: Suboptimal merges
+#
+# As alluded to in the last section, the ruleset we have built up for
+# detecting directory renames unfortunately has some special cases where it
+# results in slightly suboptimal or non-intuitive behavior.  This section
+# explores these cases.
+#
+# To be fair, we already had non-intuitive or suboptimal behavior for most
+# of these cases in git before introducing implicit directory rename
+# detection, but it'd be nice if there was a modified ruleset out there
+# that handled these cases a bit better.
+###########################################################################
+
+# Testcase 8a, Dual-directory rename, one into the others' way
+#   Commit O. x/{a,b},   y/{c,d}
+#   Commit A. x/{a,b,e}, y/{c,d,f}
+#   Commit B. y/{a,b},   z/{c,d}
+#
+# Possible Resolutions:
+#   w/o dir-rename detection: y/{a,b,f},   z/{c,d},   x/e
+#   Currently expected:       y/{a,b,e,f}, z/{c,d}
+#   Optimal:                  y/{a,b,e},   z/{c,d,f}
+#
+# Note: Both x and y got renamed and it'd be nice to detect both, and we do
+# better with directory rename detection than git did without, but the
+# simple rule from section 5 prevents me from handling this as optimally as
+# we potentially could.
+
+test_expect_success '8a-setup: Dual-directory rename, one into the others way' '
+	test_create_repo 8a &&
+	(
+		cd 8a &&
+
+		mkdir x &&
+		mkdir y &&
+		echo a >x/a &&
+		echo b >x/b &&
+		echo c >y/c &&
+		echo d >y/d &&
+		git add x y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo e >x/e &&
+		echo f >y/f &&
+		git add x/e y/f &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv y z &&
+		git mv x y &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '8a-check: Dual-directory rename, one into the others way' '
+	(
+		cd 8a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/a HEAD:y/b HEAD:y/e HEAD:y/f HEAD:z/c HEAD:z/d &&
+		git rev-parse >expect \
+			O:x/a    O:x/b    A:x/e    A:y/f    O:y/c    O:y/d &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 8b, Dual-directory rename, one into the others' way, with conflicting filenames
+#   Commit O. x/{a_1,b_1},     y/{a_2,b_2}
+#   Commit A. x/{a_1,b_1,e_1}, y/{a_2,b_2,e_2}
+#   Commit B. y/{a_1,b_1},     z/{a_2,b_2}
+#
+#   w/o dir-rename detection: y/{a_1,b_1,e_2}, z/{a_2,b_2}, x/e_1
+#   Currently expected:       <same>
+#   Scary:                    y/{a_1,b_1},     z/{a_2,b_2}, CONFLICT(add/add, e_1 vs. e_2)
+#   Optimal:                  y/{a_1,b_1,e_1}, z/{a_2,b_2,e_2}
+#
+# Note: Very similar to 8a, except instead of 'e' and 'f' in directories x and
+# y, both are named 'e'.  Without directory rename detection, neither file
+# moves directories.  Implement directory rename detection suboptimally, and
+# you get an add/add conflict, but both files were added in commit A, so this
+# is an add/add conflict where one side of history added both files --
+# something we can't represent in the index.  Obviously, we'd prefer the last
+# resolution, but our previous rules are too coarse to allow it.  Using both
+# the rules from section 4 and section 5 save us from the Scary resolution,
+# making us fall back to pre-directory-rename-detection behavior for both
+# e_1 and e_2.
+
+test_expect_success '8b-setup: Dual-directory rename, one into the others way, with conflicting filenames' '
+	test_create_repo 8b &&
+	(
+		cd 8b &&
+
+		mkdir x &&
+		mkdir y &&
+		echo a1 >x/a &&
+		echo b1 >x/b &&
+		echo a2 >y/a &&
+		echo b2 >y/b &&
+		git add x y &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo e1 >x/e &&
+		echo e2 >y/e &&
+		git add x/e y/e &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv y z &&
+		git mv x y &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '8b-check: Dual-directory rename, one into the others way, with conflicting filenames' '
+	(
+		cd 8b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/a HEAD:y/b HEAD:z/a HEAD:z/b HEAD:x/e HEAD:y/e &&
+		git rev-parse >expect \
+			O:x/a    O:x/b    O:y/a    O:y/b    A:x/e    A:y/e &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 8c, rename+modify/delete
+#   (Related to testcases 5b and 8d)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d_modified,e}
+#   Expected: y/{b,c,e}, CONFLICT(rename+modify/delete: x/d -> y/d or deleted)
+#
+#   Note: This testcase doesn't present any concerns for me...until you
+#         compare it with testcases 5b and 8d.  See notes in 8d for more
+#         details.
+
+test_expect_success '8c-setup: rename+modify/delete' '
+	test_create_repo 8c &&
+	(
+		cd 8c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		test_seq 1 10 >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/d &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo 11 >z/d &&
+		test_chmod +x z/d &&
+		echo e >z/e &&
+		git add z/d z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '8c-check: rename+modify/delete' '
+	(
+		cd 8c &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		test_i18ngrep "CONFLICT (rename/delete).* z/d.*y/d" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			:0:y/b :0:y/c :0:y/e :3:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/c  B:z/e  B:z/d &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :1:y/d &&
+		test_must_fail git rev-parse :2:y/d &&
+		git ls-files -s y/d | grep ^100755 &&
+		test_path_is_file y/d
+	)
+'
+
+# Testcase 8d, rename/delete...or not?
+#   (Related to testcase 5b; these may appear slightly inconsistent to users;
+#    Also related to testcases 7d and 7e)
+#   Commit O: z/{b,c,d}
+#   Commit A: y/{b,c}
+#   Commit B: z/{b,c,d,e}
+#   Expected: y/{b,c,e}
+#
+#   Note: It would also be somewhat reasonable to resolve this as
+#             y/{b,c,e}, CONFLICT(rename/delete: x/d -> y/d or deleted)
+#   The logic being that the only difference between this testcase and 8c
+#   is that there is no modification to d.  That suggests that instead of a
+#   rename/modify vs. delete conflict, we should just have a rename/delete
+#   conflict, otherwise we are being inconsistent.
+#
+#   However...as far as consistency goes, we didn't report a conflict for
+#   path d_1 in testcase 5b due to a different file being in the way.  So,
+#   we seem to be forced to have cases where users can change things
+#   slightly and get what they may perceive as inconsistent results.  It
+#   would be nice to avoid that, but I'm not sure I see how.
+#
+#   In this case, I'm leaning towards: commit A was the one that deleted z/d
+#   and it did the rename of z to y, so the two "conflicts" (rename vs.
+#   delete) are both coming from commit A, which is illogical.  Conflicts
+#   during merging are supposed to be about opposite sides doing things
+#   differently.
+
+test_expect_success '8d-setup: rename/delete...or not?' '
+	test_create_repo 8d &&
+	(
+		cd 8d &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		test_seq 1 10 >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/d &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo e >z/e &&
+		git add z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '8d-check: rename/delete...or not?' '
+	(
+		cd 8d &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/e &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/e &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 8e, Both sides rename, one side adds to original directory
+#   Commit O: z/{b,c}
+#   Commit A: y/{b,c}
+#   Commit B: w/{b,c}, z/d
+#
+# Possible Resolutions:
+#   w/o dir-rename detection: z/d, CONFLICT(z/b -> y/b vs. w/b),
+#                                  CONFLICT(z/c -> y/c vs. w/c)
+#   Currently expected:       y/d, CONFLICT(z/b -> y/b vs. w/b),
+#                                  CONFLICT(z/c -> y/c vs. w/c)
+#   Optimal:                  ??
+#
+# Notes: In commit A, directory z got renamed to y.  In commit B, directory z
+#        did NOT get renamed; the directory is still present; instead it is
+#        considered to have just renamed a subset of paths in directory z
+#        elsewhere.  Therefore, the directory rename done in commit A to z/
+#        applies to z/d and maps it to y/d.
+#
+#        It's possible that users would get confused about this, but what
+#        should we do instead?  Silently leaving at z/d seems just as bad or
+#        maybe even worse.  Perhaps we could print a big warning about z/d
+#        and how we're moving to y/d in this case, but when I started thinking
+#        about the ramifications of doing that, I didn't know how to rule out
+#        that opening other weird edge and corner cases so I just punted.
+
+test_expect_success '8e-setup: Both sides rename, one side adds to original directory' '
+	test_create_repo 8e &&
+	(
+		cd 8e &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z w &&
+		mkdir z &&
+		echo d >z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '8e-check: Both sides rename, one side adds to original directory' '
+	(
+		cd 8e &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep CONFLICT.*rename/rename.*z/c.*y/c.*w/c out &&
+		test_i18ngrep CONFLICT.*rename/rename.*z/b.*y/b.*w/b out &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -u >out &&
+		test_line_count = 6 out &&
+		git ls-files -o >out &&
+		test_line_count = 2 out &&
+
+		git rev-parse >actual \
+			:1:z/b :2:y/b :3:w/b :1:z/c :2:y/c :3:w/c :0:y/d &&
+		git rev-parse >expect \
+			 O:z/b  O:z/b  O:z/b  O:z/c  O:z/c  O:z/c  B:z/d &&
+		test_cmp expect actual &&
+
+		git hash-object >actual \
+			y/b   w/b   y/c   w/c &&
+		git rev-parse >expect \
+			O:z/b O:z/b O:z/c O:z/c &&
+		test_cmp expect actual &&
+
+		test_path_is_missing z/b &&
+		test_path_is_missing z/c
+	)
+'
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 09/31] directory rename detection: miscellaneous testcases to complete coverage
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (7 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 08/31] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 10/31] directory rename detection: tests for handling overwriting untracked files Elijah Newren
                   ` (22 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

I came up with the testcases in the first eight sections before coding up
the implementation.  The testcases in this section were mostly ones I
thought of while coding/debugging, and which I was too lazy to insert
into the previous sections because I didn't want to re-label with all the
testcase references.  :-)

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 565 +++++++++++++++++++++++++++++++++++-
 1 file changed, 564 insertions(+), 1 deletion(-)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index e2db5d0ac1..b730256653 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -305,6 +305,7 @@ test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) con
 '
 
 # Testcase 1e, Renamed directory, with all filenames being renamed too
+#   (Related to testcases 9f & 9g)
 #   Commit O: z/{oldb,oldc}
 #   Commit A: y/{newb,newc}
 #   Commit B: z/{oldb,oldc,d}
@@ -593,7 +594,7 @@ test_expect_success '2b-check: Directory split into two on one side, with equal
 ###########################################################################
 
 # Testcase 3a, Avoid implicit rename if involved as source on other side
-#   (Related to testcases 1c and 1f)
+#   (Related to testcases 1c, 1f, and 9h)
 #   Commit O: z/{b,c,d}
 #   Commit A: z/{b,c,d} (no change)
 #   Commit B: y/{b,c}, x/d
@@ -2308,4 +2309,566 @@ test_expect_failure '8e-check: Both sides rename, one side adds to original dire
 	)
 '
 
+###########################################################################
+# SECTION 9: Other testcases
+#
+# This section consists of miscellaneous testcases I thought of during
+# the implementation which round out the testing.
+###########################################################################
+
+# Testcase 9a, Inner renamed directory within outer renamed directory
+#   (Related to testcase 1f)
+#   Commit O: z/{b,c,d/{e,f,g}}
+#   Commit A: y/{b,c}, x/w/{e,f,g}
+#   Commit B: z/{b,c,d/{e,f,g,h},i}
+#   Expected: y/{b,c,i}, x/w/{e,f,g,h}
+#   NOTE: The only reason this one is interesting is because when a directory
+#         is split into multiple other directories, we determine by the weight
+#         of which one had the most paths going to it.  A naive implementation
+#         of that could take the new file in commit B at z/i to x/w/i or x/i.
+
+test_expect_success '9a-setup: Inner renamed directory within outer renamed directory' '
+	test_create_repo 9a &&
+	(
+		cd 9a &&
+
+		mkdir -p z/d &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo e >z/d/e &&
+		echo f >z/d/f &&
+		echo g >z/d/g &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir x &&
+		git mv z/d x/w &&
+		git mv z y &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo h >z/d/h &&
+		echo i >z/i &&
+		git add z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9a-check: Inner renamed directory within outer renamed directory' '
+	(
+		cd 9a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/i &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    B:z/i &&
+		test_cmp expect actual &&
+
+		git rev-parse >actual \
+			HEAD:x/w/e HEAD:x/w/f HEAD:x/w/g HEAD:x/w/h &&
+		git rev-parse >expect \
+			O:z/d/e    O:z/d/f    O:z/d/g    B:z/d/h &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 9b, Transitive rename with content merge
+#   (Related to testcase 1c)
+#   Commit O: z/{b,c},   x/d_1
+#   Commit A: y/{b,c},   x/d_2
+#   Commit B: z/{b,c,d_3}
+#   Expected: y/{b,c,d_merged}
+
+test_expect_success '9b-setup: Transitive rename with content merge' '
+	test_create_repo 9b &&
+	(
+		cd 9b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		test_seq 1 10 >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		test_seq 1 11 >x/d &&
+		git add x/d &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		test_seq 0 10 >x/d &&
+		git mv x/d z/d &&
+		git add z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9b-check: Transitive rename with content merge' '
+	(
+		cd 9b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		test_seq 0 11 >expected &&
+		test_cmp expected y/d &&
+		git add expected &&
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    :0:expected &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:x/d &&
+		test_must_fail git rev-parse HEAD:z/d &&
+		test_path_is_missing z/d &&
+
+		test $(git rev-parse HEAD:y/d) != $(git rev-parse O:x/d) &&
+		test $(git rev-parse HEAD:y/d) != $(git rev-parse A:x/d) &&
+		test $(git rev-parse HEAD:y/d) != $(git rev-parse B:z/d)
+	)
+'
+
+# Testcase 9c, Doubly transitive rename?
+#   (Related to testcase 1c, 7e, and 9d)
+#   Commit O: z/{b,c},     x/{d,e},    w/f
+#   Commit A: y/{b,c},     x/{d,e,f,g}
+#   Commit B: z/{b,c,d,e},             w/f
+#   Expected: y/{b,c,d,e}, x/{f,g}
+#
+#   NOTE: x/f and x/g may be slightly confusing here.  The rename from w/f to
+#         x/f is clear.  Let's look beyond that.  Here's the logic:
+#            Commit B renamed x/ -> z/
+#            Commit A renamed z/ -> y/
+#         So, we could possibly further rename x/f to z/f to y/f, a doubly
+#         transient rename.  However, where does it end?  We can chain these
+#         indefinitely (see testcase 9d).  What if there is a D/F conflict
+#         at z/f/ or y/f/?  Or just another file conflict at one of those
+#         paths?  In the case of an N-long chain of transient renamings,
+#         where do we "abort" the rename at?  Can the user make sense of
+#         the resulting conflict and resolve it?
+#
+#         To avoid this confusion I use the simple rule that if the other side
+#         of history did a directory rename to a path that your side renamed
+#         away, then ignore that particular rename from the other side of
+#         history for any implicit directory renames.
+
+test_expect_success '9c-setup: Doubly transitive rename?' '
+	test_create_repo 9c &&
+	(
+		cd 9c &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		mkdir x &&
+		echo d >x/d &&
+		echo e >x/e &&
+		mkdir w &&
+		echo f >w/f &&
+		git add z x w &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z y &&
+		git mv w/f x/ &&
+		echo g >x/g &&
+		git add x/g &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/d &&
+		git mv x/e z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9c-check: Doubly transitive rename?' '
+	(
+		cd 9c &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 >out &&
+		test_i18ngrep "WARNING: Avoiding applying x -> z rename to x/f" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:y/d HEAD:y/e HEAD:x/f HEAD:x/g &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    O:x/d    O:x/e    O:w/f    A:x/g &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 9d, N-fold transitive rename?
+#   (Related to testcase 9c...and 1c and 7e)
+#   Commit O: z/a, y/b, x/c, w/d, v/e, u/f
+#   Commit A:  y/{a,b},  w/{c,d},  u/{e,f}
+#   Commit B: z/{a,t}, x/{b,c}, v/{d,e}, u/f
+#   Expected: <see NOTE first>
+#
+#   NOTE: z/ -> y/ (in commit A)
+#         y/ -> x/ (in commit B)
+#         x/ -> w/ (in commit A)
+#         w/ -> v/ (in commit B)
+#         v/ -> u/ (in commit A)
+#         So, if we add a file to z, say z/t, where should it end up?  In u?
+#         What if there's another file or directory named 't' in one of the
+#         intervening directories and/or in u itself?  Also, shouldn't the
+#         same logic that places 't' in u/ also move ALL other files to u/?
+#         What if there are file or directory conflicts in any of them?  If
+#         we attempted to do N-way (N-fold? N-ary? N-uple?) transitive renames
+#         like this, would the user have any hope of understanding any
+#         conflicts or how their working tree ended up?  I think not, so I'm
+#         ruling out N-ary transitive renames for N>1.
+#
+#   Therefore our expected result is:
+#     z/t, y/a, x/b, w/c, u/d, u/e, u/f
+#   The reason that v/d DOES get transitively renamed to u/d is that u/ isn't
+#   renamed somewhere.  A slightly sub-optimal result, but it uses fairly
+#   simple rules that are consistent with what we need for all the other
+#   testcases and simplifies things for the user.
+
+test_expect_success '9d-setup: N-way transitive rename?' '
+	test_create_repo 9d &&
+	(
+		cd 9d &&
+
+		mkdir z y x w v u &&
+		echo a >z/a &&
+		echo b >y/b &&
+		echo c >x/c &&
+		echo d >w/d &&
+		echo e >v/e &&
+		echo f >u/f &&
+		git add z y x w v u &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/a y/ &&
+		git mv x/c w/ &&
+		git mv v/e u/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo t >z/t &&
+		git mv y/b x/ &&
+		git mv w/d v/ &&
+		git add z/t &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9d-check: N-way transitive rename?' '
+	(
+		cd 9d &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 >out &&
+		test_i18ngrep "WARNING: Avoiding applying z -> y rename to z/t" out &&
+		test_i18ngrep "WARNING: Avoiding applying y -> x rename to y/a" out &&
+		test_i18ngrep "WARNING: Avoiding applying x -> w rename to x/b" out &&
+		test_i18ngrep "WARNING: Avoiding applying w -> v rename to w/c" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -o >out &&
+		test_line_count = 1 out &&
+
+		git rev-parse >actual \
+			HEAD:z/t \
+			HEAD:y/a HEAD:x/b HEAD:w/c \
+			HEAD:u/d HEAD:u/e HEAD:u/f &&
+		git rev-parse >expect \
+			B:z/t    \
+			O:z/a    O:y/b    O:x/c    \
+			O:w/d    O:v/e    A:u/f &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 9e, N-to-1 whammo
+#   (Related to testcase 9c...and 1c and 7e)
+#   Commit O: dir1/{a,b}, dir2/{d,e}, dir3/{g,h}, dirN/{j,k}
+#   Commit A: dir1/{a,b,c,yo}, dir2/{d,e,f,yo}, dir3/{g,h,i,yo}, dirN/{j,k,l,yo}
+#   Commit B: combined/{a,b,d,e,g,h,j,k}
+#   Expected: combined/{a,b,c,d,e,f,g,h,i,j,k,l}, CONFLICT(Nto1) warnings,
+#             dir1/yo, dir2/yo, dir3/yo, dirN/yo
+
+test_expect_success '9e-setup: N-to-1 whammo' '
+	test_create_repo 9e &&
+	(
+		cd 9e &&
+
+		mkdir dir1 dir2 dir3 dirN &&
+		echo a >dir1/a &&
+		echo b >dir1/b &&
+		echo d >dir2/d &&
+		echo e >dir2/e &&
+		echo g >dir3/g &&
+		echo h >dir3/h &&
+		echo j >dirN/j &&
+		echo k >dirN/k &&
+		git add dir* &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		echo c  >dir1/c &&
+		echo yo >dir1/yo &&
+		echo f  >dir2/f &&
+		echo yo >dir2/yo &&
+		echo i  >dir3/i &&
+		echo yo >dir3/yo &&
+		echo l  >dirN/l &&
+		echo yo >dirN/yo &&
+		git add dir* &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv dir1 combined &&
+		git mv dir2/* combined/ &&
+		git mv dir3/* combined/ &&
+		git mv dirN/* combined/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure C_LOCALE_OUTPUT '9e-check: N-to-1 whammo' '
+	(
+		cd 9e &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 >out &&
+		grep "CONFLICT (implicit dir rename): Cannot map more than one path to combined/yo" out >error_line &&
+		grep -q dir1/yo error_line &&
+		grep -q dir2/yo error_line &&
+		grep -q dir3/yo error_line &&
+		grep -q dirN/yo error_line &&
+
+		git ls-files -s >out &&
+		test_line_count = 16 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 2 out &&
+
+		git rev-parse >actual \
+			:0:combined/a :0:combined/b :0:combined/c \
+			:0:combined/d :0:combined/e :0:combined/f \
+			:0:combined/g :0:combined/h :0:combined/i \
+			:0:combined/j :0:combined/k :0:combined/l &&
+		git rev-parse >expect \
+			 O:dir1/a      O:dir1/b      A:dir1/c \
+			 O:dir2/d      O:dir2/e      A:dir2/f \
+			 O:dir3/g      O:dir3/h      A:dir3/i \
+			 O:dirN/j      O:dirN/k      A:dirN/l &&
+		test_cmp expect actual &&
+
+		git rev-parse >actual \
+			:0:dir1/yo :0:dir2/yo :0:dir3/yo :0:dirN/yo &&
+		git rev-parse >expect \
+			 A:dir1/yo  A:dir2/yo  A:dir3/yo  A:dirN/yo &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 9f, Renamed directory that only contained immediate subdirs
+#   (Related to testcases 1e & 9g)
+#   Commit O: goal/{a,b}/$more_files
+#   Commit A: priority/{a,b}/$more_files
+#   Commit B: goal/{a,b}/$more_files, goal/c
+#   Expected: priority/{a,b}/$more_files, priority/c
+
+test_expect_success '9f-setup: Renamed directory that only contained immediate subdirs' '
+	test_create_repo 9f &&
+	(
+		cd 9f &&
+
+		mkdir -p goal/a &&
+		mkdir -p goal/b &&
+		echo foo >goal/a/foo &&
+		echo bar >goal/b/bar &&
+		echo baz >goal/b/baz &&
+		git add goal &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv goal/ priority &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo c >goal/c &&
+		git add goal/c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9f-check: Renamed directory that only contained immediate subdirs' '
+	(
+		cd 9f &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:priority/a/foo \
+			HEAD:priority/b/bar \
+			HEAD:priority/b/baz \
+			HEAD:priority/c &&
+		git rev-parse >expect \
+			O:goal/a/foo \
+			O:goal/b/bar \
+			O:goal/b/baz \
+			B:goal/c &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:goal/c
+	)
+'
+
+# Testcase 9g, Renamed directory that only contained immediate subdirs, immediate subdirs renamed
+#   (Related to testcases 1e & 9f)
+#   Commit O: goal/{a,b}/$more_files
+#   Commit A: priority/{alpha,bravo}/$more_files
+#   Commit B: goal/{a,b}/$more_files, goal/c
+#   Expected: priority/{alpha,bravo}/$more_files, priority/c
+
+test_expect_success '9g-setup: Renamed directory that only contained immediate subdirs, immediate subdirs renamed' '
+	test_create_repo 9g &&
+	(
+		cd 9g &&
+
+		mkdir -p goal/a &&
+		mkdir -p goal/b &&
+		echo foo >goal/a/foo &&
+		echo bar >goal/b/bar &&
+		echo baz >goal/b/baz &&
+		git add goal &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir priority &&
+		git mv goal/a/ priority/alpha &&
+		git mv goal/b/ priority/beta &&
+		rmdir goal/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo c >goal/c &&
+		git add goal/c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9g-check: Renamed directory that only contained immediate subdirs, immediate subdirs renamed' '
+	(
+		cd 9g &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:priority/alpha/foo \
+			HEAD:priority/beta/bar  \
+			HEAD:priority/beta/baz  \
+			HEAD:priority/c &&
+		git rev-parse >expect \
+			O:goal/a/foo \
+			O:goal/b/bar \
+			O:goal/b/baz \
+			B:goal/c &&
+		test_cmp expect actual &&
+		test_must_fail git rev-parse HEAD:goal/c
+	)
+'
+
+###########################################################################
+# Rules suggested by section 9:
+#
+#   If the other side of history did a directory rename to a path that your
+#   side renamed away, then ignore that particular rename from the other
+#   side of history for any implicit directory renames.
+###########################################################################
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 10/31] directory rename detection: tests for handling overwriting untracked files
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (8 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 09/31] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 11/31] directory rename detection: tests for handling overwriting dirty files Elijah Newren
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 367 ++++++++++++++++++++++++++++++++++++
 1 file changed, 367 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index b730256653..aa9af49edc 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2871,4 +2871,371 @@ test_expect_failure '9g-check: Renamed directory that only contained immediate s
 #   side of history for any implicit directory renames.
 ###########################################################################
 
+###########################################################################
+# SECTION 10: Handling untracked files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling, at least in the case of directory renames.
+###########################################################################
+
+# Testcase 10a, Overwrite untracked: normal rename/delete
+#   Commit O: z/{b,c_1}
+#   Commit A: z/b + untracked z/c + untracked z/d
+#   Commit B: z/{b,d_1}
+#   Expected: Aborted Merge +
+#       ERROR_MSG(untracked working tree files would be overwritten by merge)
+
+test_expect_success '10a-setup: Overwrite untracked with normal rename/delete' '
+	test_create_repo 10a &&
+	(
+		cd 10a &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z/c z/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '10a-check: Overwrite untracked with normal rename/delete' '
+	(
+		cd 10a &&
+
+		git checkout A^0 &&
+		echo very >z/c &&
+		echo important >z/d &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "The following untracked working tree files would be overwritten by merge" err &&
+
+		git ls-files -s >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		echo very >expect &&
+		test_cmp expect z/c &&
+
+		echo important >expect &&
+		test_cmp expect z/d &&
+
+		git rev-parse HEAD:z/b >actual &&
+		git rev-parse O:z/b >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 10b, Overwrite untracked: dir rename + delete
+#   Commit O: z/{b,c_1}
+#   Commit A: y/b + untracked y/{c,d,e}
+#   Commit B: z/{b,d_1,e}
+#   Expected: Failed Merge; y/b + untracked y/c + untracked y/d on disk +
+#             z/c_1 -> z/d_1 rename recorded at stage 3 for y/d +
+#       ERROR_MSG(refusing to lose untracked file at 'y/d')
+
+test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
+	test_create_repo 10b &&
+	(
+		cd 10b &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git rm z/c &&
+		git mv z/ y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z/c z/d &&
+		echo e >z/e &&
+		git add z/e &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+	(
+		cd 10b &&
+
+		git checkout A^0 &&
+		echo very >y/c &&
+		echo important >y/d &&
+		echo contents >y/e &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/delete).*Version B\^0 of y/d left in tree at y/d~B\^0" out &&
+		test_i18ngrep "Error: Refusing to lose untracked file at y/e; writing to y/e~B\^0 instead" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 5 out &&
+
+		git rev-parse >actual \
+			:0:y/b :3:y/d :3:y/e &&
+		git rev-parse >expect \
+			O:z/b  O:z/c  B:z/e &&
+		test_cmp expect actual &&
+
+		echo very >expect &&
+		test_cmp expect y/c &&
+
+		echo important >expect &&
+		test_cmp expect y/d &&
+
+		echo contents >expect &&
+		test_cmp expect y/e
+	)
+'
+
+# Testcase 10c, Overwrite untracked: dir rename/rename(1to2)
+#   Commit O: z/{a,b}, x/{c,d}
+#   Commit A: y/{a,b}, w/c, x/d + different untracked y/c
+#   Commit B: z/{a,b,c}, x/d
+#   Expected: Failed Merge; y/{a,b} + x/d + untracked y/c +
+#             CONFLICT(rename/rename) x/c -> w/c vs y/c +
+#             y/c~B^0 +
+#             ERROR_MSG(Refusing to lose untracked file at y/c)
+
+test_expect_success '10c-setup: Overwrite untracked with dir rename/rename(1to2)' '
+	test_create_repo 10c &&
+	(
+		cd 10c &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >z/b &&
+		echo c >x/c &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		mkdir w &&
+		git mv x/c w/c &&
+		git mv z/ y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/c z/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
+	(
+		cd 10c &&
+
+		git checkout A^0 &&
+		echo important >y/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+		test_i18ngrep "Refusing to lose untracked file at y/c; adding as y/c~B\^0 instead" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 3 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :0:x/d :1:x/c :2:w/c :3:y/c &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  O:x/d  O:x/c  O:x/c  O:x/c &&
+		test_cmp expect actual &&
+
+		git hash-object y/c~B^0 >actual &&
+		git rev-parse O:x/c >expect &&
+		test_cmp expect actual &&
+
+		echo important >expect &&
+		test_cmp expect y/c
+	)
+'
+
+# Testcase 10d, Delete untracked w/ dir rename/rename(2to1)
+#   Commit O: z/{a,b,c_1},        x/{d,e,f_2}
+#   Commit A: y/{a,b},            x/{d,e,f_2,wham_1} + untracked y/wham
+#   Commit B: z/{a,b,c_1,wham_2}, y/{d,e}
+#   Expected: Failed Merge; y/{a,b,d,e} + untracked y/{wham,wham~B^0,wham~HEAD}+
+#             CONFLICT(rename/rename) z/c_1 vs x/f_2 -> y/wham
+#             ERROR_MSG(Refusing to lose untracked file at y/wham)
+
+test_expect_success '10d-setup: Delete untracked with dir rename/rename(2to1)' '
+	test_create_repo 10d &&
+	(
+		cd 10d &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >z/b &&
+		echo c >z/c &&
+		echo d >x/d &&
+		echo e >x/e &&
+		echo f >x/f &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/c x/wham &&
+		git mv z/ y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/f z/wham &&
+		git mv x/ y/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '10d-check: Delete untracked with dir rename/rename(2to1)' '
+	(
+		cd 10d &&
+
+		git checkout A^0 &&
+		echo important >y/wham &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+		test_i18ngrep "Refusing to lose untracked file at y/wham" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :0:y/d :0:y/e :2:y/wham :3:y/wham &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  O:x/d  O:x/e  O:z/c     O:x/f &&
+		test_cmp expect actual &&
+
+		test_must_fail git rev-parse :1:y/wham &&
+
+		echo important >expect &&
+		test_cmp expect y/wham &&
+
+		git hash-object >actual \
+			y/wham~B^0 y/wham~HEAD &&
+		git rev-parse >expect \
+			O:x/f      O:z/c &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 10e, Does git complain about untracked file that's not in the way?
+#   Commit O: z/{a,b}
+#   Commit A: y/{a,b} + untracked z/c
+#   Commit B: z/{a,b,c}
+#   Expected: y/{a,b,c} + untracked z/c
+
+test_expect_success '10e-setup: Does git complain about untracked file that is not really in the way?' '
+	test_create_repo 10e &&
+	(
+		cd 10e &&
+
+		mkdir z &&
+		echo a >z/a &&
+		echo b >z/b &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/ y/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo c >z/c &&
+		git add z/c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '10e-check: Does git complain about untracked file that is not really in the way?' '
+	(
+		cd 10e &&
+
+		git checkout A^0 &&
+		mkdir z &&
+		echo random >z/c &&
+
+		git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep ! "following untracked working tree files would be overwritten by merge" err &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :0:y/c &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  B:z/c &&
+		test_cmp expect actual &&
+
+		echo random >expect &&
+		test_cmp expect z/c
+	)
+'
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 11/31] directory rename detection: tests for handling overwriting dirty files
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (9 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 10/31] directory rename detection: tests for handling overwriting untracked files Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 12/31] merge-recursive: move the get_renames() function Elijah Newren
                   ` (20 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 458 ++++++++++++++++++++++++++++++++++++
 1 file changed, 458 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index aa9af49edc..fbac664408 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3238,4 +3238,462 @@ test_expect_failure '10e-check: Does git complain about untracked file that is n
 	)
 '
 
+###########################################################################
+# SECTION 11: Handling dirty (not up-to-date) files
+#
+# unpack_trees(), upon which the recursive merge algorithm is based, aborts
+# the operation if untracked or dirty files would be deleted or overwritten
+# by the merge.  Unfortunately, unpack_trees() does not understand renames,
+# and if it doesn't abort, then it muddies up the working directory before
+# we even get to the point of detecting renames, so we need some special
+# handling.  This was true even of normal renames, but there are additional
+# codepaths that need special handling with directory renames.  Add
+# testcases for both renamed-by-directory-rename-detection and standard
+# rename cases.
+###########################################################################
+
+# Testcase 11a, Avoid losing dirty contents with simple rename
+#   Commit O: z/{a,b_v1},
+#   Commit A: z/{a,c_v1}, and z/c_v1 has uncommitted mods
+#   Commit B: z/{a,b_v2}
+#   Expected: ERROR_MSG(Refusing to lose dirty file at z/c) +
+#             z/a, staged version of z/c has sha1sum matching B:z/b_v2,
+#             z/c~HEAD with contents of B:z/b_v2,
+#             z/c with uncommitted mods on top of A:z/c_v1
+
+test_expect_success '11a-setup: Avoid losing dirty contents with simple rename' '
+	test_create_repo 11a &&
+	(
+		cd 11a &&
+
+		mkdir z &&
+		echo a >z/a &&
+		test_seq 1 10 >z/b &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/b z/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo 11 >>z/b &&
+		git add z/b &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11a-check: Avoid losing dirty contents with simple rename' '
+	(
+		cd 11a &&
+
+		git checkout A^0 &&
+		echo stuff >>z/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+		test_seq 1 10 >expected &&
+		echo stuff >>expected &&
+		test_cmp expected z/c &&
+
+		git ls-files -s >out &&
+		test_line_count = 2 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			:0:z/a :2:z/c &&
+		git rev-parse >expect \
+			 O:z/a  B:z/b &&
+		test_cmp expect actual &&
+
+		git hash-object z/c~HEAD >actual &&
+		git rev-parse B:z/b >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 11b, Avoid losing dirty file involved in directory rename
+#   Commit O: z/a,         x/{b,c_v1}
+#   Commit A: z/{a,c_v1},  x/b,       and z/c_v1 has uncommitted mods
+#   Commit B: y/a,         x/{b,c_v2}
+#   Expected: y/{a,c_v2}, x/b, z/c_v1 with uncommitted mods untracked,
+#             ERROR_MSG(Refusing to lose dirty file at z/c)
+
+
+test_expect_success '11b-setup: Avoid losing dirty file involved in directory rename' '
+	test_create_repo 11b &&
+	(
+		cd 11b &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >x/b &&
+		test_seq 1 10 >x/c &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv x/c z/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		echo 11 >>x/c &&
+		git add x/c &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11b-check: Avoid losing dirty file involved in directory rename' '
+	(
+		cd 11b &&
+
+		git checkout A^0 &&
+		echo stuff >>z/c &&
+
+		git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+		grep -q stuff z/c &&
+		test_seq 1 10 >expected &&
+		echo stuff >>expected &&
+		test_cmp expected z/c &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -m >out &&
+		test_line_count = 0 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			:0:x/b :0:y/a :0:y/c &&
+		git rev-parse >expect \
+			 O:x/b  O:z/a  B:x/c &&
+		test_cmp expect actual &&
+
+		git hash-object y/c >actual &&
+		git rev-parse B:x/c >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 11c, Avoid losing not-up-to-date with rename + D/F conflict
+#   Commit O: y/a,         x/{b,c_v1}
+#   Commit A: y/{a,c_v1},  x/b,       and y/c_v1 has uncommitted mods
+#   Commit B: y/{a,c/d},   x/{b,c_v2}
+#   Expected: Abort_msg("following files would be overwritten by merge") +
+#             y/c left untouched (still has uncommitted mods)
+
+test_expect_success '11c-setup: Avoid losing not-uptodate with rename + D/F conflict' '
+	test_create_repo 11c &&
+	(
+		cd 11c &&
+
+		mkdir y x &&
+		echo a >y/a &&
+		echo b >x/b &&
+		test_seq 1 10 >x/c &&
+		git add y x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv x/c y/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y/c &&
+		echo d >y/c/d &&
+		echo 11 >>x/c &&
+		git add x/c y/c/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '11c-check: Avoid losing not-uptodate with rename + D/F conflict' '
+	(
+		cd 11c &&
+
+		git checkout A^0 &&
+		echo stuff >>y/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "following files would be overwritten by merge" err &&
+
+		grep -q stuff y/c &&
+		test_seq 1 10 >expected &&
+		echo stuff >>expected &&
+		test_cmp expected y/c &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+		git ls-files -u >out &&
+		test_line_count = 0 out &&
+		git ls-files -m >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 3 out
+	)
+'
+
+# Testcase 11d, Avoid losing not-up-to-date with rename + D/F conflict
+#   Commit O: z/a,         x/{b,c_v1}
+#   Commit A: z/{a,c_v1},  x/b,       and z/c_v1 has uncommitted mods
+#   Commit B: y/{a,c/d},   x/{b,c_v2}
+#   Expected: D/F: y/c_v2 vs y/c/d) +
+#             Warning_Msg("Refusing to lose dirty file at z/c) +
+#             y/{a,c~HEAD,c/d}, x/b, now-untracked z/c_v1 with uncommitted mods
+
+test_expect_success '11d-setup: Avoid losing not-uptodate with rename + D/F conflict' '
+	test_create_repo 11d &&
+	(
+		cd 11d &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >x/b &&
+		test_seq 1 10 >x/c &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv x/c z/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv z y &&
+		mkdir y/c &&
+		echo d >y/c/d &&
+		echo 11 >>x/c &&
+		git add x/c y/c/d &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
+	(
+		cd 11d &&
+
+		git checkout A^0 &&
+		echo stuff >>z/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "Refusing to lose dirty file at z/c" out &&
+
+		grep -q stuff z/c &&
+		test_seq 1 10 >expected &&
+		echo stuff >>expected &&
+		test_cmp expected z/c
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 1 out &&
+		git ls-files -o >out &&
+		test_line_count = 5 out &&
+
+		git rev-parse >actual \
+			:0:x/b :0:y/a :0:y/c/d :3:y/c &&
+		git rev-parse >expect \
+			 O:x/b  O:z/a  B:y/c/d  B:x/c &&
+		test_cmp expect actual &&
+
+		git hash-object y/c~HEAD >actual &&
+		git rev-parse B:x/c >expect &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 11e, Avoid deleting not-up-to-date with dir rename/rename(1to2)/add
+#   Commit O: z/{a,b},      x/{c_1,d}
+#   Commit A: y/{a,b,c_2},  x/d, w/c_1, and y/c_2 has uncommitted mods
+#   Commit B: z/{a,b,c_1},  x/d
+#   Expected: Failed Merge; y/{a,b} + x/d +
+#             CONFLICT(rename/rename) x/c_1 -> w/c_1 vs y/c_1 +
+#             ERROR_MSG(Refusing to lose dirty file at y/c)
+#             y/c~B^0 has O:x/c_1 contents
+#             y/c~HEAD has A:y/c_2 contents
+#             y/c has dirty file from before merge
+
+test_expect_success '11e-setup: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+	test_create_repo 11e &&
+	(
+		cd 11e &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >z/b &&
+		echo c >x/c &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/ y/ &&
+		echo different >y/c &&
+		mkdir w &&
+		git mv x/c w/ &&
+		git add y/c &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/c z/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+	(
+		cd 11e &&
+
+		git checkout A^0 &&
+		echo mods >>y/c &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+		test_i18ngrep "Refusing to lose dirty file at y/c" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 7 out &&
+		git ls-files -u >out &&
+		test_line_count = 4 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		echo different >expected &&
+		echo mods >>expected &&
+		test_cmp expected y/c &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :0:x/d :1:x/c :2:w/c :2:y/c :3:y/c &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  O:x/d  O:x/c  O:x/c  A:y/c  O:x/c &&
+		test_cmp expect actual &&
+
+		git hash-object >actual \
+			y/c~B^0 y/c~HEAD &&
+		git rev-parse >expect \
+			O:x/c   A:y/c &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 11f, Avoid deleting not-up-to-date w/ dir rename/rename(2to1)
+#   Commit O: z/{a,b},        x/{c_1,d_2}
+#   Commit A: y/{a,b,wham_1}, x/d_2, except y/wham has uncommitted mods
+#   Commit B: z/{a,b,wham_2}, x/c_1
+#   Expected: Failed Merge; y/{a,b} + untracked y/{wham~B^0,wham~B^HEAD} +
+#             y/wham with dirty changes from before merge +
+#             CONFLICT(rename/rename) x/c vs x/d -> y/wham
+#             ERROR_MSG(Refusing to lose dirty file at y/wham)
+
+test_expect_success '11f-setup: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+	test_create_repo 11f &&
+	(
+		cd 11f &&
+
+		mkdir z x &&
+		echo a >z/a &&
+		echo b >z/b &&
+		test_seq 1 10 >x/c &&
+		echo d >x/d &&
+		git add z x &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv z/ y/ &&
+		git mv x/c y/wham &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv x/d z/wham &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+	(
+		cd 11f &&
+
+		git checkout A^0 &&
+		echo important >>y/wham &&
+
+		test_must_fail git merge -s recursive B^0 >out 2>err &&
+		test_i18ngrep "CONFLICT (rename/rename)" out &&
+		test_i18ngrep "Refusing to lose dirty file at y/wham" out &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+		git ls-files -u >out &&
+		test_line_count = 2 out &&
+		git ls-files -o >out &&
+		test_line_count = 4 out &&
+
+		test_seq 1 10 >expected &&
+		echo important >>expected &&
+		test_cmp expected y/wham &&
+
+		test_must_fail git rev-parse :1:y/wham &&
+		git hash-object >actual \
+			y/wham~B^0 y/wham~HEAD &&
+		git rev-parse >expect \
+			O:x/d      O:x/c &&
+		test_cmp expect actual &&
+
+		git rev-parse >actual \
+			:0:y/a :0:y/b :2:y/wham :3:y/wham &&
+		git rev-parse >expect \
+			 O:z/a  O:z/b  O:x/c     O:x/d &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 12/31] merge-recursive: move the get_renames() function
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (10 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 11/31] directory rename detection: tests for handling overwriting dirty files Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-02 23:27   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 13/31] merge-recursive: introduce new functions to handle rename logic Elijah Newren
                   ` (19 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

I want to re-use some other functions in the file without moving those
other functions or dealing with a handful of annoying split function
declarations and definitions.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 139 +++++++++++++++++++++++++++---------------------------
 1 file changed, 70 insertions(+), 69 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 9d53f30111..2028dd113b 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -537,75 +537,6 @@ struct rename {
 	unsigned processed:1;
 };
 
-/*
- * Get information of all renames which occurred between 'o_tree' and
- * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
- * 'b_tree') to be able to associate the correct cache entries with
- * the rename information. 'tree' is always equal to either a_tree or b_tree.
- */
-static struct string_list *get_renames(struct merge_options *o,
-				       struct tree *tree,
-				       struct tree *o_tree,
-				       struct tree *a_tree,
-				       struct tree *b_tree,
-				       struct string_list *entries)
-{
-	int i;
-	struct string_list *renames;
-	struct diff_options opts;
-
-	renames = xcalloc(1, sizeof(struct string_list));
-	if (!o->detect_rename)
-		return renames;
-
-	diff_setup(&opts);
-	opts.flags.recursive = 1;
-	opts.flags.rename_empty = 0;
-	opts.detect_rename = DIFF_DETECT_RENAME;
-	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
-			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
-			    1000;
-	opts.rename_score = o->rename_score;
-	opts.show_rename_progress = o->show_rename_progress;
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_setup_done(&opts);
-	diff_tree_oid(&o_tree->object.oid, &tree->object.oid, "", &opts);
-	diffcore_std(&opts);
-	if (opts.needed_rename_limit > o->needed_rename_limit)
-		o->needed_rename_limit = opts.needed_rename_limit;
-	for (i = 0; i < diff_queued_diff.nr; ++i) {
-		struct string_list_item *item;
-		struct rename *re;
-		struct diff_filepair *pair = diff_queued_diff.queue[i];
-		if (pair->status != 'R') {
-			diff_free_filepair(pair);
-			continue;
-		}
-		re = xmalloc(sizeof(*re));
-		re->processed = 0;
-		re->pair = pair;
-		item = string_list_lookup(entries, re->pair->one->path);
-		if (!item)
-			re->src_entry = insert_stage_data(re->pair->one->path,
-					o_tree, a_tree, b_tree, entries);
-		else
-			re->src_entry = item->util;
-
-		item = string_list_lookup(entries, re->pair->two->path);
-		if (!item)
-			re->dst_entry = insert_stage_data(re->pair->two->path,
-					o_tree, a_tree, b_tree, entries);
-		else
-			re->dst_entry = item->util;
-		item = string_list_insert(renames, pair->one->path);
-		item->util = re;
-	}
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_queued_diff.nr = 0;
-	diff_flush(&opts);
-	return renames;
-}
-
 static int update_stages(struct merge_options *opt, const char *path,
 			 const struct diff_filespec *o,
 			 const struct diff_filespec *a,
@@ -1389,6 +1320,76 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 	return ret;
 }
 
+/*
+ * Get information of all renames which occurred between 'o_tree' and
+ * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
+ * 'b_tree') to be able to associate the correct cache entries with
+ * the rename information. 'tree' is always equal to either a_tree or b_tree.
+ */
+static struct string_list *get_renames(struct merge_options *o,
+				       struct tree *tree,
+				       struct tree *o_tree,
+				       struct tree *a_tree,
+				       struct tree *b_tree,
+				       struct string_list *entries)
+{
+	int i;
+	struct string_list *renames;
+	struct diff_options opts;
+
+	renames = xcalloc(1, sizeof(struct string_list));
+	if (!o->detect_rename)
+		return renames;
+
+	diff_setup(&opts);
+	opts.flags.recursive = 1;
+	opts.flags.rename_empty = 0;
+	opts.detect_rename = DIFF_DETECT_RENAME;
+	opts.rename_limit = o->merge_rename_limit >= 0 ? o->merge_rename_limit :
+			    o->diff_rename_limit >= 0 ? o->diff_rename_limit :
+			    1000;
+	opts.rename_score = o->rename_score;
+	opts.show_rename_progress = o->show_rename_progress;
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_setup_done(&opts);
+	diff_tree_oid(&o_tree->object.oid, &tree->object.oid, "", &opts);
+	diffcore_std(&opts);
+	if (opts.needed_rename_limit > o->needed_rename_limit)
+		o->needed_rename_limit = opts.needed_rename_limit;
+	for (i = 0; i < diff_queued_diff.nr; ++i) {
+		struct string_list_item *item;
+		struct rename *re;
+		struct diff_filepair *pair = diff_queued_diff.queue[i];
+
+		if (pair->status != 'R') {
+			diff_free_filepair(pair);
+			continue;
+		}
+		re = xmalloc(sizeof(*re));
+		re->processed = 0;
+		re->pair = pair;
+		item = string_list_lookup(entries, re->pair->one->path);
+		if (!item)
+			re->src_entry = insert_stage_data(re->pair->one->path,
+					o_tree, a_tree, b_tree, entries);
+		else
+			re->src_entry = item->util;
+
+		item = string_list_lookup(entries, re->pair->two->path);
+		if (!item)
+			re->dst_entry = insert_stage_data(re->pair->two->path,
+					o_tree, a_tree, b_tree, entries);
+		else
+			re->dst_entry = item->util;
+		item = string_list_insert(renames, pair->one->path);
+		item->util = re;
+	}
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_queued_diff.nr = 0;
+	diff_flush(&opts);
+	return renames;
+}
+
 static int process_renames(struct merge_options *o,
 			   struct string_list *a_renames,
 			   struct string_list *b_renames)
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 13/31] merge-recursive: introduce new functions to handle rename logic
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (11 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 12/31] merge-recursive: move the get_renames() function Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-02 23:36   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 14/31] merge-recursive: fix leaks of allocated renames and diff_filepairs Elijah Newren
                   ` (18 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

The amount of logic in merge_trees() relative to renames was just a few
lines, but split it out into new handle_renames() and cleanup_renames()
functions to prepare for additional logic to be added to each.  No code or
logic changes, just a new place to put stuff for when the rename detection
gains additional checks.

Note that process_renames() records pointers to various information (such
as diff_filepairs) into rename_conflict_info structs.  Even though the
rename string_lists are not directly used once handle_renames() completes,
we should not immediately free the lists at the end of that function
because they store the information referenced in the rename_conflict_info,
which is used later in process_entry().  Thus the reason for a separate
cleanup_renames().

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 2028dd113b..eac3041261 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1645,6 +1645,32 @@ static int process_renames(struct merge_options *o,
 	return clean_merge;
 }
 
+struct rename_info {
+	struct string_list *head_renames;
+	struct string_list *merge_renames;
+};
+
+static int handle_renames(struct merge_options *o,
+			  struct tree *common,
+			  struct tree *head,
+			  struct tree *merge,
+			  struct string_list *entries,
+			  struct rename_info *ri)
+{
+	ri->head_renames  = get_renames(o, head, common, head, merge, entries);
+	ri->merge_renames = get_renames(o, merge, common, head, merge, entries);
+	return process_renames(o, ri->head_renames, ri->merge_renames);
+}
+
+static void cleanup_renames(struct rename_info *re_info)
+{
+	string_list_clear(re_info->head_renames, 0);
+	string_list_clear(re_info->merge_renames, 0);
+
+	free(re_info->head_renames);
+	free(re_info->merge_renames);
+}
+
 static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
 {
 	return (is_null_oid(oid) || mode == 0) ? NULL: (struct object_id *)oid;
@@ -2004,7 +2030,8 @@ int merge_trees(struct merge_options *o,
 	}
 
 	if (unmerged_cache()) {
-		struct string_list *entries, *re_head, *re_merge;
+		struct string_list *entries;
+		struct rename_info re_info;
 		int i;
 		/*
 		 * Only need the hashmap while processing entries, so
@@ -2018,9 +2045,8 @@ int merge_trees(struct merge_options *o,
 		get_files_dirs(o, merge);
 
 		entries = get_unmerged();
-		re_head  = get_renames(o, head, common, head, merge, entries);
-		re_merge = get_renames(o, merge, common, head, merge, entries);
-		clean = process_renames(o, re_head, re_merge);
+		clean = handle_renames(o, common, head, merge, entries,
+				       &re_info);
 		record_df_conflict_files(o, entries);
 		if (clean < 0)
 			goto cleanup;
@@ -2045,16 +2071,13 @@ int merge_trees(struct merge_options *o,
 		}
 
 cleanup:
-		string_list_clear(re_merge, 0);
-		string_list_clear(re_head, 0);
+		cleanup_renames(&re_info);
+
 		string_list_clear(entries, 1);
+		free(entries);
 
 		hashmap_free(&o->current_file_dir_set, 1);
 
-		free(re_merge);
-		free(re_head);
-		free(entries);
-
 		if (clean < 0)
 			return clean;
 	}
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 14/31] merge-recursive: fix leaks of allocated renames and diff_filepairs
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (12 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 13/31] merge-recursive: introduce new functions to handle rename logic Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-02 23:41   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 15/31] merge-recursive: make !o->detect_rename codepath more obvious Elijah Newren
                   ` (17 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

get_renames() has always zero'ed out diff_queued_diff.nr while only
manually free'ing diff_filepairs that did not correspond to renames.
Further, it allocated struct renames that were tucked away in the
return string_list.  Make sure all of these are deallocated when we
are done with them.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index eac3041261..1986af79a9 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1662,13 +1662,23 @@ static int handle_renames(struct merge_options *o,
 	return process_renames(o, ri->head_renames, ri->merge_renames);
 }
 
-static void cleanup_renames(struct rename_info *re_info)
+static void cleanup_rename(struct string_list *rename)
 {
-	string_list_clear(re_info->head_renames, 0);
-	string_list_clear(re_info->merge_renames, 0);
+	const struct rename *re;
+	int i;
 
-	free(re_info->head_renames);
-	free(re_info->merge_renames);
+	for (i = 0; i < rename->nr; i++) {
+		re = rename->items[i].util;
+		diff_free_filepair(re->pair);
+	}
+	string_list_clear(rename, 1);
+	free(rename);
+}
+
+static void cleanup_renames(struct rename_info *re_info)
+{
+	cleanup_rename(re_info->head_renames);
+	cleanup_rename(re_info->merge_renames);
 }
 
 static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 15/31] merge-recursive: make !o->detect_rename codepath more obvious
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (13 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 14/31] merge-recursive: fix leaks of allocated renames and diff_filepairs Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-02 23:48   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 16/31] merge-recursive: split out code for determining diff_filepairs Elijah Newren
                   ` (16 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Previously, if !o->detect_rename then get_renames() would return an
empty string_list, and then process_renames() would have nothing to
iterate over.  It seems more straightforward to simply avoid calling
either function in that case.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 1986af79a9..4e6d0c248e 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1338,8 +1338,6 @@ static struct string_list *get_renames(struct merge_options *o,
 	struct diff_options opts;
 
 	renames = xcalloc(1, sizeof(struct string_list));
-	if (!o->detect_rename)
-		return renames;
 
 	diff_setup(&opts);
 	opts.flags.recursive = 1;
@@ -1657,6 +1655,12 @@ static int handle_renames(struct merge_options *o,
 			  struct string_list *entries,
 			  struct rename_info *ri)
 {
+	ri->head_renames = NULL;
+	ri->merge_renames = NULL;
+
+	if (!o->detect_rename)
+		return 1;
+
 	ri->head_renames  = get_renames(o, head, common, head, merge, entries);
 	ri->merge_renames = get_renames(o, merge, common, head, merge, entries);
 	return process_renames(o, ri->head_renames, ri->merge_renames);
@@ -1667,6 +1671,9 @@ static void cleanup_rename(struct string_list *rename)
 	const struct rename *re;
 	int i;
 
+	if (rename == NULL)
+		return;
+
 	for (i = 0; i < rename->nr; i++) {
 		re = rename->items[i].util;
 		diff_free_filepair(re->pair);
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 16/31] merge-recursive: split out code for determining diff_filepairs
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (14 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 15/31] merge-recursive: make !o->detect_rename codepath more obvious Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-03  0:06   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames Elijah Newren
                   ` (15 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Create a new function, get_diffpairs() to compute the diff_filepairs
between two trees.  While these are currently only used in
get_renames(), I want them to be available to some new functions.  No
actual logic changes yet.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 86 +++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 64 insertions(+), 22 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 4e6d0c248e..8ac69e1cbb 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1321,24 +1321,15 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 }
 
 /*
- * Get information of all renames which occurred between 'o_tree' and
- * 'tree'. We need the three trees in the merge ('o_tree', 'a_tree' and
- * 'b_tree') to be able to associate the correct cache entries with
- * the rename information. 'tree' is always equal to either a_tree or b_tree.
+ * Get the diff_filepairs changed between o_tree and tree.
  */
-static struct string_list *get_renames(struct merge_options *o,
-				       struct tree *tree,
-				       struct tree *o_tree,
-				       struct tree *a_tree,
-				       struct tree *b_tree,
-				       struct string_list *entries)
+static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
+					       struct tree *o_tree,
+					       struct tree *tree)
 {
-	int i;
-	struct string_list *renames;
+	struct diff_queue_struct *ret;
 	struct diff_options opts;
 
-	renames = xcalloc(1, sizeof(struct string_list));
-
 	diff_setup(&opts);
 	opts.flags.recursive = 1;
 	opts.flags.rename_empty = 0;
@@ -1354,10 +1345,43 @@ static struct string_list *get_renames(struct merge_options *o,
 	diffcore_std(&opts);
 	if (opts.needed_rename_limit > o->needed_rename_limit)
 		o->needed_rename_limit = opts.needed_rename_limit;
-	for (i = 0; i < diff_queued_diff.nr; ++i) {
+
+	ret = malloc(sizeof(struct diff_queue_struct));
+	ret->queue = diff_queued_diff.queue;
+	ret->nr = diff_queued_diff.nr;
+	/* Ignore diff_queued_diff.alloc; we won't be changing size at all */
+
+	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
+	diff_queued_diff.nr = 0;
+	diff_queued_diff.queue = NULL;
+	diff_flush(&opts);
+	return ret;
+}
+
+/*
+ * Get information of all renames which occurred in 'pairs', making use of
+ * any implicit directory renames inferred from the other side of history.
+ * We need the three trees in the merge ('o_tree', 'a_tree' and 'b_tree')
+ * to be able to associate the correct cache entries with the rename
+ * information; tree is always equal to either a_tree or b_tree.
+ */
+static struct string_list *get_renames(struct merge_options *o,
+				       struct diff_queue_struct *pairs,
+				       struct tree *tree,
+				       struct tree *o_tree,
+				       struct tree *a_tree,
+				       struct tree *b_tree,
+				       struct string_list *entries)
+{
+	int i;
+	struct string_list *renames;
+
+	renames = xcalloc(1, sizeof(struct string_list));
+
+	for (i = 0; i < pairs->nr; ++i) {
 		struct string_list_item *item;
 		struct rename *re;
-		struct diff_filepair *pair = diff_queued_diff.queue[i];
+		struct diff_filepair *pair = pairs->queue[i];
 
 		if (pair->status != 'R') {
 			diff_free_filepair(pair);
@@ -1382,9 +1406,6 @@ static struct string_list *get_renames(struct merge_options *o,
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
 	}
-	opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_queued_diff.nr = 0;
-	diff_flush(&opts);
 	return renames;
 }
 
@@ -1655,15 +1676,36 @@ static int handle_renames(struct merge_options *o,
 			  struct string_list *entries,
 			  struct rename_info *ri)
 {
+	struct diff_queue_struct *head_pairs, *merge_pairs;
+	int clean;
+
 	ri->head_renames = NULL;
 	ri->merge_renames = NULL;
 
 	if (!o->detect_rename)
 		return 1;
 
-	ri->head_renames  = get_renames(o, head, common, head, merge, entries);
-	ri->merge_renames = get_renames(o, merge, common, head, merge, entries);
-	return process_renames(o, ri->head_renames, ri->merge_renames);
+	head_pairs = get_diffpairs(o, common, head);
+	merge_pairs = get_diffpairs(o, common, merge);
+
+	ri->head_renames  = get_renames(o, head_pairs, head,
+					 common, head, merge, entries);
+	ri->merge_renames = get_renames(o, merge_pairs, merge,
+					 common, head, merge, entries);
+	clean = process_renames(o, ri->head_renames, ri->merge_renames);
+
+	/*
+	 * Some cleanup is deferred until cleanup_renames() because the
+	 * data structures are still needed and referenced in
+	 * process_entry().  But there are a few things we can free now.
+	 */
+
+	free(head_pairs->queue);
+	free(head_pairs);
+	free(merge_pairs->queue);
+	free(merge_pairs);
+
+	return clean;
 }
 
 static void cleanup_rename(struct string_list *rename)
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (15 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 16/31] merge-recursive: split out code for determining diff_filepairs Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-03  0:26   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 18/31] merge-recursive: make a helper function for cleanup for handle_renames Elijah Newren
                   ` (14 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

This just adds dir_rename_entry and the associated functions; code using
these will be added in subsequent commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 35 +++++++++++++++++++++++++++++++++++
 merge-recursive.h |  8 ++++++++
 2 files changed, 43 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 8ac69e1cbb..3b6d0e3f70 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -49,6 +49,41 @@ static unsigned int path_hash(const char *path)
 	return ignore_case ? strihash(path) : strhash(path);
 }
 
+static struct dir_rename_entry *dir_rename_find_entry(struct hashmap *hashmap,
+						      char *dir)
+{
+	struct dir_rename_entry key;
+
+	if (dir == NULL)
+		return NULL;
+	hashmap_entry_init(&key, strhash(dir));
+	key.dir = dir;
+	return hashmap_get(hashmap, &key, NULL);
+}
+
+static int dir_rename_cmp(void *unused_cmp_data,
+			  const struct dir_rename_entry *e1,
+			  const struct dir_rename_entry *e2,
+			  const void *unused_keydata)
+{
+	return strcmp(e1->dir, e2->dir);
+}
+
+static void dir_rename_init(struct hashmap *map)
+{
+	hashmap_init(map, (hashmap_cmp_fn) dir_rename_cmp, NULL, 0);
+}
+
+static void dir_rename_entry_init(struct dir_rename_entry *entry,
+				  char *directory)
+{
+	hashmap_entry_init(entry, strhash(directory));
+	entry->dir = directory;
+	entry->non_unique_new_dir = 0;
+	strbuf_init(&entry->new_dir, 0);
+	string_list_init(&entry->possible_new_dirs, 0);
+}
+
 static void flush_output(struct merge_options *o)
 {
 	if (o->buffer_output < 2 && o->obuf.len) {
diff --git a/merge-recursive.h b/merge-recursive.h
index 80d69d1401..d7f4cc80c1 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -29,6 +29,14 @@ struct merge_options {
 	struct string_list df_conflict_file_set;
 };
 
+struct dir_rename_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	char *dir;
+	unsigned non_unique_new_dir:1;
+	struct strbuf new_dir;
+	struct string_list possible_new_dirs;
+};
+
 /* merge_trees() but with recursive ancestor consolidation */
 int merge_recursive(struct merge_options *o,
 		    struct commit *h1,
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 18/31] merge-recursive: make a helper function for cleanup for handle_renames
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (16 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-03  0:31   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 19/31] merge-recursive: add get_directory_renames() Elijah Newren
                   ` (13 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

In anticipation of more involved cleanup to come, make a helper function
for doing the cleanup at the end of handle_renames.  Rename the already
existing cleanup_rename[s]() to final_cleanup_rename[s](), name the new
helper initial_cleanup_rename(), and leave the big comment in the code
about why we can't do all the cleanup at once.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 3b6d0e3f70..40ed8e1f39 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1704,6 +1704,12 @@ struct rename_info {
 	struct string_list *merge_renames;
 };
 
+static void initial_cleanup_rename(struct diff_queue_struct *pairs)
+{
+	free(pairs->queue);
+	free(pairs);
+}
+
 static int handle_renames(struct merge_options *o,
 			  struct tree *common,
 			  struct tree *head,
@@ -1734,16 +1740,13 @@ static int handle_renames(struct merge_options *o,
 	 * data structures are still needed and referenced in
 	 * process_entry().  But there are a few things we can free now.
 	 */
-
-	free(head_pairs->queue);
-	free(head_pairs);
-	free(merge_pairs->queue);
-	free(merge_pairs);
+	initial_cleanup_rename(head_pairs);
+	initial_cleanup_rename(merge_pairs);
 
 	return clean;
 }
 
-static void cleanup_rename(struct string_list *rename)
+static void final_cleanup_rename(struct string_list *rename)
 {
 	const struct rename *re;
 	int i;
@@ -1759,10 +1762,10 @@ static void cleanup_rename(struct string_list *rename)
 	free(rename);
 }
 
-static void cleanup_renames(struct rename_info *re_info)
+static void final_cleanup_renames(struct rename_info *re_info)
 {
-	cleanup_rename(re_info->head_renames);
-	cleanup_rename(re_info->merge_renames);
+	final_cleanup_rename(re_info->head_renames);
+	final_cleanup_rename(re_info->merge_renames);
 }
 
 static struct object_id *stage_oid(const struct object_id *oid, unsigned mode)
@@ -2165,7 +2168,7 @@ int merge_trees(struct merge_options *o,
 		}
 
 cleanup:
-		cleanup_renames(&re_info);
+		final_cleanup_renames(&re_info);
 
 		string_list_clear(entries, 1);
 		free(entries);
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 19/31] merge-recursive: add get_directory_renames()
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (17 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 18/31] merge-recursive: make a helper function for cleanup for handle_renames Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-03  1:02   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 20/31] merge-recursive: check for directory level conflicts Elijah Newren
                   ` (12 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

This populates a list of directory renames for us.  The list of
directory renames is not yet used, but will be in subsequent commits.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 155 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 152 insertions(+), 3 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 40ed8e1f39..c75d3a5139 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1393,6 +1393,138 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
 	return ret;
 }
 
+static void get_renamed_dir_portion(const char *old_path, const char *new_path,
+				    char **old_dir, char **new_dir)
+{
+	char *end_of_old, *end_of_new;
+	int old_len, new_len;
+
+	*old_dir = NULL;
+	*new_dir = NULL;
+
+	/* For
+	 *    "a/b/c/d/foo.c" -> "a/b/something-else/d/foo.c"
+	 * the "d/foo.c" part is the same, we just want to know that
+	 *    "a/b/c" was renamed to "a/b/something-else"
+	 * so, for this example, this function returns "a/b/c" in
+	 * *old_dir and "a/b/something-else" in *new_dir.
+	 *
+	 * Also, if the basename of the file changed, we don't care.  We
+	 * want to know which portion of the directory, if any, changed.
+	 */
+	end_of_old = strrchr(old_path, '/');
+	end_of_new = strrchr(new_path, '/');
+
+	if (end_of_old == NULL || end_of_new == NULL)
+		return;
+	while (*--end_of_new == *--end_of_old &&
+	       end_of_old != old_path &&
+	       end_of_new != new_path)
+		; /* Do nothing; all in the while loop */
+	/*
+	 * We've found the first non-matching character in the directory
+	 * paths.  That means the current directory we were comparing
+	 * represents the rename.  Move end_of_old and end_of_new back
+	 * to the full directory name.
+	 */
+	if (*end_of_old == '/')
+		end_of_old++;
+	if (*end_of_old != '/')
+		end_of_new++;
+	end_of_old = strchr(end_of_old, '/');
+	end_of_new = strchr(end_of_new, '/');
+
+	/*
+	 * It may have been the case that old_path and new_path were the same
+	 * directory all along.  Don't claim a rename if they're the same.
+	 */
+	old_len = end_of_old - old_path;
+	new_len = end_of_new - new_path;
+
+	if (old_len != new_len || strncmp(old_path, new_path, old_len)) {
+		*old_dir = xstrndup(old_path, old_len);
+		*new_dir = xstrndup(new_path, new_len);
+	}
+}
+
+static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
+					     struct tree *tree)
+{
+	struct hashmap *dir_renames;
+	struct hashmap_iter iter;
+	struct dir_rename_entry *entry;
+	int i;
+
+	dir_renames = malloc(sizeof(struct hashmap));
+	dir_rename_init(dir_renames);
+	for (i = 0; i < pairs->nr; ++i) {
+		struct string_list_item *item;
+		int *count;
+		struct diff_filepair *pair = pairs->queue[i];
+		char *old_dir, *new_dir;
+
+		/* File not part of directory rename if it wasn't renamed */
+		if (pair->status != 'R')
+			continue;
+
+		get_renamed_dir_portion(pair->one->path, pair->two->path,
+					&old_dir,        &new_dir);
+		if (!old_dir)
+			/* Directory didn't change at all; ignore this one. */
+			continue;
+
+		entry = dir_rename_find_entry(dir_renames, old_dir);
+		if (!entry) {
+			entry = xmalloc(sizeof(struct dir_rename_entry));
+			dir_rename_entry_init(entry, old_dir);
+			hashmap_put(dir_renames, entry);
+		} else {
+			free(old_dir);
+		}
+		item = string_list_lookup(&entry->possible_new_dirs, new_dir);
+		if (!item) {
+			item = string_list_insert(&entry->possible_new_dirs,
+						  new_dir);
+			item->util = xcalloc(1, sizeof(int));
+		} else {
+			free(new_dir);
+		}
+		count = item->util;
+		*count += 1;
+	}
+
+	hashmap_iter_init(dir_renames, &iter);
+	while ((entry = hashmap_iter_next(&iter))) {
+		int max = 0;
+		int bad_max = 0;
+		char *best = NULL;
+
+		for (i = 0; i < entry->possible_new_dirs.nr; i++) {
+			int *count = entry->possible_new_dirs.items[i].util;
+
+			if (*count == max)
+				bad_max = max;
+			else if (*count > max) {
+				max = *count;
+				best = entry->possible_new_dirs.items[i].string;
+			}
+		}
+		if (bad_max == max)
+			entry->non_unique_new_dir = 1;
+		else {
+			assert(entry->new_dir.len == 0);
+			strbuf_addstr(&entry->new_dir, best);
+		}
+		/* Strings were xstrndup'ed before inserting into string-list,
+		 * so ask string_list to remove the entries for us.
+		 */
+		entry->possible_new_dirs.strdup_strings = 1;
+		string_list_clear(&entry->possible_new_dirs, 1);
+	}
+
+	return dir_renames;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1704,8 +1836,21 @@ struct rename_info {
 	struct string_list *merge_renames;
 };
 
-static void initial_cleanup_rename(struct diff_queue_struct *pairs)
+static void initial_cleanup_rename(struct diff_queue_struct *pairs,
+				   struct hashmap *dir_renames)
 {
+	struct hashmap_iter iter;
+	struct dir_rename_entry *e;
+
+	hashmap_iter_init(dir_renames, &iter);
+	while ((e = hashmap_iter_next(&iter))) {
+		free(e->dir);
+		strbuf_release(&e->new_dir);
+		/* possible_new_dirs already cleared in get_directory_renames */
+	}
+	hashmap_free(dir_renames, 1);
+	free(dir_renames);
+
 	free(pairs->queue);
 	free(pairs);
 }
@@ -1718,6 +1863,7 @@ static int handle_renames(struct merge_options *o,
 			  struct rename_info *ri)
 {
 	struct diff_queue_struct *head_pairs, *merge_pairs;
+	struct hashmap *dir_re_head, *dir_re_merge;
 	int clean;
 
 	ri->head_renames = NULL;
@@ -1729,6 +1875,9 @@ static int handle_renames(struct merge_options *o,
 	head_pairs = get_diffpairs(o, common, head);
 	merge_pairs = get_diffpairs(o, common, merge);
 
+	dir_re_head = get_directory_renames(head_pairs, head);
+	dir_re_merge = get_directory_renames(merge_pairs, merge);
+
 	ri->head_renames  = get_renames(o, head_pairs, head,
 					 common, head, merge, entries);
 	ri->merge_renames = get_renames(o, merge_pairs, merge,
@@ -1740,8 +1889,8 @@ static int handle_renames(struct merge_options *o,
 	 * data structures are still needed and referenced in
 	 * process_entry().  But there are a few things we can free now.
 	 */
-	initial_cleanup_rename(head_pairs);
-	initial_cleanup_rename(merge_pairs);
+	initial_cleanup_rename(head_pairs, dir_re_head);
+	initial_cleanup_rename(merge_pairs, dir_re_merge);
 
 	return clean;
 }
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 20/31] merge-recursive: check for directory level conflicts
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (18 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 19/31] merge-recursive: add get_directory_renames() Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-05 20:00   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 21/31] merge-recursive: add a new hashmap for storing file collisions Elijah Newren
                   ` (11 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
directory level.  There will be additional checks at the individual
file level too, which will be added later.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index c75d3a5139..9e9ad45d2a 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1393,6 +1393,15 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
 	return ret;
 }
 
+static int tree_has_path(struct tree *tree, const char *path)
+{
+	unsigned char hashy[20];
+	unsigned int mode_o;
+
+	return !get_tree_entry(tree->object.oid.hash, path,
+			       hashy, &mode_o);
+}
+
 static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 				    char **old_dir, char **new_dir)
 {
@@ -1447,6 +1456,112 @@ static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 	}
 }
 
+static void remove_hashmap_entries(struct hashmap *dir_renames,
+				   struct string_list *items_to_remove)
+{
+	int i;
+	struct dir_rename_entry *entry;
+
+	for (i = 0; i < items_to_remove->nr; i++) {
+		entry = items_to_remove->items[i].util;
+		hashmap_remove(dir_renames, entry, NULL);
+	}
+	string_list_clear(items_to_remove, 0);
+}
+
+/*
+ * There are a couple things we want to do at the directory level:
+ *   1. Check for both sides renaming to the same thing, in order to avoid
+ *      implicit renaming of files that should be left in place.  (See
+ *      testcase 6b in t6043 for details.)
+ *   2. Prune directory renames if there are still files left in the
+ *      the original directory.  These represent a partial directory rename,
+ *      i.e. a rename where only some of the files within the directory
+ *      were renamed elsewhere.  (Technically, this could be done earlier
+ *      in get_directory_renames(), except that would prevent us from
+ *      doing the previous check and thus failing testcase 6b.)
+ *   3. Check for rename/rename(1to2) conflicts (at the directory level).
+ *      In the future, we could potentially record this info as well and
+ *      omit reporting rename/rename(1to2) conflicts for each path within
+ *      the affected directories, thus cleaning up the merge output.
+ *   NOTE: We do NOT check for rename/rename(2to1) conflicts at the
+ *         directory level, because merging directories is fine.  If it
+ *         causes conflicts for files within those merged directories, then
+ *         that should be detected at the individual path level.
+ */
+static void handle_directory_level_conflicts(struct merge_options *o,
+					     struct hashmap *dir_re_head,
+					     struct tree *head,
+					     struct hashmap *dir_re_merge,
+					     struct tree *merge)
+{
+	struct hashmap_iter iter;
+	struct dir_rename_entry *head_ent;
+	struct dir_rename_entry *merge_ent;
+
+	struct string_list remove_from_head = STRING_LIST_INIT_NODUP;
+	struct string_list remove_from_merge = STRING_LIST_INIT_NODUP;
+
+	hashmap_iter_init(dir_re_head, &iter);
+	while ((head_ent = hashmap_iter_next(&iter))) {
+		merge_ent = dir_rename_find_entry(dir_re_merge, head_ent->dir);
+		if (merge_ent &&
+		    !head_ent->non_unique_new_dir &&
+		    !merge_ent->non_unique_new_dir &&
+		    !strbuf_cmp(&head_ent->new_dir, &merge_ent->new_dir)) {
+			/* 1. Renamed identically; remove it from both sides */
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			strbuf_release(&head_ent->new_dir);
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+			strbuf_release(&merge_ent->new_dir);
+		} else if (tree_has_path(head, head_ent->dir)) {
+			/* 2. This wasn't a directory rename after all */
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			strbuf_release(&head_ent->new_dir);
+		}
+	}
+
+	remove_hashmap_entries(dir_re_head, &remove_from_head);
+	remove_hashmap_entries(dir_re_merge, &remove_from_merge);
+
+	hashmap_iter_init(dir_re_merge, &iter);
+	while ((merge_ent = hashmap_iter_next(&iter))) {
+		head_ent = dir_rename_find_entry(dir_re_head, merge_ent->dir);
+		if (tree_has_path(merge, merge_ent->dir)) {
+			/* 2. This wasn't a directory rename after all */
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+		} else if (head_ent &&
+			   !head_ent->non_unique_new_dir &&
+			   !merge_ent->non_unique_new_dir) {
+			/* 3. rename/rename(1to2) */
+			/*
+			 * We can assume it's not rename/rename(1to1) because
+			 * that was case (1), already checked above.  So we
+			 * know that head_ent->new_dir and merge_ent->new_dir
+			 * are different strings.
+			 */
+			output(o, 1, _("CONFLICT (rename/rename): "
+				       "Rename directory %s->%s in %s. "
+				       "Rename directory %s->%s in %s"),
+			       head_ent->dir, head_ent->new_dir.buf, o->branch1,
+			       head_ent->dir, merge_ent->new_dir.buf, o->branch2);
+			string_list_append(&remove_from_head,
+					   head_ent->dir)->util = head_ent;
+			strbuf_release(&head_ent->new_dir);
+			string_list_append(&remove_from_merge,
+					   merge_ent->dir)->util = merge_ent;
+			strbuf_release(&merge_ent->new_dir);
+		}
+	}
+
+	remove_hashmap_entries(dir_re_head, &remove_from_head);
+	remove_hashmap_entries(dir_re_merge, &remove_from_merge);
+}
+
 static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 					     struct tree *tree)
 {
@@ -1878,6 +1993,10 @@ static int handle_renames(struct merge_options *o,
 	dir_re_head = get_directory_renames(head_pairs, head);
 	dir_re_merge = get_directory_renames(merge_pairs, merge);
 
+	handle_directory_level_conflicts(o,
+					 dir_re_head, head,
+					 dir_re_merge, merge);
+
 	ri->head_renames  = get_renames(o, head_pairs, head,
 					 common, head, merge, entries);
 	ri->merge_renames = get_renames(o, merge_pairs, merge,
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 21/31] merge-recursive: add a new hashmap for storing file collisions
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (19 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 20/31] merge-recursive: check for directory level conflicts Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-05 20:02   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 22/31] merge-recursive: add computation of collisions due to dir rename & merging Elijah Newren
                   ` (10 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Directory renames with the ability to merge directories opens up the
possibility of add/add/add/.../add conflicts, if each of the N
directories being merged into one target directory all had a file with
the same name.  We need a way to check for and report on such
collisions; this hashmap will be used for this purpose.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 23 +++++++++++++++++++++++
 merge-recursive.h |  7 +++++++
 2 files changed, 30 insertions(+)

diff --git a/merge-recursive.c b/merge-recursive.c
index 9e9ad45d2a..ac968ad2ae 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -84,6 +84,29 @@ static void dir_rename_entry_init(struct dir_rename_entry *entry,
 	string_list_init(&entry->possible_new_dirs, 0);
 }
 
+static struct collision_entry *collision_find_entry(struct hashmap *hashmap,
+						    char *target_file)
+{
+	struct collision_entry key;
+
+	hashmap_entry_init(&key, strhash(target_file));
+	key.target_file = target_file;
+	return hashmap_get(hashmap, &key, NULL);
+}
+
+static int collision_cmp(void *unused_cmp_data,
+			 const struct collision_entry *e1,
+			 const struct collision_entry *e2,
+			 const void *unused_keydata)
+{
+	return strcmp(e1->target_file, e2->target_file);
+}
+
+static void collision_init(struct hashmap *map)
+{
+	hashmap_init(map, (hashmap_cmp_fn) collision_cmp, NULL, 0);
+}
+
 static void flush_output(struct merge_options *o)
 {
 	if (o->buffer_output < 2 && o->obuf.len) {
diff --git a/merge-recursive.h b/merge-recursive.h
index d7f4cc80c1..e1be27f57c 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -37,6 +37,13 @@ struct dir_rename_entry {
 	struct string_list possible_new_dirs;
 };
 
+struct collision_entry {
+	struct hashmap_entry ent; /* must be the first member! */
+	char *target_file;
+	struct string_list source_files;
+	unsigned reported_already:1;
+};
+
 /* merge_trees() but with recursive ancestor consolidation */
 int merge_recursive(struct merge_options *o,
 		    struct commit *h1,
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 22/31] merge-recursive: add computation of collisions due to dir rename & merging
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (20 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 21/31] merge-recursive: add a new hashmap for storing file collisions Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 23/31] merge-recursive: check for file level conflicts then get new name Elijah Newren
                   ` (9 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

directory renaming and merging can cause one or more files to be moved to
where an existing file is, or to cause several files to all be moved to
the same (otherwise vacant) location.  Add checking and reporting for such
cases, falling back to no-directory-rename handling for such paths.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 123 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 120 insertions(+), 3 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index ac968ad2ae..dc03b1bb54 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1425,6 +1425,31 @@ static int tree_has_path(struct tree *tree, const char *path)
 			       hashy, &mode_o);
 }
 
+/*
+ * Return a new string that replaces the beginning portion (which matches
+ * entry->dir), with entry->new_dir.  In perl-speak:
+ *   new_path_name = (old_path =~ s/entry->dir/entry->new_dir/);
+ * NOTE:
+ *   Caller must ensure that old_path starts with entry->dir + '/'.
+ */
+static char *apply_dir_rename(struct dir_rename_entry *entry,
+			      const char *old_path)
+{
+	struct strbuf new_path = STRBUF_INIT;
+	int oldlen, newlen;
+
+	if (entry->non_unique_new_dir)
+		return NULL;
+
+	oldlen = strlen(entry->dir);
+	newlen = entry->new_dir.len + (strlen(old_path) - oldlen) + 1;
+	strbuf_grow(&new_path, newlen);
+	strbuf_addbuf(&new_path, &entry->new_dir);
+	strbuf_addstr(&new_path, &old_path[oldlen]);
+
+	return strbuf_detach(&new_path, NULL);
+}
+
 static void get_renamed_dir_portion(const char *old_path, const char *new_path,
 				    char **old_dir, char **new_dir)
 {
@@ -1663,6 +1688,84 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
 	return dir_renames;
 }
 
+static struct dir_rename_entry *check_dir_renamed(const char *path,
+						  struct hashmap *dir_renames)
+{
+	char temp[PATH_MAX];
+	char *end;
+	struct dir_rename_entry *entry;
+
+	strcpy(temp, path);
+	while ((end = strrchr(temp, '/'))) {
+		*end = '\0';
+		entry = dir_rename_find_entry(dir_renames, temp);
+		if (entry)
+			return entry;
+	}
+	return NULL;
+}
+
+static void compute_collisions(struct hashmap *collisions,
+			       struct hashmap *dir_renames,
+			       struct diff_queue_struct *pairs)
+{
+	int i;
+
+	/*
+	 * Multiple files can be mapped to the same path due to directory
+	 * renames done by the other side of history.  Since that other
+	 * side of history could have merged multiple directories into one,
+	 * if our side of history added the same file basename to each of
+	 * those directories, then all N of them would get implicitly
+	 * renamed by the directory rename detection into the same path,
+	 * and we'd get an add/add/.../add conflict, and all those adds
+	 * from *this* side of history.  This is not representable in the
+	 * index, and users aren't going to easily be able to make sense of
+	 * it.  So we need to provide a good warning about what's
+	 * happening, and fall back to no-directory-rename detection
+	 * behavior for those paths.
+	 *
+	 * See testcases 9e and all of section 5 from t6043 for examples.
+	 */
+	collision_init(collisions);
+
+	for (i = 0; i < pairs->nr; ++i) {
+		struct dir_rename_entry *dir_rename_ent;
+		struct collision_entry *collision_ent;
+		char *new_path;
+		struct diff_filepair *pair = pairs->queue[i];
+
+		if (pair->status == 'D')
+			continue;
+		dir_rename_ent = check_dir_renamed(pair->two->path,
+						   dir_renames);
+		if (!dir_rename_ent)
+			continue;
+
+		new_path = apply_dir_rename(dir_rename_ent, pair->two->path);
+		if (!new_path)
+			/*
+			 * dir_rename_ent->non_unique_new_path is true, which
+			 * means there is no directory rename for us to use,
+			 * which means it won't cause us any additional
+			 * collisions.
+			 */
+			continue;
+		collision_ent = collision_find_entry(collisions, new_path);
+		if (!collision_ent) {
+			collision_ent = xcalloc(1,
+						sizeof(struct collision_entry));
+			hashmap_entry_init(collision_ent, strhash(new_path));
+			hashmap_put(collisions, collision_ent);
+			collision_ent->target_file = new_path;
+		} else {
+			free(new_path);
+		}
+		string_list_insert(&collision_ent->source_files,
+				   pair->two->path);
+	}
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1672,6 +1775,7 @@ static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
  */
 static struct string_list *get_renames(struct merge_options *o,
 				       struct diff_queue_struct *pairs,
+				       struct hashmap *dir_renames,
 				       struct tree *tree,
 				       struct tree *o_tree,
 				       struct tree *a_tree,
@@ -1679,8 +1783,12 @@ static struct string_list *get_renames(struct merge_options *o,
 				       struct string_list *entries)
 {
 	int i;
+	struct hashmap collisions;
+	struct hashmap_iter iter;
+	struct collision_entry *e;
 	struct string_list *renames;
 
+	compute_collisions(&collisions, dir_renames, pairs);
 	renames = xcalloc(1, sizeof(struct string_list));
 
 	for (i = 0; i < pairs->nr; ++i) {
@@ -1711,6 +1819,13 @@ static struct string_list *get_renames(struct merge_options *o,
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
 	}
+
+	hashmap_iter_init(&collisions, &iter);
+	while ((e = hashmap_iter_next(&iter))) {
+		free(e->target_file);
+		string_list_clear(&e->source_files, 0);
+	}
+	hashmap_free(&collisions, 1);
 	return renames;
 }
 
@@ -2020,9 +2135,11 @@ static int handle_renames(struct merge_options *o,
 					 dir_re_head, head,
 					 dir_re_merge, merge);
 
-	ri->head_renames  = get_renames(o, head_pairs, head,
-					 common, head, merge, entries);
-	ri->merge_renames = get_renames(o, merge_pairs, merge,
+	ri->head_renames  = get_renames(o, head_pairs,
+					dir_re_merge, head,
+					common, head, merge, entries);
+	ri->merge_renames = get_renames(o, merge_pairs,
+					dir_re_head, merge,
 					 common, head, merge, entries);
 	clean = process_renames(o, ri->head_renames, ri->merge_renames);
 
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 23/31] merge-recursive: check for file level conflicts then get new name
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (21 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 22/31] merge-recursive: add computation of collisions due to dir rename & merging Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 24/31] merge-recursive: when comparing files, don't include trees Elijah Newren
                   ` (8 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Before trying to apply directory renames to paths within the given
directories, we want to make sure that there aren't conflicts at the
file level either.  If there aren't any, then get the new name from
any directory renames.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 174 ++++++++++++++++++++++++++++++++++--
 strbuf.c                            |  16 ++++
 strbuf.h                            |  16 ++++
 t/t6043-merge-rename-directories.sh |   2 +-
 4 files changed, 199 insertions(+), 9 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index dc03b1bb54..354d91d2a8 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1517,6 +1517,91 @@ static void remove_hashmap_entries(struct hashmap *dir_renames,
 	string_list_clear(items_to_remove, 0);
 }
 
+/*
+ * See if there is a directory rename for path, and if there are any file
+ * level conflicts for the renamed location.  If there is a rename and
+ * there are no conflicts, return the new name.  Otherwise, return NULL.
+ */
+static char *handle_path_level_conflicts(struct merge_options *o,
+					 const char *path,
+					 struct dir_rename_entry *entry,
+					 struct hashmap *collisions,
+					 struct tree *tree)
+{
+	char *new_path = NULL;
+	struct collision_entry *collision_ent;
+	int clean = 1;
+	struct strbuf collision_paths = STRBUF_INIT;
+
+	/*
+	 * entry has the mapping of old directory name to new directory name
+	 * that we want to apply to path.
+	 */
+	new_path = apply_dir_rename(entry, path);
+
+	if (!new_path) {
+		/* This should only happen when entry->non_unique_new_dir set */
+		if (!entry->non_unique_new_dir)
+			BUG("entry->non_unqiue_dir not set and !new_path");
+		output(o, 1, _("CONFLICT (directory rename split): "
+			       "Unclear where to place %s because directory "
+			       "%s was renamed to multiple other directories, "
+			       "with no destination getting a majority of the "
+			       "files."),
+		       path, entry->dir);
+		clean = 0;
+		return NULL;
+	}
+
+	/*
+	 * The caller needs to have ensured that it has pre-populated
+	 * collisions with all paths that map to new_path.  Do a quick check
+	 * to ensure that's the case.
+	 */
+	collision_ent = collision_find_entry(collisions, new_path);
+	if (collision_ent == NULL)
+		BUG("collision_ent is NULL");
+
+	/*
+	 * Check for one-sided add/add/.../add conflicts, i.e.
+	 * where implicit renames from the other side doing
+	 * directory rename(s) can affect this side of history
+	 * to put multiple paths into the same location.  Warn
+	 * and bail on directory renames for such paths.
+	 */
+	if (collision_ent->reported_already) {
+		clean = 0;
+	} else if (tree_has_path(tree, new_path)) {
+		collision_ent->reported_already = 1;
+		strbuf_add_separated_string_list(&collision_paths, ", ",
+						 &collision_ent->source_files);
+		output(o, 1, _("CONFLICT (implicit dir rename): Existing "
+			       "file/dir at %s in the way of implicit "
+			       "directory rename(s) putting the following "
+			       "path(s) there: %s."),
+		       new_path, collision_paths.buf);
+		clean = 0;
+	} else if (collision_ent->source_files.nr > 1) {
+		collision_ent->reported_already = 1;
+		strbuf_add_separated_string_list(&collision_paths, ", ",
+						 &collision_ent->source_files);
+		output(o, 1, _("CONFLICT (implicit dir rename): Cannot map "
+			       "more than one path to %s; implicit directory "
+			       "renames tried to put these paths there: %s"),
+		       new_path, collision_paths.buf);
+		clean = 0;
+	}
+
+	/* Free memory we no longer need */
+	strbuf_release(&collision_paths);
+	if (!clean && new_path) {
+		free(new_path);
+		return NULL;
+	}
+
+	return new_path;
+}
+
 /*
  * There are a couple things we want to do at the directory level:
  *   1. Check for both sides renaming to the same thing, in order to avoid
@@ -1766,6 +1851,59 @@ static void compute_collisions(struct hashmap *collisions,
 	}
 }
 
+static char *check_for_directory_rename(struct merge_options *o,
+					const char *path,
+					struct tree *tree,
+					struct hashmap *dir_renames,
+					struct hashmap *dir_rename_exclusions,
+					struct hashmap *collisions,
+					int *clean_merge)
+{
+	char *new_path = NULL;
+	struct dir_rename_entry *entry = check_dir_renamed(path, dir_renames);
+	struct dir_rename_entry *oentry = NULL;
+
+	if (!entry)
+		return new_path;
+
+	/*
+	 * This next part is a little weird.  We do not want to do an
+	 * implicit rename into a directory we renamed on our side, because
+	 * that will result in a spurious rename/rename(1to2) conflict.  An
+	 * example:
+	 *   Base commit: dumbdir/afile, otherdir/bfile
+	 *   Side 1:      smrtdir/afile, otherdir/bfile
+	 *   Side 2:      dumbdir/afile, dumbdir/bfile
+	 * Here, while working on Side 1, we could notice that otherdir was
+	 * renamed/merged to dumbdir, and change the diff_filepair for
+	 * otherdir/bfile into a rename into dumbdir/bfile.  However, Side
+	 * 2 will notice the rename from dumbdir to smrtdir, and do the
+	 * transitive rename to move it from dumbdir/bfile to
+	 * smrtdir/bfile.  That gives us bfile in dumbdir vs being in
+	 * smrtdir, a rename/rename(1to2) conflict.  We really just want
+	 * the file to end up in smrtdir.  And the way to achieve that is
+	 * to not let Side1 do the rename to dumbdir, since we know that is
+	 * the source of one of our directory renames.
+	 *
+	 * That's why oentry and dir_rename_exclusions is here.
+	 *
+	 * As it turns out, this also prevents N-way transient rename
+	 * confusion; See testcases 9c and 9d of t6043.
+	 */
+	oentry = dir_rename_find_entry(dir_rename_exclusions, entry->new_dir.buf);
+	if (oentry) {
+		output(o, 1, _("WARNING: Avoiding applying %s -> %s rename "
+			       "to %s, because %s itself was renamed."),
+		       entry->dir, entry->new_dir.buf, path, entry->new_dir.buf);
+	} else {
+		new_path = handle_path_level_conflicts(o, path, entry,
+						       collisions, tree);
+		*clean_merge &= (new_path != NULL);
+	}
+
+	return new_path;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1776,11 +1914,13 @@ static void compute_collisions(struct hashmap *collisions,
 static struct string_list *get_renames(struct merge_options *o,
 				       struct diff_queue_struct *pairs,
 				       struct hashmap *dir_renames,
+				       struct hashmap *dir_rename_exclusions,
 				       struct tree *tree,
 				       struct tree *o_tree,
 				       struct tree *a_tree,
 				       struct tree *b_tree,
-				       struct string_list *entries)
+				       struct string_list *entries,
+				       int *clean_merge)
 {
 	int i;
 	struct hashmap collisions;
@@ -1795,11 +1935,22 @@ static struct string_list *get_renames(struct merge_options *o,
 		struct string_list_item *item;
 		struct rename *re;
 		struct diff_filepair *pair = pairs->queue[i];
+		char *new_path; /* non-NULL only with directory renames */
 
-		if (pair->status != 'R') {
+		if (pair->status == 'D') {
+			diff_free_filepair(pair);
+			continue;
+		}
+		new_path = check_for_directory_rename(o, pair->two->path, tree,
+						      dir_renames,
+						      dir_rename_exclusions,
+						      &collisions,
+						      clean_merge);
+		if (pair->status != 'R' && !new_path) {
 			diff_free_filepair(pair);
 			continue;
 		}
+
 		re = xmalloc(sizeof(*re));
 		re->processed = 0;
 		re->pair = pair;
@@ -2117,7 +2268,7 @@ static int handle_renames(struct merge_options *o,
 {
 	struct diff_queue_struct *head_pairs, *merge_pairs;
 	struct hashmap *dir_re_head, *dir_re_merge;
-	int clean;
+	int clean = 1;
 
 	ri->head_renames = NULL;
 	ri->merge_renames = NULL;
@@ -2136,13 +2287,20 @@ static int handle_renames(struct merge_options *o,
 					 dir_re_merge, merge);
 
 	ri->head_renames  = get_renames(o, head_pairs,
-					dir_re_merge, head,
-					common, head, merge, entries);
+					dir_re_merge, dir_re_head, head,
+					common, head, merge, entries,
+					&clean);
+	if (clean < 0)
+		goto cleanup;
 	ri->merge_renames = get_renames(o, merge_pairs,
-					dir_re_head, merge,
-					 common, head, merge, entries);
-	clean = process_renames(o, ri->head_renames, ri->merge_renames);
+					dir_re_head, dir_re_merge, merge,
+					common, head, merge, entries,
+					&clean);
+	if (clean < 0)
+		goto cleanup;
+	clean &= process_renames(o, ri->head_renames, ri->merge_renames);
 
+cleanup:
 	/*
 	 * Some cleanup is deferred until cleanup_renames() because the
 	 * data structures are still needed and referenced in
diff --git a/strbuf.c b/strbuf.c
index 1df674e919..113e751d63 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -1,5 +1,6 @@
 #include "cache.h"
 #include "refs.h"
+#include "string-list.h"
 #include "utf8.h"
 
 int starts_with(const char *str, const char *prefix)
@@ -163,6 +164,21 @@ struct strbuf **strbuf_split_buf(const char *str, size_t slen,
 	return ret;
 }
 
+void strbuf_add_separated_string_list(struct strbuf *str,
+				      const char *sep,
+				      struct string_list *slist)
+{
+	struct string_list_item *item;
+	int sep_needed = 0;
+
+	for_each_string_list_item(item, slist) {
+		if (sep_needed)
+			strbuf_addstr(str, sep);
+		strbuf_addstr(str, item->string);
+		sep_needed = 1;
+	}
+}
+
 void strbuf_list_free(struct strbuf **sbs)
 {
 	struct strbuf **s = sbs;
diff --git a/strbuf.h b/strbuf.h
index 14c8c10d66..e2e9e5be22 100644
--- a/strbuf.h
+++ b/strbuf.h
@@ -1,6 +1,8 @@
 #ifndef STRBUF_H
 #define STRBUF_H
 
+struct string_list;
+
 /**
  * strbuf's are meant to be used with all the usual C string and memory
  * APIs. Given that the length of the buffer is known, it's often better to
@@ -528,6 +530,20 @@ static inline struct strbuf **strbuf_split(const struct strbuf *sb,
 	return strbuf_split_max(sb, terminator, 0);
 }
 
+/*
+ * Adds all strings of a string list to the strbuf, separated by the given
+ * separator.  For example, if sep is
+ *   ', '
+ * and slist contains
+ *   ['element1', 'element2', ..., 'elementN'],
+ * then write:
+ *   'element1, element2, ..., elementN'
+ * to str.  If only one element, just write "element1" to str.
+ */
+extern void strbuf_add_separated_string_list(struct strbuf *str,
+					     const char *sep,
+					     struct string_list *slist);
+
 /**
  * Free a NULL-terminated list of strbufs (for example, the return
  * values of the strbuf_split*() functions).
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index fbac664408..f0ccec2826 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -489,7 +489,7 @@ test_expect_success '2a-setup: Directory split into two on one side, with equal
 	)
 '
 
-test_expect_failure '2a-check: Directory split into two on one side, with equal numbers of paths' '
+test_expect_success '2a-check: Directory split into two on one side, with equal numbers of paths' '
 	(
 		cd 2a &&
 
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 24/31] merge-recursive: when comparing files, don't include trees
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (22 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 23/31] merge-recursive: check for file level conflicts then get new name Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 25/31] merge-recursive: apply necessary modifications for directory renames Elijah Newren
                   ` (7 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

get_renames() would look up stage data that already existed (populated
in get_unmerged(), taken from whatever unpack_trees() created), and if
it didn't exist, would call insert_stage_data() to create the necessary
entry for the given file.  The insert_stage_data() fallback becomes
much more important for directory rename detection, because that creates
a mechanism to have a file in the resulting merge that didn't exist on
either side of history.  However, insert_stage_data(), due to calling
get_tree_entry() loaded up trees as readily as files.  We aren't
interested in comparing trees to files; the D/F conflict handling is
done elsewhere.  This code is just concerned with what entries existed
for a given path on the different sides of the merge, so create a
get_tree_entry_if_blob() helper function and use it.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 354d91d2a8..38dc0eefaf 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -418,6 +418,21 @@ static void get_files_dirs(struct merge_options *o, struct tree *tree)
 	read_tree_recursive(tree, "", 0, 0, &match_all, save_files_dirs, o);
 }
 
+static int get_tree_entry_if_blob(const unsigned char *tree,
+				  const char *path,
+				  unsigned char *hashy,
+				  unsigned int *mode_o)
+{
+	int ret;
+
+	ret = get_tree_entry(tree, path, hashy, mode_o);
+	if (S_ISDIR(*mode_o)) {
+		hashcpy(hashy, null_sha1);
+		*mode_o = 0;
+	}
+	return ret;
+}
+
 /*
  * Returns an index_entry instance which doesn't have to correspond to
  * a real cache entry in Git's index.
@@ -428,12 +443,12 @@ static struct stage_data *insert_stage_data(const char *path,
 {
 	struct string_list_item *item;
 	struct stage_data *e = xcalloc(1, sizeof(struct stage_data));
-	get_tree_entry(o->object.oid.hash, path,
-			e->stages[1].oid.hash, &e->stages[1].mode);
-	get_tree_entry(a->object.oid.hash, path,
-			e->stages[2].oid.hash, &e->stages[2].mode);
-	get_tree_entry(b->object.oid.hash, path,
-			e->stages[3].oid.hash, &e->stages[3].mode);
+	get_tree_entry_if_blob(o->object.oid.hash, path,
+			       e->stages[1].oid.hash, &e->stages[1].mode);
+	get_tree_entry_if_blob(a->object.oid.hash, path,
+			       e->stages[2].oid.hash, &e->stages[2].mode);
+	get_tree_entry_if_blob(b->object.oid.hash, path,
+			       e->stages[3].oid.hash, &e->stages[3].mode);
 	item = string_list_insert(entries, path);
 	item->util = e;
 	return e;
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 25/31] merge-recursive: apply necessary modifications for directory renames
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (23 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 24/31] merge-recursive: when comparing files, don't include trees Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-16  1:14   ` SZEDER Gábor
  2018-01-30 23:25 ` [PATCH v7 26/31] merge-recursive: avoid clobbering untracked files with " Elijah Newren
                   ` (6 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

This commit hooks together all the directory rename logic by making the
necessary changes to the rename struct, it's dst_entry, and the
diff_filepair under consideration.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 187 +++++++++++++++++++++++++++++++++++-
 t/t6043-merge-rename-directories.sh |  50 +++++-----
 2 files changed, 211 insertions(+), 26 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 38dc0eefaf..7c78dc2dc1 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -177,6 +177,7 @@ static int oid_eq(const struct object_id *a, const struct object_id *b)
 
 enum rename_type {
 	RENAME_NORMAL = 0,
+	RENAME_DIR,
 	RENAME_DELETE,
 	RENAME_ONE_FILE_TO_ONE,
 	RENAME_ONE_FILE_TO_TWO,
@@ -607,6 +608,7 @@ struct rename {
 	 */
 	struct stage_data *src_entry;
 	struct stage_data *dst_entry;
+	unsigned add_turned_into_rename:1;
 	unsigned processed:1;
 };
 
@@ -641,6 +643,27 @@ static int update_stages(struct merge_options *opt, const char *path,
 	return 0;
 }
 
+static int update_stages_for_stage_data(struct merge_options *opt,
+					const char *path,
+					const struct stage_data *stage_data)
+{
+	struct diff_filespec o, a, b;
+
+	o.mode = stage_data->stages[1].mode;
+	oidcpy(&o.oid, &stage_data->stages[1].oid);
+
+	a.mode = stage_data->stages[2].mode;
+	oidcpy(&a.oid, &stage_data->stages[2].oid);
+
+	b.mode = stage_data->stages[3].mode;
+	oidcpy(&b.oid, &stage_data->stages[3].oid);
+
+	return update_stages(opt, path,
+			     is_null_sha1(o.oid.hash) ? NULL : &o,
+			     is_null_sha1(a.oid.hash) ? NULL : &a,
+			     is_null_sha1(b.oid.hash) ? NULL : &b);
+}
+
 static void update_entry(struct stage_data *entry,
 			 struct diff_filespec *o,
 			 struct diff_filespec *a,
@@ -1117,6 +1140,18 @@ static int merge_file_one(struct merge_options *o,
 	return merge_file_1(o, &one, &a, &b, branch1, branch2, mfi);
 }
 
+static int conflict_rename_dir(struct merge_options *o,
+			       struct diff_filepair *pair,
+			       const char *rename_branch,
+			       const char *other_branch)
+{
+	const struct diff_filespec *dest = pair->two;
+
+	if (update_file(o, 1, &dest->oid, dest->mode, dest->path))
+		return -1;
+	return 0;
+}
+
 static int handle_change_delete(struct merge_options *o,
 				 const char *path, const char *old_path,
 				 const struct object_id *o_oid, int o_mode,
@@ -1386,6 +1421,24 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		if (!ret)
 			ret = update_file(o, 0, &mfi_c2.oid, mfi_c2.mode,
 					  new_path2);
+		/*
+		 * unpack_trees() actually populates the index for us for
+		 * "normal" rename/rename(2to1) situtations so that the
+		 * correct entries are at the higher stages, which would
+		 * make the call below to update_stages_for_stage_data
+		 * unnecessary.  However, if either of the renames came
+		 * from a directory rename, then unpack_trees() will not
+		 * have gotten the right data loaded into the index, so we
+		 * need to do so now.  (While it'd be tempting to move this
+		 * call to update_stages_for_stage_data() to
+		 * apply_directory_rename_modifications(), that would break
+		 * our intermediate calls to would_lose_untracked() since
+		 * those rely on the current in-memory index.  See also the
+		 * big "NOTE" in update_stages()).
+		 */
+		if (update_stages_for_stage_data(o, path, ci->dst_entry1))
+			ret = -1;
+
 		free(new_path2);
 		free(new_path1);
 	}
@@ -1919,6 +1972,111 @@ static char *check_for_directory_rename(struct merge_options *o,
 	return new_path;
 }
 
+static void apply_directory_rename_modifications(struct merge_options *o,
+						 struct diff_filepair *pair,
+						 char *new_path,
+						 struct rename *re,
+						 struct tree *tree,
+						 struct tree *o_tree,
+						 struct tree *a_tree,
+						 struct tree *b_tree,
+						 struct string_list *entries,
+						 int *clean)
+{
+	struct string_list_item *item;
+	int stage = (tree == a_tree ? 2 : 3);
+
+	/*
+	 * In all cases where we can do directory rename detection,
+	 * unpack_trees() will have read pair->two->path into the
+	 * index and the working copy.  We need to remove it so that
+	 * we can instead place it at new_path.  It is guaranteed to
+	 * not be untracked (unpack_trees() would have errored out
+	 * saying the file would have been overwritten), but it might
+	 * be dirty, though.
+	 */
+	remove_file(o, 1, pair->two->path, 0 /* no_wd */);
+
+	/* Find or create a new re->dst_entry */
+	item = string_list_lookup(entries, new_path);
+	if (item) {
+		/*
+		 * Since we're renaming on this side of history, and it's
+		 * due to a directory rename on the other side of history
+		 * (which we only allow when the directory in question no
+		 * longer exists on the other side of history), the
+		 * original entry for re->dst_entry is no longer
+		 * necessary...
+		 */
+		re->dst_entry->processed = 1;
+
+		/*
+		 * ...because we'll be using this new one.
+		 */
+		re->dst_entry = item->util;
+	} else {
+		/*
+		 * re->dst_entry is for the before-dir-rename path, and we
+		 * need it to hold information for the after-dir-rename
+		 * path.  Before creating a new entry, we need to mark the
+		 * old one as unnecessary (...unless it is shared by
+		 * src_entry, i.e. this didn't use to be a rename, in which
+		 * case we can just allow the normal processing to happen
+		 * for it).
+		 */
+		if (pair->status == 'R')
+			re->dst_entry->processed = 1;
+
+		re->dst_entry = insert_stage_data(new_path,
+						  o_tree, a_tree, b_tree,
+						  entries);
+		item = string_list_insert(entries, new_path);
+		item->util = re->dst_entry;
+	}
+
+	/*
+	 * Update the stage_data with the information about the path we are
+	 * moving into place.  That slot will be empty and available for us
+	 * to write to because of the collision checks in
+	 * handle_path_level_conflicts().  In other words,
+	 * re->dst_entry->stages[stage].oid will be the null_oid, so it's
+	 * open for us to write to.
+	 *
+	 * It may be tempting to actually update the index at this point as
+	 * well, using update_stages_for_stage_data(), but as per the big
+	 * "NOTE" in update_stages(), doing so will modify the current
+	 * in-memory index which will break calls to would_lose_untracked()
+	 * that we need to make.  Instead, we need to just make sure that
+	 * the various conflict_rename_*() functions update the index
+	 * explicitly rather than relying on unpack_trees() to have done it.
+	 */
+	get_tree_entry(tree->object.oid.hash,
+		       pair->two->path,
+		       re->dst_entry->stages[stage].oid.hash,
+		       &re->dst_entry->stages[stage].mode);
+
+	/* Update pair status */
+	if (pair->status == 'A') {
+		/*
+		 * Recording rename information for this add makes it look
+		 * like a rename/delete conflict.  Make sure we can
+		 * correctly handle this as an add that was moved to a new
+		 * directory instead of reporting a rename/delete conflict.
+		 */
+		re->add_turned_into_rename = 1;
+	}
+	/*
+	 * We don't actually look at pair->status again, but it seems
+	 * pedagogically correct to adjust it.
+	 */
+	pair->status = 'R';
+
+	/*
+	 * Finally, record the new location.
+	 */
+	pair->two->path = new_path;
+}
+
 /*
  * Get information of all renames which occurred in 'pairs', making use of
  * any implicit directory renames inferred from the other side of history.
@@ -1968,6 +2126,7 @@ static struct string_list *get_renames(struct merge_options *o,
 
 		re = xmalloc(sizeof(*re));
 		re->processed = 0;
+		re->add_turned_into_rename = 0;
 		re->pair = pair;
 		item = string_list_lookup(entries, re->pair->one->path);
 		if (!item)
@@ -1984,6 +2143,12 @@ static struct string_list *get_renames(struct merge_options *o,
 			re->dst_entry = item->util;
 		item = string_list_insert(renames, pair->one->path);
 		item->util = re;
+		if (new_path)
+			apply_directory_rename_modifications(o, pair, new_path,
+							     re, tree, o_tree,
+							     a_tree, b_tree,
+							     entries,
+							     clean_merge);
 	}
 
 	hashmap_iter_init(&collisions, &iter);
@@ -2153,7 +2318,19 @@ static int process_renames(struct merge_options *o,
 			dst_other.mode = ren1->dst_entry->stages[other_stage].mode;
 			try_merge = 0;
 
-			if (oid_eq(&src_other.oid, &null_oid)) {
+			if (oid_eq(&src_other.oid, &null_oid) &&
+			    ren1->add_turned_into_rename) {
+				setup_rename_conflict_info(RENAME_DIR,
+							   ren1->pair,
+							   NULL,
+							   branch1,
+							   branch2,
+							   ren1->dst_entry,
+							   NULL,
+							   o,
+							   NULL,
+							   NULL);
+			} else if (oid_eq(&src_other.oid, &null_oid)) {
 				setup_rename_conflict_info(RENAME_DELETE,
 							   ren1->pair,
 							   NULL,
@@ -2570,6 +2747,14 @@ static int process_entry(struct merge_options *o,
 						    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
 						    conflict_info);
 			break;
+		case RENAME_DIR:
+			clean_merge = 1;
+			if (conflict_rename_dir(o,
+						conflict_info->pair1,
+						conflict_info->branch1,
+						conflict_info->branch2))
+				clean_merge = -1;
+			break;
 		case RENAME_DELETE:
 			clean_merge = 0;
 			if (conflict_rename_delete(o,
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index f0ccec2826..b6207763cf 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -69,7 +69,7 @@ test_expect_success '1a-setup: Simple directory rename detection' '
 	)
 '
 
-test_expect_failure '1a-check: Simple directory rename detection' '
+test_expect_success '1a-check: Simple directory rename detection' '
 	(
 		cd 1a &&
 
@@ -136,7 +136,7 @@ test_expect_success '1b-setup: Merge a directory with another' '
 	)
 '
 
-test_expect_failure '1b-check: Merge a directory with another' '
+test_expect_success '1b-check: Merge a directory with another' '
 	(
 		cd 1b &&
 
@@ -194,7 +194,7 @@ test_expect_success '1c-setup: Transitive renaming' '
 	)
 '
 
-test_expect_failure '1c-check: Transitive renaming' '
+test_expect_success '1c-check: Transitive renaming' '
 	(
 		cd 1c &&
 
@@ -263,7 +263,7 @@ test_expect_success '1d-setup: Directory renames cause a rename/rename(2to1) con
 	)
 '
 
-test_expect_failure '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
+test_expect_success '1d-check: Directory renames cause a rename/rename(2to1) conflict' '
 	(
 		cd 1d &&
 
@@ -342,7 +342,7 @@ test_expect_success '1e-setup: Renamed directory, with all files being renamed t
 	)
 '
 
-test_expect_failure '1e-check: Renamed directory, with all files being renamed too' '
+test_expect_success '1e-check: Renamed directory, with all files being renamed too' '
 	(
 		cd 1e &&
 
@@ -408,7 +408,7 @@ test_expect_success '1f-setup: Split a directory into two other directories' '
 	)
 '
 
-test_expect_failure '1f-check: Split a directory into two other directories' '
+test_expect_success '1f-check: Split a directory into two other directories' '
 	(
 		cd 1f &&
 
@@ -899,7 +899,7 @@ test_expect_success '5a-setup: Merge directories, other side adds files to origi
 	)
 '
 
-test_expect_failure '5a-check: Merge directories, other side adds files to original and target' '
+test_expect_success '5a-check: Merge directories, other side adds files to original and target' '
 	(
 		cd 5a &&
 
@@ -973,7 +973,7 @@ test_expect_success '5b-setup: Rename/delete in order to get add/add/add conflic
 	)
 '
 
-test_expect_failure '5b-check: Rename/delete in order to get add/add/add conflict' '
+test_expect_success '5b-check: Rename/delete in order to get add/add/add conflict' '
 	(
 		cd 5b &&
 
@@ -1053,7 +1053,7 @@ test_expect_success '5c-setup: Transitive rename would cause rename/rename/renam
 	)
 '
 
-test_expect_failure '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
+test_expect_success '5c-check: Transitive rename would cause rename/rename/rename/add/add/add' '
 	(
 		cd 5c &&
 
@@ -1137,7 +1137,7 @@ test_expect_success '5d-setup: Directory/file/file conflict due to directory ren
 	)
 '
 
-test_expect_failure '5d-check: Directory/file/file conflict due to directory rename' '
+test_expect_success '5d-check: Directory/file/file conflict due to directory rename' '
 	(
 		cd 5d &&
 
@@ -1575,7 +1575,7 @@ test_expect_success '7a-setup: rename-dir vs. rename-dir (NOT split evenly) PLUS
 	)
 '
 
-test_expect_failure '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
+test_expect_success '7a-check: rename-dir vs. rename-dir (NOT split evenly) PLUS add-other-file' '
 	(
 		cd 7a &&
 
@@ -1647,7 +1647,7 @@ test_expect_success '7b-setup: rename/rename(2to1), but only due to transitive r
 	)
 '
 
-test_expect_failure '7b-check: rename/rename(2to1), but only due to transitive rename' '
+test_expect_success '7b-check: rename/rename(2to1), but only due to transitive rename' '
 	(
 		cd 7b &&
 
@@ -1723,7 +1723,7 @@ test_expect_success '7c-setup: rename/rename(1to...2or3); transitive rename may
 	)
 '
 
-test_expect_failure '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
+test_expect_success '7c-check: rename/rename(1to...2or3); transitive rename may add complexity' '
 	(
 		cd 7c &&
 
@@ -1787,7 +1787,7 @@ test_expect_success '7d-setup: transitive rename involved in rename/delete; how
 	)
 '
 
-test_expect_failure '7d-check: transitive rename involved in rename/delete; how is it reported?' '
+test_expect_success '7d-check: transitive rename involved in rename/delete; how is it reported?' '
 	(
 		cd 7d &&
 
@@ -1877,7 +1877,7 @@ test_expect_success '7e-setup: transitive rename in rename/delete AND dirs in th
 	)
 '
 
-test_expect_failure '7e-check: transitive rename in rename/delete AND dirs in the way' '
+test_expect_success '7e-check: transitive rename in rename/delete AND dirs in the way' '
 	(
 		cd 7e &&
 
@@ -1968,7 +1968,7 @@ test_expect_success '8a-setup: Dual-directory rename, one into the others way' '
 	)
 '
 
-test_expect_failure '8a-check: Dual-directory rename, one into the others way' '
+test_expect_success '8a-check: Dual-directory rename, one into the others way' '
 	(
 		cd 8a &&
 
@@ -2113,7 +2113,7 @@ test_expect_success '8c-setup: rename+modify/delete' '
 	)
 '
 
-test_expect_failure '8c-check: rename+modify/delete' '
+test_expect_success '8c-check: rename+modify/delete' '
 	(
 		cd 8c &&
 
@@ -2200,7 +2200,7 @@ test_expect_success '8d-setup: rename/delete...or not?' '
 	)
 '
 
-test_expect_failure '8d-check: rename/delete...or not?' '
+test_expect_success '8d-check: rename/delete...or not?' '
 	(
 		cd 8d &&
 
@@ -2275,7 +2275,7 @@ test_expect_success '8e-setup: Both sides rename, one side adds to original dire
 	)
 '
 
-test_expect_failure '8e-check: Both sides rename, one side adds to original directory' '
+test_expect_success '8e-check: Both sides rename, one side adds to original directory' '
 	(
 		cd 8e &&
 
@@ -2362,7 +2362,7 @@ test_expect_success '9a-setup: Inner renamed directory within outer renamed dire
 	)
 '
 
-test_expect_failure '9a-check: Inner renamed directory within outer renamed directory' '
+test_expect_success '9a-check: Inner renamed directory within outer renamed directory' '
 	(
 		cd 9a &&
 
@@ -2432,7 +2432,7 @@ test_expect_success '9b-setup: Transitive rename with content merge' '
 	)
 '
 
-test_expect_failure '9b-check: Transitive rename with content merge' '
+test_expect_success '9b-check: Transitive rename with content merge' '
 	(
 		cd 9b &&
 
@@ -2522,7 +2522,7 @@ test_expect_success '9c-setup: Doubly transitive rename?' '
 	)
 '
 
-test_expect_failure '9c-check: Doubly transitive rename?' '
+test_expect_success '9c-check: Doubly transitive rename?' '
 	(
 		cd 9c &&
 
@@ -2610,7 +2610,7 @@ test_expect_success '9d-setup: N-way transitive rename?' '
 	)
 '
 
-test_expect_failure '9d-check: N-way transitive rename?' '
+test_expect_success '9d-check: N-way transitive rename?' '
 	(
 		cd 9d &&
 
@@ -2692,7 +2692,7 @@ test_expect_success '9e-setup: N-to-1 whammo' '
 	)
 '
 
-test_expect_failure C_LOCALE_OUTPUT '9e-check: N-to-1 whammo' '
+test_expect_success C_LOCALE_OUTPUT '9e-check: N-to-1 whammo' '
 	(
 		cd 9e &&
 
@@ -2770,7 +2770,7 @@ test_expect_success '9f-setup: Renamed directory that only contained immediate s
 	)
 '
 
-test_expect_failure '9f-check: Renamed directory that only contained immediate subdirs' '
+test_expect_success '9f-check: Renamed directory that only contained immediate subdirs' '
 	(
 		cd 9f &&
 
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 26/31] merge-recursive: avoid clobbering untracked files with directory renames
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (24 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 25/31] merge-recursive: apply necessary modifications for directory renames Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 27/31] merge-recursive: fix overwriting dirty files involved in renames Elijah Newren
                   ` (5 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 42 +++++++++++++++++++++++++++++++++++--
 t/t6043-merge-rename-directories.sh |  6 +++---
 2 files changed, 43 insertions(+), 5 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 7c78dc2dc1..39e161e094 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1147,6 +1147,26 @@ static int conflict_rename_dir(struct merge_options *o,
 {
 	const struct diff_filespec *dest = pair->two;
 
+	if (!o->call_depth && would_lose_untracked(dest->path)) {
+		char *alt_path = unique_path(o, dest->path, rename_branch);
+
+		output(o, 1, _("Error: Refusing to lose untracked file at %s; "
+			       "writing to %s instead."),
+		       dest->path, alt_path);
+		/*
+		 * Write the file in worktree at alt_path, but not in the
+		 * index.  Instead, write to dest->path for the index but
+		 * only at the higher appropriate stage.
+		 */
+		if (update_file(o, 0, &dest->oid, dest->mode, alt_path))
+			return -1;
+		free(alt_path);
+		return update_stages(o, dest->path, NULL,
+				     rename_branch == o->branch1 ? dest : NULL,
+				     rename_branch == o->branch1 ? NULL : dest);
+	}
+
+	/* Update dest->path both in index and in worktree */
 	if (update_file(o, 1, &dest->oid, dest->mode, dest->path))
 		return -1;
 	return 0;
@@ -1165,7 +1185,8 @@ static int handle_change_delete(struct merge_options *o,
 	const char *update_path = path;
 	int ret = 0;
 
-	if (dir_in_way(path, !o->call_depth, 0)) {
+	if (dir_in_way(path, !o->call_depth, 0) ||
+	    (!o->call_depth && would_lose_untracked(path))) {
 		update_path = alt_path = unique_path(o, path, change_branch);
 	}
 
@@ -1291,6 +1312,12 @@ static int handle_file(struct merge_options *o,
 			dst_name = unique_path(o, rename->path, cur_branch);
 			output(o, 1, _("%s is a directory in %s adding as %s instead"),
 			       rename->path, other_branch, dst_name);
+		} else if (!o->call_depth &&
+			   would_lose_untracked(rename->path)) {
+			dst_name = unique_path(o, rename->path, cur_branch);
+			output(o, 1, _("Refusing to lose untracked file at %s; "
+				       "adding as %s instead"),
+			       rename->path, dst_name);
 		}
 	}
 	if ((ret = update_file(o, 0, &rename->oid, rename->mode, dst_name)))
@@ -1416,7 +1443,18 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		char *new_path2 = unique_path(o, path, ci->branch2);
 		output(o, 1, _("Renaming %s to %s and %s to %s instead"),
 		       a->path, new_path1, b->path, new_path2);
-		remove_file(o, 0, path, 0);
+		if (would_lose_untracked(path))
+			/*
+			 * Only way we get here is if both renames were from
+			 * a directory rename AND user had an untracked file
+			 * at the location where both files end up after the
+			 * two directory renames.  See testcase 10d of t6043.
+			 */
+			output(o, 1, _("Refusing to lose untracked file at "
+				       "%s, even though it's in the way."),
+			       path);
+		else
+			remove_file(o, 0, path, 0);
 		ret = update_file(o, 0, &mfi_c1.oid, mfi_c1.mode, new_path1);
 		if (!ret)
 			ret = update_file(o, 0, &mfi_c2.oid, mfi_c2.mode,
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index b6207763cf..abb5b20f6b 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2984,7 +2984,7 @@ test_expect_success '10b-setup: Overwrite untracked with dir rename + delete' '
 	)
 '
 
-test_expect_failure '10b-check: Overwrite untracked with dir rename + delete' '
+test_expect_success '10b-check: Overwrite untracked with dir rename + delete' '
 	(
 		cd 10b &&
 
@@ -3062,7 +3062,7 @@ test_expect_success '10c-setup: Overwrite untracked with dir rename/rename(1to2)
 	)
 '
 
-test_expect_failure '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
+test_expect_success '10c-check: Overwrite untracked with dir rename/rename(1to2)' '
 	(
 		cd 10c &&
 
@@ -3137,7 +3137,7 @@ test_expect_success '10d-setup: Delete untracked with dir rename/rename(2to1)' '
 	)
 '
 
-test_expect_failure '10d-check: Delete untracked with dir rename/rename(2to1)' '
+test_expect_success '10d-check: Delete untracked with dir rename/rename(2to1)' '
 	(
 		cd 10d &&
 
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 27/31] merge-recursive: fix overwriting dirty files involved in renames
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (25 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 26/31] merge-recursive: avoid clobbering untracked files with " Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-05 20:52   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 28/31] merge-recursive: fix remaining directory rename + dirty overwrite cases Elijah Newren
                   ` (4 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

This fixes an issue that existed before my directory rename detection
patches that affects both normal renames and renames implied by
directory rename detection.  Additional codepaths that only affect
overwriting of directy files that are involved in directory rename
detection will be added in a subsequent commit.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 85 ++++++++++++++++++++++++++++---------
 merge-recursive.h                   |  2 +
 t/t3501-revert-cherry-pick.sh       |  2 +-
 t/t6043-merge-rename-directories.sh |  2 +-
 t/t7607-merge-overwrite.sh          |  2 +-
 unpack-trees.c                      |  4 +-
 unpack-trees.h                      |  4 ++
 7 files changed, 77 insertions(+), 24 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 39e161e094..fba1a0d207 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -334,32 +334,37 @@ static void init_tree_desc_from_tree(struct tree_desc *desc, struct tree *tree)
 	init_tree_desc(desc, tree->buffer, tree->size);
 }
 
-static int git_merge_trees(int index_only,
+static int git_merge_trees(struct merge_options *o,
 			   struct tree *common,
 			   struct tree *head,
 			   struct tree *merge)
 {
 	int rc;
 	struct tree_desc t[3];
-	struct unpack_trees_options opts;
 
-	memset(&opts, 0, sizeof(opts));
-	if (index_only)
-		opts.index_only = 1;
+	memset(&o->unpack_opts, 0, sizeof(o->unpack_opts));
+	if (o->call_depth)
+		o->unpack_opts.index_only = 1;
 	else
-		opts.update = 1;
-	opts.merge = 1;
-	opts.head_idx = 2;
-	opts.fn = threeway_merge;
-	opts.src_index = &the_index;
-	opts.dst_index = &the_index;
-	setup_unpack_trees_porcelain(&opts, "merge");
+		o->unpack_opts.update = 1;
+	o->unpack_opts.merge = 1;
+	o->unpack_opts.head_idx = 2;
+	o->unpack_opts.fn = threeway_merge;
+	o->unpack_opts.src_index = &the_index;
+	o->unpack_opts.dst_index = &the_index;
+	setup_unpack_trees_porcelain(&o->unpack_opts, "merge");
 
 	init_tree_desc_from_tree(t+0, common);
 	init_tree_desc_from_tree(t+1, head);
 	init_tree_desc_from_tree(t+2, merge);
 
-	rc = unpack_trees(3, t, &opts);
+	rc = unpack_trees(3, t, &o->unpack_opts);
+	/*
+	 * unpack_trees NULLifies src_index, but it's used in verify_uptodate,
+	 * so set to the new index which will usually have modification
+	 * timestamp info copied over.
+	 */
+	o->unpack_opts.src_index = &the_index;
 	cache_tree_free(&active_cache_tree);
 	return rc;
 }
@@ -792,6 +797,20 @@ static int would_lose_untracked(const char *path)
 	return !was_tracked(path) && file_exists(path);
 }
 
+static int was_dirty(struct merge_options *o, const char *path)
+{
+	struct cache_entry *ce;
+	int dirty = 1;
+
+	if (o->call_depth || !was_tracked(path))
+		return !dirty;
+
+	ce = cache_file_exists(path, strlen(path), ignore_case);
+	dirty = (ce->ce_stat_data.sd_mtime.sec > 0 &&
+		 verify_uptodate(ce, &o->unpack_opts) != 0);
+	return dirty;
+}
+
 static int make_room_for_path(struct merge_options *o, const char *path)
 {
 	int status, i;
@@ -2654,6 +2673,7 @@ static int handle_modify_delete(struct merge_options *o,
 
 static int merge_content(struct merge_options *o,
 			 const char *path,
+			 int file_in_way,
 			 struct object_id *o_oid, int o_mode,
 			 struct object_id *a_oid, int a_mode,
 			 struct object_id *b_oid, int b_mode,
@@ -2728,7 +2748,7 @@ static int merge_content(struct merge_options *o,
 				return -1;
 	}
 
-	if (df_conflict_remains) {
+	if (df_conflict_remains || file_in_way) {
 		char *new_path;
 		if (o->call_depth) {
 			remove_file_from_cache(path);
@@ -2762,6 +2782,30 @@ static int merge_content(struct merge_options *o,
 	return mfi.clean;
 }
 
+static int conflict_rename_normal(struct merge_options *o,
+				  const char *path,
+				  struct object_id *o_oid, unsigned int o_mode,
+				  struct object_id *a_oid, unsigned int a_mode,
+				  struct object_id *b_oid, unsigned int b_mode,
+				  struct rename_conflict_info *ci)
+{
+	int clean_merge;
+	int file_in_the_way = 0;
+
+	if (was_dirty(o, path)) {
+		file_in_the_way = 1;
+		output(o, 1, _("Refusing to lose dirty file at %s"), path);
+	}
+
+	/* Merge the content and write it out */
+	clean_merge = merge_content(o, path, file_in_the_way,
+				    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
+				    ci);
+	if (clean_merge > 0 && file_in_the_way)
+		clean_merge = 0;
+	return clean_merge;
+}
+
 /* Per entry merge function */
 static int process_entry(struct merge_options *o,
 			 const char *path, struct stage_data *entry)
@@ -2781,9 +2825,12 @@ static int process_entry(struct merge_options *o,
 		switch (conflict_info->rename_type) {
 		case RENAME_NORMAL:
 		case RENAME_ONE_FILE_TO_ONE:
-			clean_merge = merge_content(o, path,
-						    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
-						    conflict_info);
+			clean_merge = conflict_rename_normal(o,
+							     path,
+							     o_oid, o_mode,
+							     a_oid, a_mode,
+							     b_oid, b_mode,
+							     conflict_info);
 			break;
 		case RENAME_DIR:
 			clean_merge = 1;
@@ -2879,7 +2926,7 @@ static int process_entry(struct merge_options *o,
 	} else if (a_oid && b_oid) {
 		/* Case C: Added in both (check for same permissions) and */
 		/* case D: Modified in both, but differently. */
-		clean_merge = merge_content(o, path,
+		clean_merge = merge_content(o, path, 0 /* file_in_way */,
 					    o_oid, o_mode, a_oid, a_mode, b_oid, b_mode,
 					    NULL);
 	} else if (!o_oid && !a_oid && !b_oid) {
@@ -2920,7 +2967,7 @@ int merge_trees(struct merge_options *o,
 		return 1;
 	}
 
-	code = git_merge_trees(o->call_depth, common, head, merge);
+	code = git_merge_trees(o, common, head, merge);
 
 	if (code != 0) {
 		if (show(o, 4) || o->call_depth)
diff --git a/merge-recursive.h b/merge-recursive.h
index e1be27f57c..a557201a50 100644
--- a/merge-recursive.h
+++ b/merge-recursive.h
@@ -1,6 +1,7 @@
 #ifndef MERGE_RECURSIVE_H
 #define MERGE_RECURSIVE_H
 
+#include "unpack-trees.h"
 #include "string-list.h"
 
 struct merge_options {
@@ -27,6 +28,7 @@ struct merge_options {
 	struct strbuf obuf;
 	struct hashmap current_file_dir_set;
 	struct string_list df_conflict_file_set;
+	struct unpack_trees_options unpack_opts;
 };
 
 struct dir_rename_entry {
diff --git a/t/t3501-revert-cherry-pick.sh b/t/t3501-revert-cherry-pick.sh
index 783bdbf59d..0d89f6d0f6 100755
--- a/t/t3501-revert-cherry-pick.sh
+++ b/t/t3501-revert-cherry-pick.sh
@@ -141,7 +141,7 @@ test_expect_success 'cherry-pick "-" works with arguments' '
 	test_cmp expect actual
 '
 
-test_expect_failure 'cherry-pick works with dirty renamed file' '
+test_expect_success 'cherry-pick works with dirty renamed file' '
 	test_commit to-rename &&
 	git checkout -b unrelated &&
 	test_commit unrelated &&
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index abb5b20f6b..89b2eacf38 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3290,7 +3290,7 @@ test_expect_success '11a-setup: Avoid losing dirty contents with simple rename'
 	)
 '
 
-test_expect_failure '11a-check: Avoid losing dirty contents with simple rename' '
+test_expect_success '11a-check: Avoid losing dirty contents with simple rename' '
 	(
 		cd 11a &&
 
diff --git a/t/t7607-merge-overwrite.sh b/t/t7607-merge-overwrite.sh
index 9c422bcd7c..dd8ab7ede1 100755
--- a/t/t7607-merge-overwrite.sh
+++ b/t/t7607-merge-overwrite.sh
@@ -92,7 +92,7 @@ test_expect_success 'will not overwrite removed file with staged changes' '
 	test_cmp important c1.c
 '
 
-test_expect_failure 'will not overwrite unstaged changes in renamed file' '
+test_expect_success 'will not overwrite unstaged changes in renamed file' '
 	git reset --hard c1 &&
 	git mv c1.c other.c &&
 	git commit -m rename &&
diff --git a/unpack-trees.c b/unpack-trees.c
index 96c3327f19..f99fe9b9fd 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1486,8 +1486,8 @@ static int verify_uptodate_1(const struct cache_entry *ce,
 		add_rejected_path(o, error_type, ce->name);
 }
 
-static int verify_uptodate(const struct cache_entry *ce,
-			   struct unpack_trees_options *o)
+int verify_uptodate(const struct cache_entry *ce,
+		    struct unpack_trees_options *o)
 {
 	if (!o->skip_sparse_checkout && (ce->ce_flags & CE_NEW_SKIP_WORKTREE))
 		return 0;
diff --git a/unpack-trees.h b/unpack-trees.h
index 6c48117b84..41178ada94 100644
--- a/unpack-trees.h
+++ b/unpack-trees.h
@@ -1,6 +1,7 @@
 #ifndef UNPACK_TREES_H
 #define UNPACK_TREES_H
 
+#include "tree-walk.h"
 #include "string-list.h"
 
 #define MAX_UNPACK_TREES 8
@@ -78,6 +79,9 @@ struct unpack_trees_options {
 extern int unpack_trees(unsigned n, struct tree_desc *t,
 		struct unpack_trees_options *options);
 
+int verify_uptodate(const struct cache_entry *ce,
+		    struct unpack_trees_options *o);
+
 int threeway_merge(const struct cache_entry * const *stages,
 		   struct unpack_trees_options *o);
 int twoway_merge(const struct cache_entry * const *src,
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 28/31] merge-recursive: fix remaining directory rename + dirty overwrite cases
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (26 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 27/31] merge-recursive: fix overwriting dirty files involved in renames Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-05 21:52   ` Stefan Beller
  2018-01-30 23:25 ` [PATCH v7 29/31] directory rename detection: new testcases showcasing a pair of bugs Elijah Newren
                   ` (3 subsequent siblings)
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 26 +++++++++++++++++++++++---
 t/t6043-merge-rename-directories.sh |  8 ++++----
 2 files changed, 27 insertions(+), 7 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index fba1a0d207..62e4266d21 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1320,11 +1320,23 @@ static int handle_file(struct merge_options *o,
 
 	add = filespec_from_entry(&other, dst_entry, stage ^ 1);
 	if (add) {
+		int ren_src_was_dirty = was_dirty(o, rename->path);
 		char *add_name = unique_path(o, rename->path, other_branch);
 		if (update_file(o, 0, &add->oid, add->mode, add_name))
 			return -1;
 
-		remove_file(o, 0, rename->path, 0);
+		if (ren_src_was_dirty) {
+			output(o, 1, _("Refusing to lose dirty file at %s"),
+			       rename->path);
+		}
+		/*
+		 * Stupid double negatives in remove_file; it somehow manages
+		 * to repeatedly mess me up.  So, just for myself:
+		 *    1) update_wd iff !ren_src_was_dirty.
+		 *    2) no_wd iff !update_wd
+		 *    3) so, no_wd == !!ren_src_was_dirty == ren_src_was_dirty
+		 */
+		remove_file(o, 0, rename->path, ren_src_was_dirty);
 		dst_name = unique_path(o, rename->path, cur_branch);
 	} else {
 		if (dir_in_way(rename->path, !o->call_depth, 0)) {
@@ -1462,7 +1474,10 @@ static int conflict_rename_rename_2to1(struct merge_options *o,
 		char *new_path2 = unique_path(o, path, ci->branch2);
 		output(o, 1, _("Renaming %s to %s and %s to %s instead"),
 		       a->path, new_path1, b->path, new_path2);
-		if (would_lose_untracked(path))
+		if (was_dirty(o, path))
+			output(o, 1, _("Refusing to lose dirty file at %s"),
+			       path);
+		else if (would_lose_untracked(path))
 			/*
 			 * Only way we get here is if both renames were from
 			 * a directory rename AND user had an untracked file
@@ -2042,6 +2057,7 @@ static void apply_directory_rename_modifications(struct merge_options *o,
 {
 	struct string_list_item *item;
 	int stage = (tree == a_tree ? 2 : 3);
+	int update_wd;
 
 	/*
 	 * In all cases where we can do directory rename detection,
@@ -2052,7 +2068,11 @@ static void apply_directory_rename_modifications(struct merge_options *o,
 	 * saying the file would have been overwritten), but it might
 	 * be dirty, though.
 	 */
-	remove_file(o, 1, pair->two->path, 0 /* no_wd */);
+	update_wd = !was_dirty(o, pair->two->path);
+	if (!update_wd)
+		output(o, 1, _("Refusing to lose dirty file at %s"),
+		       pair->two->path);
+	remove_file(o, 1, pair->two->path, !update_wd);
 
 	/* Find or create a new re->dst_entry */
 	item = string_list_lookup(entries, new_path);
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 89b2eacf38..a34c57d986 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3362,7 +3362,7 @@ test_expect_success '11b-setup: Avoid losing dirty file involved in directory re
 	)
 '
 
-test_expect_failure '11b-check: Avoid losing dirty file involved in directory rename' '
+test_expect_success '11b-check: Avoid losing dirty file involved in directory rename' '
 	(
 		cd 11b &&
 
@@ -3504,7 +3504,7 @@ test_expect_success '11d-setup: Avoid losing not-uptodate with rename + D/F conf
 	)
 '
 
-test_expect_failure '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
+test_expect_success '11d-check: Avoid losing not-uptodate with rename + D/F conflict' '
 	(
 		cd 11d &&
 
@@ -3583,7 +3583,7 @@ test_expect_success '11e-setup: Avoid deleting not-uptodate with dir rename/rena
 	)
 '
 
-test_expect_failure '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
+test_expect_success '11e-check: Avoid deleting not-uptodate with dir rename/rename(1to2)/add' '
 	(
 		cd 11e &&
 
@@ -3659,7 +3659,7 @@ test_expect_success '11f-setup: Avoid deleting not-uptodate with dir rename/rena
 	)
 '
 
-test_expect_failure '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
+test_expect_success '11f-check: Avoid deleting not-uptodate with dir rename/rename(2to1)' '
 	(
 		cd 11f &&
 
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 29/31] directory rename detection: new testcases showcasing a pair of bugs
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (27 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 28/31] merge-recursive: fix remaining directory rename + dirty overwrite cases Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 30/31] merge-recursive: avoid spurious rename/rename conflict from dir renames Elijah Newren
                   ` (2 subsequent siblings)
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

Add a testcase showing spurious rename/rename(1to2) conflicts occurring
due to directory rename detection.

Also add a pair of testcases dealing with moving directory hierarchies
around that were suggested by Stefan Beller as "food for thought" during
his review of an earlier patch series, but which actually uncovered a
bug.  Round things out with a test that is a cross between the two
testcases that showed existing bugs in order to make sure we aren't
merely addressing problems in isolation but in general.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 t/t6043-merge-rename-directories.sh | 296 ++++++++++++++++++++++++++++++++++++
 1 file changed, 296 insertions(+)

diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index a34c57d986..3d292f0c5f 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -159,6 +159,7 @@ test_expect_success '1b-check: Merge a directory with another' '
 # Testcase 1c, Transitive renaming
 #   (Related to testcases 3a and 6d -- when should a transitive rename apply?)
 #   (Related to testcases 9c and 9d -- can transitivity repeat?)
+#   (Related to testcase 12b -- joint-transitivity?)
 #   Commit O: z/{b,c},   x/d
 #   Commit A: y/{b,c},   x/d
 #   Commit B: z/{b,c,d}
@@ -2863,6 +2864,68 @@ test_expect_failure '9g-check: Renamed directory that only contained immediate s
 	)
 '
 
+# Testcase 9h, Avoid implicit rename if involved as source on other side
+#   (Extremely closely related to testcase 3a)
+#   Commit O: z/{b,c,d_1}
+#   Commit A: z/{b,c,d_2}
+#   Commit B: y/{b,c}, x/d_1
+#   Expected: y/{b,c}, x/d_2
+#   NOTE: If we applied the z/ -> y/ rename to z/d, then we'd end up with
+#         a rename/rename(1to2) conflict (z/d -> y/d vs. x/d)
+test_expect_success '9h-setup: Avoid dir rename on merely modified path' '
+	test_create_repo 9h &&
+	(
+		cd 9h &&
+
+		mkdir z &&
+		echo b >z/b &&
+		echo c >z/c &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nd\n" >z/d &&
+		git add z &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		test_tick &&
+		echo more >>z/d &&
+		git add z/d &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		mkdir y &&
+		mkdir x &&
+		git mv z/b y/ &&
+		git mv z/c y/ &&
+		git mv z/d x/ &&
+		rmdir z &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '9h-check: Avoid dir rename on merely modified path' '
+	(
+		cd 9h &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 3 out &&
+
+		git rev-parse >actual \
+			HEAD:y/b HEAD:y/c HEAD:x/d &&
+		git rev-parse >expect \
+			O:z/b    O:z/c    A:z/d &&
+		test_cmp expect actual
+	)
+'
+
 ###########################################################################
 # Rules suggested by section 9:
 #
@@ -3696,4 +3759,237 @@ test_expect_success '11f-check: Avoid deleting not-uptodate with dir rename/rena
 	)
 '
 
+###########################################################################
+# SECTION 12: Everything else
+#
+# Tests suggested by others.  Tests added after implementation completed
+# and submitted.  Grab bag.
+###########################################################################
+
+# Testcase 12a, Moving one directory hierarchy into another
+#   (Related to testcase 9a)
+#   Commit O: node1/{leaf1,leaf2}, node2/{leaf3,leaf4}
+#   Commit A: node1/{leaf1,leaf2,node2/{leaf3,leaf4}}
+#   Commit B: node1/{leaf1,leaf2,leaf5}, node2/{leaf3,leaf4,leaf6}
+#   Expected: node1/{leaf1,leaf2,leaf5,node2/{leaf3,leaf4,leaf6}}
+
+test_expect_success '12a-setup: Moving one directory hierarchy into another' '
+	test_create_repo 12a &&
+	(
+		cd 12a &&
+
+		mkdir -p node1 node2 &&
+		echo leaf1 >node1/leaf1 &&
+		echo leaf2 >node1/leaf2 &&
+		echo leaf3 >node2/leaf3 &&
+		echo leaf4 >node2/leaf4 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv node2/ node1/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		echo leaf5 >node1/leaf5 &&
+		echo leaf6 >node2/leaf6 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_success '12a-check: Moving one directory hierarchy into another' '
+	(
+		cd 12a &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 6 out &&
+
+		git rev-parse >actual \
+			HEAD:node1/leaf1 HEAD:node1/leaf2 HEAD:node1/leaf5 \
+			HEAD:node1/node2/leaf3 \
+			HEAD:node1/node2/leaf4 \
+			HEAD:node1/node2/leaf6 &&
+		git rev-parse >expect \
+			O:node1/leaf1    O:node1/leaf2    B:node1/leaf5 \
+			O:node2/leaf3 \
+			O:node2/leaf4 \
+			B:node2/leaf6 &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 12b, Moving two directory hierarchies into each other
+#   (Related to testcases 1c and 12c)
+#   Commit O: node1/{leaf1, leaf2}, node2/{leaf3, leaf4}
+#   Commit A: node1/{leaf1, leaf2, node2/{leaf3, leaf4}}
+#   Commit B: node2/{leaf3, leaf4, node1/{leaf1, leaf2}}
+#   Expected: node1/node2/node1/{leaf1, leaf2},
+#             node2/node1/node2/{leaf3, leaf4}
+#   NOTE: Without directory renames, we would expect
+#                   node2/node1/{leaf1, leaf2},
+#                   node1/node2/{leaf3, leaf4}
+#         with directory rename detection, we note that
+#             commit A renames node2/ -> node1/node2/
+#             commit B renames node1/ -> node2/node1/
+#         therefore, applying those directory renames to the initial result
+#         (making all four paths experience a transitive renaming), yields
+#         the expected result.
+#
+#         You may ask, is it weird to have two directories rename each other?
+#         To which, I can do no more than shrug my shoulders and say that
+#         even simple rules give weird results when given weird inputs.
+
+test_expect_success '12b-setup: Moving one directory hierarchy into another' '
+	test_create_repo 12b &&
+	(
+		cd 12b &&
+
+		mkdir -p node1 node2 &&
+		echo leaf1 >node1/leaf1 &&
+		echo leaf2 >node1/leaf2 &&
+		echo leaf3 >node2/leaf3 &&
+		echo leaf4 >node2/leaf4 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv node2/ node1/ &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv node1/ node2/ &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '12b-check: Moving one directory hierarchy into another' '
+	(
+		cd 12b &&
+
+		git checkout A^0 &&
+
+		git merge -s recursive B^0 &&
+
+		git ls-files -s >out &&
+		test_line_count = 4 out &&
+
+		git rev-parse >actual \
+			HEAD:node1/node2/node1/leaf1 \
+			HEAD:node1/node2/node1/leaf2 \
+			HEAD:node2/node1/node2/leaf3 \
+			HEAD:node2/node1/node2/leaf4 &&
+		git rev-parse >expect \
+			O:node1/leaf1 \
+			O:node1/leaf2 \
+			O:node2/leaf3 \
+			O:node2/leaf4 &&
+		test_cmp expect actual
+	)
+'
+
+# Testcase 12c, Moving two directory hierarchies into each other w/ content merge
+#   (Related to testcase 12b)
+#   Commit O: node1/{       leaf1_1, leaf2_1}, node2/{leaf3_1, leaf4_1}
+#   Commit A: node1/{       leaf1_2, leaf2_2,  node2/{leaf3_2, leaf4_2}}
+#   Commit B: node2/{node1/{leaf1_3, leaf2_3},        leaf3_3, leaf4_3}
+#   Expected: Content merge conflicts for each of:
+#               node1/node2/node1/{leaf1, leaf2},
+#               node2/node1/node2/{leaf3, leaf4}
+#   NOTE: This is *exactly* like 12c, except that every path is modified on
+#         each side of the merge.
+
+test_expect_success '12c-setup: Moving one directory hierarchy into another w/ content merge' '
+	test_create_repo 12c &&
+	(
+		cd 12c &&
+
+		mkdir -p node1 node2 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf1\n" >node1/leaf1 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf2\n" >node1/leaf2 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf3\n" >node2/leaf3 &&
+		printf "1\n2\n3\n4\n5\n6\n7\n8\nleaf4\n" >node2/leaf4 &&
+		git add node1 node2 &&
+		test_tick &&
+		git commit -m "O" &&
+
+		git branch O &&
+		git branch A &&
+		git branch B &&
+
+		git checkout A &&
+		git mv node2/ node1/ &&
+		for i in `git ls-files`; do echo side A >>$i; done &&
+		git add -u &&
+		test_tick &&
+		git commit -m "A" &&
+
+		git checkout B &&
+		git mv node1/ node2/ &&
+		for i in `git ls-files`; do echo side B >>$i; done &&
+		git add -u &&
+		test_tick &&
+		git commit -m "B"
+	)
+'
+
+test_expect_failure '12c-check: Moving one directory hierarchy into another w/ content merge' '
+	(
+		cd 12c &&
+
+		git checkout A^0 &&
+
+		test_must_fail git merge -s recursive B^0 &&
+
+		git ls-files -u >out &&
+		test_line_count = 12 out &&
+
+		git rev-parse >actual \
+			:1:node1/node2/node1/leaf1 \
+			:1:node1/node2/node1/leaf2 \
+			:1:node2/node1/node2/leaf3 \
+			:1:node2/node1/node2/leaf4 \
+			:2:node1/node2/node1/leaf1 \
+			:2:node1/node2/node1/leaf2 \
+			:2:node2/node1/node2/leaf3 \
+			:2:node2/node1/node2/leaf4 \
+			:3:node1/node2/node1/leaf1 \
+			:3:node1/node2/node1/leaf2 \
+			:3:node2/node1/node2/leaf3 \
+			:3:node2/node1/node2/leaf4 &&
+		git rev-parse >expect \
+			O:node1/leaf1 \
+			O:node1/leaf2 \
+			O:node2/leaf3 \
+			O:node2/leaf4 \
+			A:node1/leaf1 \
+			A:node1/leaf2 \
+			A:node1/node2/leaf3 \
+			A:node1/node2/leaf4 \
+			B:node2/node1/leaf1 \
+			B:node2/node1/leaf2 \
+			B:node2/leaf3 \
+			B:node2/leaf4 &&
+		test_cmp expect actual
+	)
+'
+
 test_done
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 30/31] merge-recursive: avoid spurious rename/rename conflict from dir renames
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (28 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 29/31] directory rename detection: new testcases showcasing a pair of bugs Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-01-30 23:25 ` [PATCH v7 31/31] merge-recursive: ensure we write updates for directory-renamed file Elijah Newren
  2018-01-30 23:41 ` [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

If a file on one side of history was renamed, and merely modified on the
other side, then applying a directory rename to the modified side gives us
a rename/rename(1to2) conflict.  We should only apply directory renames to
pairs representing either adds or renames.

Making this change means that a directory rename testcase that was
previously reported as a rename/delete conflict will now be reported as a
modify/delete conflict.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   |  4 +--
 t/t6043-merge-rename-directories.sh | 55 +++++++++++++++++--------------------
 2 files changed, 27 insertions(+), 32 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 62e4266d21..97859e1ab7 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -1960,7 +1960,7 @@ static void compute_collisions(struct hashmap *collisions,
 		char *new_path;
 		struct diff_filepair *pair = pairs->queue[i];
 
-		if (pair->status == 'D')
+		if (pair->status != 'A' && pair->status != 'R')
 			continue;
 		dir_rename_ent = check_dir_renamed(pair->two->path,
 						   dir_renames);
@@ -2187,7 +2187,7 @@ static struct string_list *get_renames(struct merge_options *o,
 		struct diff_filepair *pair = pairs->queue[i];
 		char *new_path; /* non-NULL only with directory renames */
 
-		if (pair->status == 'D') {
+		if (pair->status != 'A' && pair->status != 'R') {
 			diff_free_filepair(pair);
 			continue;
 		}
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index 3d292f0c5f..f349f69984 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -2070,18 +2070,23 @@ test_expect_success '8b-check: Dual-directory rename, one into the others way, w
 	)
 '
 
-# Testcase 8c, rename+modify/delete
-#   (Related to testcases 5b and 8d)
+# Testcase 8c, modify/delete or rename+modify/delete?
+#   (Related to testcases 5b, 8d, and 9h)
 #   Commit O: z/{b,c,d}
 #   Commit A: y/{b,c}
 #   Commit B: z/{b,c,d_modified,e}
-#   Expected: y/{b,c,e}, CONFLICT(rename+modify/delete: x/d -> y/d or deleted)
+#   Expected: y/{b,c,e}, CONFLICT(modify/delete: on z/d)
 #
-#   Note: This testcase doesn't present any concerns for me...until you
-#         compare it with testcases 5b and 8d.  See notes in 8d for more
-#         details.
-
-test_expect_success '8c-setup: rename+modify/delete' '
+#   Note: It could easily be argued that the correct resolution here is
+#         y/{b,c,e}, CONFLICT(rename/delete: z/d -> y/d vs deleted)
+#         and that the modifed version of d should be present in y/ after
+#         the merge, just marked as conflicted.  Indeed, I previously did
+#         argue that.  But applying directory renames to the side of
+#         history where a file is merely modified results in spurious
+#         rename/rename(1to2) conflicts -- see testcase 9h.  See also
+#         notes in 8d.
+
+test_expect_success '8c-setup: modify/delete or rename+modify/delete?' '
 	test_create_repo 8c &&
 	(
 		cd 8c &&
@@ -2114,32 +2119,32 @@ test_expect_success '8c-setup: rename+modify/delete' '
 	)
 '
 
-test_expect_success '8c-check: rename+modify/delete' '
+test_expect_success '8c-check: modify/delete or rename+modify/delete' '
 	(
 		cd 8c &&
 
 		git checkout A^0 &&
 
 		test_must_fail git merge -s recursive B^0 >out &&
-		test_i18ngrep "CONFLICT (rename/delete).* z/d.*y/d" out &&
+		test_i18ngrep "CONFLICT (modify/delete).* z/d" out &&
 
 		git ls-files -s >out &&
-		test_line_count = 4 out &&
+		test_line_count = 5 out &&
 		git ls-files -u >out &&
-		test_line_count = 1 out &&
+		test_line_count = 2 out &&
 		git ls-files -o >out &&
 		test_line_count = 1 out &&
 
 		git rev-parse >actual \
-			:0:y/b :0:y/c :0:y/e :3:y/d &&
+			:0:y/b :0:y/c :0:y/e :1:z/d :3:z/d &&
 		git rev-parse >expect \
-			 O:z/b  O:z/c  B:z/e  B:z/d &&
+			 O:z/b  O:z/c  B:z/e  O:z/d  B:z/d &&
 		test_cmp expect actual &&
 
-		test_must_fail git rev-parse :1:y/d &&
-		test_must_fail git rev-parse :2:y/d &&
-		git ls-files -s y/d | grep ^100755 &&
-		test_path_is_file y/d
+		test_must_fail git rev-parse :2:z/d &&
+		git ls-files -s z/d | grep ^100755 &&
+		test_path_is_file z/d &&
+		test_path_is_missing y/d
 	)
 '
 
@@ -2153,16 +2158,6 @@ test_expect_success '8c-check: rename+modify/delete' '
 #
 #   Note: It would also be somewhat reasonable to resolve this as
 #             y/{b,c,e}, CONFLICT(rename/delete: x/d -> y/d or deleted)
-#   The logic being that the only difference between this testcase and 8c
-#   is that there is no modification to d.  That suggests that instead of a
-#   rename/modify vs. delete conflict, we should just have a rename/delete
-#   conflict, otherwise we are being inconsistent.
-#
-#   However...as far as consistency goes, we didn't report a conflict for
-#   path d_1 in testcase 5b due to a different file being in the way.  So,
-#   we seem to be forced to have cases where users can change things
-#   slightly and get what they may perceive as inconsistent results.  It
-#   would be nice to avoid that, but I'm not sure I see how.
 #
 #   In this case, I'm leaning towards: commit A was the one that deleted z/d
 #   and it did the rename of z to y, so the two "conflicts" (rename vs.
@@ -2907,7 +2902,7 @@ test_expect_success '9h-setup: Avoid dir rename on merely modified path' '
 	)
 '
 
-test_expect_failure '9h-check: Avoid dir rename on merely modified path' '
+test_expect_success '9h-check: Avoid dir rename on merely modified path' '
 	(
 		cd 9h &&
 
@@ -3951,7 +3946,7 @@ test_expect_success '12c-setup: Moving one directory hierarchy into another w/ c
 	)
 '
 
-test_expect_failure '12c-check: Moving one directory hierarchy into another w/ content merge' '
+test_expect_success '12c-check: Moving one directory hierarchy into another w/ content merge' '
 	(
 		cd 12c &&
 
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* [PATCH v7 31/31] merge-recursive: ensure we write updates for directory-renamed file
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (29 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 30/31] merge-recursive: avoid spurious rename/rename conflict from dir renames Elijah Newren
@ 2018-01-30 23:25 ` Elijah Newren
  2018-02-05 21:58   ` Stefan Beller
  2018-01-30 23:41 ` [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
  31 siblings, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:25 UTC (permalink / raw)
  To: gitster; +Cc: git, sbeller, szeder.dev, jrnieder, peff, Elijah Newren

When a file is present in HEAD before the merge and the other side of the
merge does not modify that file, we try to avoid re-writing the file and
making it stat-dirty.  However, when a file is present in HEAD before the
merge and was in a directory that was renamed by the other side of the
merge, we have to move the file to a new location and re-write it.
Update the code that checks whether we can skip the update to also work in
the presence of directory renames.

Signed-off-by: Elijah Newren <newren@gmail.com>
---
 merge-recursive.c                   | 4 +---
 t/t6043-merge-rename-directories.sh | 2 +-
 2 files changed, 2 insertions(+), 4 deletions(-)

diff --git a/merge-recursive.c b/merge-recursive.c
index 97859e1ab7..1e40aff51d 100644
--- a/merge-recursive.c
+++ b/merge-recursive.c
@@ -2741,7 +2741,6 @@ static int merge_content(struct merge_options *o,
 
 	if (mfi.clean && !df_conflict_remains &&
 	    oid_eq(&mfi.oid, a_oid) && mfi.mode == a_mode) {
-		int path_renamed_outside_HEAD;
 		output(o, 3, _("Skipped %s (merged same as existing)"), path);
 		/*
 		 * The content merge resulted in the same file contents we
@@ -2749,8 +2748,7 @@ static int merge_content(struct merge_options *o,
 		 * are recorded at the correct path (which may not be true
 		 * if the merge involves a rename).
 		 */
-		path_renamed_outside_HEAD = !path2 || !strcmp(path, path2);
-		if (!path_renamed_outside_HEAD) {
+		if (was_tracked(path)) {
 			add_cacheinfo(o, mfi.mode, &mfi.oid, path,
 				      0, (!o->call_depth), 0);
 			return mfi.clean;
diff --git a/t/t6043-merge-rename-directories.sh b/t/t6043-merge-rename-directories.sh
index f349f69984..500fdd7755 100755
--- a/t/t6043-merge-rename-directories.sh
+++ b/t/t6043-merge-rename-directories.sh
@@ -3876,7 +3876,7 @@ test_expect_success '12b-setup: Moving one directory hierarchy into another' '
 	)
 '
 
-test_expect_failure '12b-check: Moving one directory hierarchy into another' '
+test_expect_success '12b-check: Moving one directory hierarchy into another' '
 	(
 		cd 12b &&
 
-- 
2.16.1.106.gf69932adfe


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 00/31] Add directory rename detection to git
  2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
                   ` (30 preceding siblings ...)
  2018-01-30 23:25 ` [PATCH v7 31/31] merge-recursive: ensure we write updates for directory-renamed file Elijah Newren
@ 2018-01-30 23:41 ` Elijah Newren
  31 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-01-30 23:41 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Git Mailing List, Stefan Beller, SZEDER Gábor,
	Jonathan Nieder, Jeff King, Elijah Newren

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> This patchset introduces directory rename detection to merge-recursive.  See

And a meta question: Should I be trying to submit this feature some other way?

I've really appreciated all the reviews[1], and the testcases are
undoubtedly better because of them.  The code too, but the reviews
have focused almost exclusively on the testcases so far.  Is
merge-recursive just too hairy and my changes too large for folks to
stomach tackling?  Is there something different I could do?

I'm curious for any thoughts on the matter.

Thanks!
Elijah

[1] Especially Stefan who looked through all the patches, but also for
spot reviews from Junio, Johannes Schindelin, Johanness Sixt, Adam
Dinwoodie, and SZEDER Gábor.  Also, a few interested comments from
Philip Oakley and Jacob Keller.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 12/31] merge-recursive: move the get_renames() function
  2018-01-30 23:25 ` [PATCH v7 12/31] merge-recursive: move the get_renames() function Elijah Newren
@ 2018-02-02 23:27   ` Stefan Beller
       [not found]     ` <CABPp-BFDgDDa_fPSFJQUSzR1k5-ix0SWrviUPFu+SCoyWfG5cQ@mail.gmail.com>
  0 siblings, 1 reply; 63+ messages in thread
From: Stefan Beller @ 2018-02-02 23:27 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> I want to re-use some other functions in the file without moving those
> other functions or dealing with a handful of annoying split function
> declarations and definitions.

    git log --grep "I want" --format="%ad %an %s"

Nowadays we write commit messages slightly differently, more
passively, but I let others comment on the wording of the commit
message if they feel inclined to do so. I can fully understand what
this commit is all about.


> +               struct string_list_item *item;
> +               struct rename *re;
> +               struct diff_filepair *pair = diff_queued_diff.queue[i];
> +
> +               if (pair->status != 'R') {

It is not an exact move, you snuck in an empty line
between the variable declarations and this condition. :)
No big deal, actually I like it for readability.
(related https://public-inbox.org/git/CAGZ79kZX2FsEjD04zr5-oufU6dLhiOhBkxv4u8VEwL0OPRFtiA@mail.gmail.com/)

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 13/31] merge-recursive: introduce new functions to handle rename logic
  2018-01-30 23:25 ` [PATCH v7 13/31] merge-recursive: introduce new functions to handle rename logic Elijah Newren
@ 2018-02-02 23:36   ` Stefan Beller
  0 siblings, 0 replies; 63+ messages in thread
From: Stefan Beller @ 2018-02-02 23:36 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> The amount of logic in merge_trees() relative to renames was just a few
> lines, but split it out into new handle_renames() and cleanup_renames()
> functions to prepare for additional logic to be added to each.  No code or
> logic changes, just a new place to put stuff for when the rename detection
> gains additional checks.
>
> Note that process_renames() records pointers to various information (such
> as diff_filepairs) into rename_conflict_info structs.  Even though the
> rename string_lists are not directly used once handle_renames() completes,
> we should not immediately free the lists at the end of that function
> because they store the information referenced in the rename_conflict_info,
> which is used later in process_entry().  Thus the reason for a separate
> cleanup_renames().
>
> Signed-off-by: Elijah Newren <newren@gmail.com>

Reviewed-by: Stefan Beller <sbeller@google.com>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 14/31] merge-recursive: fix leaks of allocated renames and diff_filepairs
  2018-01-30 23:25 ` [PATCH v7 14/31] merge-recursive: fix leaks of allocated renames and diff_filepairs Elijah Newren
@ 2018-02-02 23:41   ` Stefan Beller
  0 siblings, 0 replies; 63+ messages in thread
From: Stefan Beller @ 2018-02-02 23:41 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> get_renames() has always zero'ed out diff_queued_diff.nr while only
> manually free'ing diff_filepairs that did not correspond to renames.
> Further, it allocated struct renames that were tucked away in the
> return string_list.  Make sure all of these are deallocated when we
> are done with them.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>

Thanks for spotting the memleaks and fixing them in here.

If this series takes longer to review/cook, this patch
could go in separately? (with the last patch undone :/)

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 15/31] merge-recursive: make !o->detect_rename codepath more obvious
  2018-01-30 23:25 ` [PATCH v7 15/31] merge-recursive: make !o->detect_rename codepath more obvious Elijah Newren
@ 2018-02-02 23:48   ` Stefan Beller
  0 siblings, 0 replies; 63+ messages in thread
From: Stefan Beller @ 2018-02-02 23:48 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> Previously, if !o->detect_rename then get_renames() would return an
> empty string_list, and then process_renames() would have nothing to
> iterate over.  It seems more straightforward to simply avoid calling
> either function in that case.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 1986af79a9..4e6d0c248e 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1338,8 +1338,6 @@ static struct string_list *get_renames(struct merge_options *o,
>         struct diff_options opts;
>
>         renames = xcalloc(1, sizeof(struct string_list));
> -       if (!o->detect_rename)
> -               return renames;
>
>         diff_setup(&opts);
>         opts.flags.recursive = 1;
> @@ -1657,6 +1655,12 @@ static int handle_renames(struct merge_options *o,
>                           struct string_list *entries,
>                           struct rename_info *ri)
>  {
> +       ri->head_renames = NULL;
> +       ri->merge_renames = NULL;
> +
> +       if (!o->detect_rename)
> +               return 1;
> +
>         ri->head_renames  = get_renames(o, head, common, head, merge, entries);
>         ri->merge_renames = get_renames(o, merge, common, head, merge, entries);

So this pulls the condition out and we just do not call get_renames.
Makes sense as then we also do not allocate memory for the stringlists
in get_renames

Reviewed-by: Stefan Beller <sbeller@google.com>

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 16/31] merge-recursive: split out code for determining diff_filepairs
  2018-01-30 23:25 ` [PATCH v7 16/31] merge-recursive: split out code for determining diff_filepairs Elijah Newren
@ 2018-02-03  0:06   ` Stefan Beller
  2018-02-03  1:43     ` Elijah Newren
  0 siblings, 1 reply; 63+ messages in thread
From: Stefan Beller @ 2018-02-03  0:06 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:

> @@ -1354,10 +1345,43 @@ static struct string_list *get_renames(struct merge_options *o,
>         diffcore_std(&opts);
>         if (opts.needed_rename_limit > o->needed_rename_limit)
>                 o->needed_rename_limit = opts.needed_rename_limit;
> -       for (i = 0; i < diff_queued_diff.nr; ++i) {
> +
> +       ret = malloc(sizeof(struct diff_queue_struct));

Please use xmalloc() and while at it, please use "*ret" as the argument
to sizeof. The reason is slightly better maintainability, as then the type
of ret can be changed at the declaration and the sizeof computation is still
correct.

> +       ret->queue = diff_queued_diff.queue;
> +       ret->nr = diff_queued_diff.nr;
> +       /* Ignore diff_queued_diff.alloc; we won't be changing size at all */
> +
> +       opts.output_format = DIFF_FORMAT_NO_OUTPUT;
> +       diff_queued_diff.nr = 0;
> +       diff_queued_diff.queue = NULL;
> +       diff_flush(&opts);

The comment is rather meant for the later lines or the former lines
(where ret is assigned), the empty line seems like it could go before
the comment?

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames
  2018-01-30 23:25 ` [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames Elijah Newren
@ 2018-02-03  0:26   ` Stefan Beller
  2018-02-03 21:34     ` Elijah Newren
  0 siblings, 1 reply; 63+ messages in thread
From: Stefan Beller @ 2018-02-03  0:26 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> This just adds dir_rename_entry and the associated functions; code using
> these will be added in subsequent commits.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c | 35 +++++++++++++++++++++++++++++++++++
>  merge-recursive.h |  8 ++++++++
>  2 files changed, 43 insertions(+)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 8ac69e1cbb..3b6d0e3f70 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -49,6 +49,41 @@ static unsigned int path_hash(const char *path)
>         return ignore_case ? strihash(path) : strhash(path);
>  }
>
> +static struct dir_rename_entry *dir_rename_find_entry(struct hashmap *hashmap,
> +                                                     char *dir)
> +{
> +       struct dir_rename_entry key;
> +
> +       if (dir == NULL)
> +               return NULL;
> +       hashmap_entry_init(&key, strhash(dir));
> +       key.dir = dir;
> +       return hashmap_get(hashmap, &key, NULL);
> +}
> +
> +static int dir_rename_cmp(void *unused_cmp_data,
> +                         const struct dir_rename_entry *e1,
> +                         const struct dir_rename_entry *e2,
> +                         const void *unused_keydata)
> +{
> +       return strcmp(e1->dir, e2->dir);
> +}
> +
> +static void dir_rename_init(struct hashmap *map)
> +{
> +       hashmap_init(map, (hashmap_cmp_fn) dir_rename_cmp, NULL, 0);

See 55c965f3a2 (Merge branch 'sb/hashmap-cleanup', 2017-08-11) or rather
201c14e375 (attr.c: drop hashmap_cmp_fn cast, 2017-06-30) for an attempt to
keep out the casting in the init, but cast the void->type inside the function.

> +static void dir_rename_entry_init(struct dir_rename_entry *entry,
> +                                 char *directory)
> +{
> +       hashmap_entry_init(entry, strhash(directory));
> +       entry->dir = directory;
> +       entry->non_unique_new_dir = 0;
> +       strbuf_init(&entry->new_dir, 0);
> +       string_list_init(&entry->possible_new_dirs, 0);
> +}
> +
>  static void flush_output(struct merge_options *o)
>  {
>         if (o->buffer_output < 2 && o->obuf.len) {
> diff --git a/merge-recursive.h b/merge-recursive.h
> index 80d69d1401..d7f4cc80c1 100644
> --- a/merge-recursive.h
> +++ b/merge-recursive.h
> @@ -29,6 +29,14 @@ struct merge_options {
>         struct string_list df_conflict_file_set;
>  };
>
> +struct dir_rename_entry {
> +       struct hashmap_entry ent; /* must be the first member! */
> +       char *dir;
> +       unsigned non_unique_new_dir:1;
> +       struct strbuf new_dir;
> +       struct string_list possible_new_dirs;
> +};

Could you add comments what these are and if they have any constraints?
e.g. is 'dir' the full path from the root of the repo and does it end
with a '/' ?

Having a stringlist of potentially new dirs sounds like the algorithm is
at least n^2, but how do I know? I'll read on.

This patch only adds static functions, so some compilers may even refuse
to compile after this patch (-Werror -Wunused). I am not sure how strict we
are there, but as git-bisect still hasn't learned about going only
into the first
parent to have bisect working on a "per series" level rather than a "per commit"
level, it is possible that someone will end up on this commit in the future and
it won't compile well. :/

Not sure what to recommend, maybe squash this with the patch that makes
use of these functions?

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 18/31] merge-recursive: make a helper function for cleanup for handle_renames
  2018-01-30 23:25 ` [PATCH v7 18/31] merge-recursive: make a helper function for cleanup for handle_renames Elijah Newren
@ 2018-02-03  0:31   ` Stefan Beller
  0 siblings, 0 replies; 63+ messages in thread
From: Stefan Beller @ 2018-02-03  0:31 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> In anticipation of more involved cleanup to come, make a helper function
> for doing the cleanup at the end of handle_renames.  Rename the already
> existing cleanup_rename[s]() to final_cleanup_rename[s](), name the new
> helper initial_cleanup_rename(), and leave the big comment in the code
> about why we can't do all the cleanup at once.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>

This mostly touches new code of the series, so it feels a bit unnecessary.
I don't feel strongly about it being a separate patch, as the patches up to
now are easy to review. (and squashing may harm reviewability)

But the code makes sense.

Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 19/31] merge-recursive: add get_directory_renames()
  2018-01-30 23:25 ` [PATCH v7 19/31] merge-recursive: add get_directory_renames() Elijah Newren
@ 2018-02-03  1:02   ` Stefan Beller
  2018-02-03 22:32     ` Elijah Newren
  0 siblings, 1 reply; 63+ messages in thread
From: Stefan Beller @ 2018-02-03  1:02 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> This populates a list of directory renames for us.  The list of
> directory renames is not yet used, but will be in subsequent commits.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c | 155 ++++++++++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 152 insertions(+), 3 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 40ed8e1f39..c75d3a5139 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1393,6 +1393,138 @@ static struct diff_queue_struct *get_diffpairs(struct merge_options *o,
>         return ret;
>  }
>
> +static void get_renamed_dir_portion(const char *old_path, const char *new_path,
> +                                   char **old_dir, char **new_dir)
> +{
> +       char *end_of_old, *end_of_new;
> +       int old_len, new_len;
> +
> +       *old_dir = NULL;
> +       *new_dir = NULL;
> +
> +       /* For

comment style.

    /*
     * we prefer to keep the beginning
     * and ending line of a comment free.
     */
    /* unless you write single line comments */

> +        *    "a/b/c/d/foo.c" -> "a/b/something-else/d/foo.c"
> +        * the "d/foo.c" part is the same, we just want to know that
> +        *    "a/b/c" was renamed to "a/b/something-else"
> +        * so, for this example, this function returns "a/b/c" in
> +        * *old_dir and "a/b/something-else" in *new_dir.

So we would not see multi-directory renames?

    "a/b/c/d/foo.c" -> "a/b/something-else/e/foo.c"

could be detected as

    "a/b/{c/d/ => something-else/e}/foo.c"

I assume this patch series is not bringing that to the table.
(which is fine, I am just wondering)

> +        *
> +        * Also, if the basename of the file changed, we don't care.  We
> +        * want to know which portion of the directory, if any, changed.
> +        */
> +       end_of_old = strrchr(old_path, '/');
> +       end_of_new = strrchr(new_path, '/');
> +
> +       if (end_of_old == NULL || end_of_new == NULL)
> +               return;

return early as the files are in the top level, and apparently we do
not support top level renaming?

What about a commit like 81b50f3ce4 (Move 'builtin-*' into
a 'builtin/' subdirectory, 2010-02-22) ?

Well that specific commit left many files outside the new builtin dir,
but conceptually we could see a directory rename of

    /* => /src/*

> +       while (*--end_of_new == *--end_of_old &&
> +              end_of_old != old_path &&
> +              end_of_new != new_path)
> +               ; /* Do nothing; all in the while loop */

We have to compare manually as we'd want to find
the first non-equal and there doesn't seem to be a good
library function for that.

Assuming many repos are UTF8 (including in their paths),
how does this work with display characters longer than one char?
It should be fine as we cut at the slash?

> +       /*
> +        * We've found the first non-matching character in the directory
> +        * paths.  That means the current directory we were comparing
> +        * represents the rename.  Move end_of_old and end_of_new back
> +        * to the full directory name.
> +        */
> +       if (*end_of_old == '/')
> +               end_of_old++;
> +       if (*end_of_old != '/')
> +               end_of_new++;
> +       end_of_old = strchr(end_of_old, '/');
> +       end_of_new = strchr(end_of_new, '/');
> +
> +       /*
> +        * It may have been the case that old_path and new_path were the same
> +        * directory all along.  Don't claim a rename if they're the same.
> +        */
> +       old_len = end_of_old - old_path;
> +       new_len = end_of_new - new_path;
> +
> +       if (old_len != new_len || strncmp(old_path, new_path, old_len)) {

How often are we going to see this string-is-equal case?
Would it make sense to do that first in the function?

> +               *old_dir = xstrndup(old_path, old_len);
> +               *new_dir = xstrndup(new_path, new_len);
> +       }
> +}
> +
> +static struct hashmap *get_directory_renames(struct diff_queue_struct *pairs,
> +                                            struct tree *tree)
> +{
> +       struct hashmap *dir_renames;
> +       struct hashmap_iter iter;
> +       struct dir_rename_entry *entry;
> +       int i;
> +
> +       dir_renames = malloc(sizeof(struct hashmap));

xmalloc

> +       dir_rename_init(dir_renames);
> +       for (i = 0; i < pairs->nr; ++i) {
> +               struct string_list_item *item;
> +               int *count;
> +               struct diff_filepair *pair = pairs->queue[i];
> +               char *old_dir, *new_dir;
> +
> +               /* File not part of directory rename if it wasn't renamed */
> +               if (pair->status != 'R')
> +                       continue;
> +
> +               get_renamed_dir_portion(pair->one->path, pair->two->path,
> +                                       &old_dir,        &new_dir);
> +               if (!old_dir)
> +                       /* Directory didn't change at all; ignore this one. */
> +                       continue;


So the first loop is about counting the number of files in each directory
that are renamed and the later while loop is about mapping them?

> +               /* Strings were xstrndup'ed before inserting into string-list,
> +                * so ask string_list to remove the entries for us.
> +                */

comment style.

> +               entry->possible_new_dirs.strdup_strings = 1;

Why do we need to set strdup_strings here (so late in the
game, we are about to clear it?) Could we set it earlier?

Or rather have the string list duplicate the strings instead of
get_renamed_dir_portion ?

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 16/31] merge-recursive: split out code for determining diff_filepairs
  2018-02-03  0:06   ` Stefan Beller
@ 2018-02-03  1:43     ` Elijah Newren
  0 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-02-03  1:43 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Fri, Feb 2, 2018 at 4:06 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:

>> +       ret = malloc(sizeof(struct diff_queue_struct));
>
> Please use xmalloc() and while at it, please use "*ret" as the argument
> to sizeof. The reason is slightly better maintainability, as then the type
> of ret can be changed at the declaration and the sizeof computation is still
> correct.

Will do.

>> +       ret->queue = diff_queued_diff.queue;
>> +       ret->nr = diff_queued_diff.nr;
>> +       /* Ignore diff_queued_diff.alloc; we won't be changing size at all */
>> +
>> +       opts.output_format = DIFF_FORMAT_NO_OUTPUT;
>> +       diff_queued_diff.nr = 0;
>> +       diff_queued_diff.queue = NULL;
>> +       diff_flush(&opts);
>
> The comment is rather meant for the later lines or the former lines
> (where ret is assigned), the empty line seems like it could go before
> the comment?

Perhaps I should just replaced the first three lines, including the
comment, with
  *ret = diff_queued_diff;
?  I was probably thinking along that track, which is why I had the
comment grouped with lines above it, but was for whatever reason just
assigning each value and then noting that one of them was actually
unnecessary.  But it wouldn't hurt to copy either.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames
  2018-02-03  0:26   ` Stefan Beller
@ 2018-02-03 21:34     ` Elijah Newren
  2018-02-04  8:54       ` Johannes Sixt
  2018-02-05 19:44       ` Stefan Beller
  0 siblings, 2 replies; 63+ messages in thread
From: Elijah Newren @ 2018-02-03 21:34 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Fri, Feb 2, 2018 at 4:26 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:

>> +static void dir_rename_init(struct hashmap *map)
>> +{
>> +       hashmap_init(map, (hashmap_cmp_fn) dir_rename_cmp, NULL, 0);
>
> See 55c965f3a2 (Merge branch 'sb/hashmap-cleanup', 2017-08-11) or rather
> 201c14e375 (attr.c: drop hashmap_cmp_fn cast, 2017-06-30) for an attempt to
> keep out the casting in the init, but cast the void->type inside the function.

Fixed (locally).

>> +struct dir_rename_entry {
>> +       struct hashmap_entry ent; /* must be the first member! */
>> +       char *dir;
>> +       unsigned non_unique_new_dir:1;
>> +       struct strbuf new_dir;
>> +       struct string_list possible_new_dirs;
>> +};
>
> Could you add comments what these are and if they have any constraints?
> e.g. is 'dir' the full path from the root of the repo and does it end
> with a '/' ?

Good idea, will do.  To help you out with the rest of your review:

yes, dir is full path from the root of the repo, but no it does not
end with a '/'.  The same two constraints hold for how I store
directories in other variables as well.

If someone renames multiple files in a single directory to several
different other directories, then possible_new_dirs is used
(temporarily) to track what those other directories are and how many
files were moved from `dir` to that directory.  The 'winner' (the
target directory with the most renames from `dir` to it) will be
recorded in new_dir.  If there is no 'winner' (a tie for target
directory with the most renames), then non_unique_new_dir is set.
Once we can set either non_unique_new_dir or possible_new_dirs, then
possible_new_dirs is unnecessary and can be freed.

> Having a stringlist of potentially new dirs sounds like the algorithm is
> at least n^2, but how do I know? I'll read on.

Yes, I suppose it's technically n^2, but n is expected to be O(1).
While one can trivially construct a case making n arbitrarily large,
statistically for real world repositories, I expected the mode of n to
be 1 and the mean to be less than 2.  My original idea was to use a
hash for possible_new_dirs, but since hashes are so painful in C and n
should be very small anyway, I didn't bother.  If anyone can find an
example of a real world open source repository (linux, webkit, git,
etc.) with a merge where n is greater than about 10, I'll be
surprised.

Does that address your concern, or does it sound like I'm kicking the
can down the road?  If it's the latter, we can switch it out.

> This patch only adds static functions, so some compilers may even refuse
> to compile after this patch (-Werror -Wunused). I am not sure how strict we
> are there, but as git-bisect still hasn't learned about going only
> into the first
> parent to have bisect working on a "per series" level rather than a "per commit"
> level, it is possible that someone will end up on this commit in the future and
> it won't compile well. :/
>
> Not sure what to recommend, maybe squash this with the patch that makes
> use of these functions?

I'm unsure as well; I can combine it with 19/31 but I thought that
keeping them separate made it slightly easier to review both.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 19/31] merge-recursive: add get_directory_renames()
  2018-02-03  1:02   ` Stefan Beller
@ 2018-02-03 22:32     ` Elijah Newren
  2018-02-04  2:04       ` Elijah Newren
  2018-02-05 19:39       ` Stefan Beller
  0 siblings, 2 replies; 63+ messages in thread
From: Elijah Newren @ 2018-02-03 22:32 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Fri, Feb 2, 2018 at 5:02 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:

>> +       /* For
>
> comment style.

Fixed it and looked through the file for any other violations and
fixed the ones introduced in this series.  (The ones I added years ago
in other places of the file I just left.)

>> +        *    "a/b/c/d/foo.c" -> "a/b/something-else/d/foo.c"
>> +        * the "d/foo.c" part is the same, we just want to know that
>> +        *    "a/b/c" was renamed to "a/b/something-else"
>> +        * so, for this example, this function returns "a/b/c" in
>> +        * *old_dir and "a/b/something-else" in *new_dir.
>
> So we would not see multi-directory renames?
>
>     "a/b/c/d/foo.c" -> "a/b/something-else/e/foo.c"
>
> could be detected as
>
>     "a/b/{c/d/ => something-else/e}/foo.c"
>
> I assume this patch series is not bringing that to the table.
> (which is fine, I am just wondering)

I fully intended to support that, and believe the code does.  I
changed the comment as follows to try to make it clearer:

    /*
     * For
     *    "a/b/c/d/e/foo.c" -> "a/b/some/thing/else/e/foo.c"
     * the "e/foo.c" part is the same, we just want to know that
     *    "a/b/c/d" was renamed to "a/b/some/thing/else"
     * so, for this example, this function returns "a/b/c/d" in
     * *old_dir and "a/b/some/thing/else" in *new_dir.
     *
     * Also, if the basename of the file changed, we don't care.  We
     * want to know which portion of the directory, if any, changed.
     */

Is that better?

>> +        *
>> +        * Also, if the basename of the file changed, we don't care.  We
>> +        * want to know which portion of the directory, if any, changed.
>> +        */
>> +       end_of_old = strrchr(old_path, '/');
>> +       end_of_new = strrchr(new_path, '/');
>> +
>> +       if (end_of_old == NULL || end_of_new == NULL)
>> +               return;
>
> return early as the files are in the top level, and apparently we do
> not support top level renaming?
>
> What about a commit like 81b50f3ce4 (Move 'builtin-*' into
> a 'builtin/' subdirectory, 2010-02-22) ?
>
> Well that specific commit left many files outside the new builtin dir,
> but conceptually we could see a directory rename of
>
>     /* => /src/*

We had some internal usecases for want to support a "rename" of the
toplevel directory into a subdirectory of itself.  However, attempting
to support that opened much too big a can of worms for me.  We'd open
up some big surprises somewhere.

In particular, note that not supporting a "rename" of the toplevel
directory is a special case of not supporting a "rename" of any
directory to a subdirectory below itself, which in turn is a very
special case of excluding partial directory renames.  I addressed this
in the cover letter of my original submission, as follows:

"""
Further, there's a basic question about when directory rename detection
should be applied at all.  I have a simple rule:

  3) If a given directory still exists on both sides of a merge, we do
     not consider it to have been renamed.

Rule 3 may sound obvious at first, but it will probably arise as a
question for some users -- what if someone "mostly" moved a directory but
still left some files around, or, equivalently (from the perspective of the
three-way merge that merge-recursive performs), fully renamed a directory
in one commmit and then recreated that directory in a later commit adding
some new files and then tried to merge?  See the big comment in section 4
of the new t6043 for further discussion of this rule.
"""

Patch 04/31 is the one that adds that big comment with further discussion.

Maybe there's a way to support toplevel renames, but I think it'd make
this series longer and more complicated...and may cause more edge
cases that confuse users.

>> +       while (*--end_of_new == *--end_of_old &&
>> +              end_of_old != old_path &&
>> +              end_of_new != new_path)
>> +               ; /* Do nothing; all in the while loop */
>
> We have to compare manually as we'd want to find
> the first non-equal and there doesn't seem to be a good
> library function for that.
>
> Assuming many repos are UTF8 (including in their paths),
> how does this work with display characters longer than one char?
> It should be fine as we cut at the slash?

Oh, UTF-8.  Ugh.
Can UTF-8 characters, other than '/', have a byte whose value matches
(unsigned char)('/')?  If so, then I'll need to figure out how to do
utf-8 character parsing.  Anyone have pointers?

>> +       /*
>> +        * We've found the first non-matching character in the directory
>> +        * paths.  That means the current directory we were comparing
>> +        * represents the rename.  Move end_of_old and end_of_new back
>> +        * to the full directory name.
>> +        */
>> +       if (*end_of_old == '/')
>> +               end_of_old++;
>> +       if (*end_of_old != '/')
>> +               end_of_new++;
>> +       end_of_old = strchr(end_of_old, '/');
>> +       end_of_new = strchr(end_of_new, '/');
>> +
>> +       /*
>> +        * It may have been the case that old_path and new_path were the same
>> +        * directory all along.  Don't claim a rename if they're the same.
>> +        */
>> +       old_len = end_of_old - old_path;
>> +       new_len = end_of_new - new_path;
>> +
>> +       if (old_len != new_len || strncmp(old_path, new_path, old_len)) {
>
> How often are we going to see this string-is-equal case?
> Would it make sense to do that first in the function?

We don't have old_len at that point, and old_len != strlen(old_path).
In particular, this is for comparing directories, and old_path and
new_path both stored file paths.
In particular, I think the most common case is someone renaming e.g.
   a/b/file.c -> a/b/newfile.c
The filenames are different, but the directory name is not.  Trying to
compare at the beginning of the function would thus give us the wrong
information.

So the check really needs to be done at this point of the function.

>> +       dir_renames = malloc(sizeof(struct hashmap));
>
> xmalloc

Thanks; I also looked for any other malloc uses I introduced, but it
looks like you caught both of them.  I'll fix them up.

>
>> +       dir_rename_init(dir_renames);
>> +       for (i = 0; i < pairs->nr; ++i) {
>> +               struct string_list_item *item;
>> +               int *count;
>> +               struct diff_filepair *pair = pairs->queue[i];
>> +               char *old_dir, *new_dir;
>> +
>> +               /* File not part of directory rename if it wasn't renamed */
>> +               if (pair->status != 'R')
>> +                       continue;
>> +
>> +               get_renamed_dir_portion(pair->one->path, pair->two->path,
>> +                                       &old_dir,        &new_dir);
>> +               if (!old_dir)
>> +                       /* Directory didn't change at all; ignore this one. */
>> +                       continue;
>
>
> So the first loop is about counting the number of files in each directory
> that are renamed and the later while loop is about mapping them?

Close; would adding the following comment at the top of the function help?

    /*
     * Typically, we think of a directory rename as all files from a
     * certain directory being moved to a target directory.  However,
     * what if someone first moved two files from the original
     * directory in one commit, and then renamed the directory
     * somewhere else in a later commit?  At merge time, we just know
     * that files from the original directory went to two different
     * places, and that the bulk of them ended up in the same place.
     * We want each directory rename to follow the bulk of the files
     * from that directory.  This function exists to find where the
     * bulk of the files went.
     *
     * The first loop below simply iterates through the list of
     * renames, adding up all the rename source->target locations with
     * a count.
     *
     * The second loop compares the count for each renamed directory
     * and declares the "winning" target location.
     */

Is there any part that remains unclear with that comment?  (Also, is
that comment too long?)

>> +               /* Strings were xstrndup'ed before inserting into string-list,
>> +                * so ask string_list to remove the entries for us.
>> +                */
>
> comment style.

Thanks.

>> +               entry->possible_new_dirs.strdup_strings = 1;
>
> Why do we need to set strdup_strings here (so late in the
> game, we are about to clear it?) Could we set it earlier?
>
> Or rather have the string list duplicate the strings instead of
> get_renamed_dir_portion ?

We didn't strdup the original string (a file path) as-is, we
strndup'ed a subset of the original string (just the relevant portion
of the directory).  Since we already had to allocate a special string
for that, making the string list duplicate the strings would have
caused an extra unnecessary allocation and required us to free the
memory allocated by get_renamed_dir_portion manually.  So, we do need
this here.

Does this comment make it clearer?:

        /*
         * The relevant directory sub-portion of the original full
         * filepaths were xstrndup'ed before inserting into
         * possible_new_dirs, and instead of manually iterating the
         * list and free'ing each, just lie and tell
         * possible_new_dirs that it did the strdup'ing so that it
         * will free them for us.
         */
        entry->possible_new_dirs.strdup_strings = 1;
        string_list_clear(&entry->possible_new_dirs, 1);

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 19/31] merge-recursive: add get_directory_renames()
  2018-02-03 22:32     ` Elijah Newren
@ 2018-02-04  2:04       ` Elijah Newren
  2018-02-04  4:42         ` Eric Sunshine
  2018-02-05 19:39       ` Stefan Beller
  1 sibling, 1 reply; 63+ messages in thread
From: Elijah Newren @ 2018-02-04  2:04 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Sat, Feb 3, 2018 at 2:32 PM, Elijah Newren <newren@gmail.com> wrote:
> On Fri, Feb 2, 2018 at 5:02 PM, Stefan Beller <sbeller@google.com> wrote:
>> On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:

>>> +       while (*--end_of_new == *--end_of_old &&
>>> +              end_of_old != old_path &&
>>> +              end_of_new != new_path)
>>> +               ; /* Do nothing; all in the while loop */
>>
>> We have to compare manually as we'd want to find
>> the first non-equal and there doesn't seem to be a good
>> library function for that.
>>
>> Assuming many repos are UTF8 (including in their paths),
>> how does this work with display characters longer than one char?
>> It should be fine as we cut at the slash?
>
> Oh, UTF-8.  Ugh.
> Can UTF-8 characters, other than '/', have a byte whose value matches
> (unsigned char)('/')?  If so, then I'll need to figure out how to do
> utf-8 character parsing.  Anyone have pointers?

Well, after digging around for a while, I found this claim on the
Wikipedia page for UTF-8:

  Since ASCII bytes do not occur when encoding non-ASCII code points
into UTF-8, UTF-8 is safe to use within most programming and document
languages that interpret certain ASCII characters in a special way,
such as "/" in filenames, "\" in escape sequences, and "%" in printf.

So, unless I'm reading something wrong here, I think that means this
code is just fine as it is.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 19/31] merge-recursive: add get_directory_renames()
  2018-02-04  2:04       ` Elijah Newren
@ 2018-02-04  4:42         ` Eric Sunshine
  2018-02-04  4:44           ` Eric Sunshine
  0 siblings, 1 reply; 63+ messages in thread
From: Eric Sunshine @ 2018-02-04  4:42 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Stefan Beller, Junio C Hamano, git, SZEDER Gábor,
	Jonathan Nieder, Jeff King

On Sat, Feb 3, 2018 at 9:04 PM, Elijah Newren <newren@gmail.com> wrote:
> On Sat, Feb 3, 2018 at 2:32 PM, Elijah Newren <newren@gmail.com> wrote:
>> On Fri, Feb 2, 2018 at 5:02 PM, Stefan Beller <sbeller@google.com> wrote:
>>> On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
>>>> +       while (*--end_of_new == *--end_of_old &&
>>>> +              end_of_old != old_path &&
>>>> +              end_of_new != new_path)
>>>> +               ; /* Do nothing; all in the while loop */
>>>
>>> Assuming many repos are UTF8 (including in their paths),
>>> how does this work with display characters longer than one char?
>>> It should be fine as we cut at the slash?
>>
>> Can UTF-8 characters, other than '/', have a byte whose value matches
>> (unsigned char)('/')?  If so, then I'll need to figure out how to do
>> utf-8 character parsing.  Anyone have pointers?
>
> Well, after digging around for a while, I found this claim on the
> Wikipedia page for UTF-8:
>
>   Since ASCII bytes do not occur when encoding non-ASCII code points
> into UTF-8, UTF-8 is safe to use within most programming and document
> languages that interpret certain ASCII characters in a special way,
> such as "/" in filenames, "\" in escape sequences, and "%" in printf.
>
> So, unless I'm reading something wrong here, I think that means this
> code is just fine as it is.

You're reading it correctly. Unicode values greater than \U007f
encoded with UTF-8 will never contain bytes which can be confused with
any 7-bit ASCII character.

It's possible that Stefan was thinking of "combining characters"[1]
which may be "precomposed" and "decomposed"[2], but which appear the
same when rendered. For instance, "ö" might be a single Unicode
codepoint or two codepoints, such as "o" combined with a diaeresis.
It's a potential problem if you're comparing, byte by byte, two
filenames which look the same. However, Git takes pains[3] to avoid
this problem by ensuring (if possible) that filenames are precomposed
within Git even if they happen to be decomposed on the actual
filesystem. So, most likely, your code is okay as-is.

[1]: https://en.wikipedia.org/wiki/Combining_character
[2]: https://en.wikipedia.org/wiki/Diaeresis_(diacritic)
[3]: https://github.com/git/git/blob/master/compat/precompose_utf8.c

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 19/31] merge-recursive: add get_directory_renames()
  2018-02-04  4:42         ` Eric Sunshine
@ 2018-02-04  4:44           ` Eric Sunshine
  0 siblings, 0 replies; 63+ messages in thread
From: Eric Sunshine @ 2018-02-04  4:44 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Stefan Beller, Junio C Hamano, git, SZEDER Gábor,
	Jonathan Nieder, Jeff King

On Sat, Feb 3, 2018 at 11:42 PM, Eric Sunshine <sunshine@sunshineco.com> wrote:
> [2]: https://en.wikipedia.org/wiki/Diaeresis_(diacritic)

Correction:
[2]: https://en.wikipedia.org/wiki/Precomposed_character

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames
  2018-02-03 21:34     ` Elijah Newren
@ 2018-02-04  8:54       ` Johannes Sixt
  2018-02-05 14:56         ` Elijah Newren
  2018-02-05 20:01         ` Junio C Hamano
  2018-02-05 19:44       ` Stefan Beller
  1 sibling, 2 replies; 63+ messages in thread
From: Johannes Sixt @ 2018-02-04  8:54 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Stefan Beller, Junio C Hamano, git, SZEDER Gábor,
	Jonathan Nieder, Jeff King

Am 03.02.2018 um 22:34 schrieb Elijah Newren:
>  If anyone can find an
> example of a real world open source repository (linux, webkit, git,
> etc.) with a merge where n is greater than about 10, I'll be
> surprised.

git rev-list --parents --merges master |
 grep " .* .* .* .* .* .* .* .* .* .* "

returns quite a few hits in the Linux repository. Most notable is
fa623d1b0222adbe8f822e53c08003b9679a410c; spoiler: it has 30 parents.

-- Hannes

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames
  2018-02-04  8:54       ` Johannes Sixt
@ 2018-02-05 14:56         ` Elijah Newren
  2018-02-05 20:01         ` Junio C Hamano
  1 sibling, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-02-05 14:56 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git

// Re-sending because of bounce

On Sun, Feb 4, 2018 at 12:54 AM, Johannes Sixt <j6t@kdbg.org> wrote:
> Am 03.02.2018 um 22:34 schrieb Elijah Newren:
>>  If anyone can find an
>> example of a real world open source repository (linux, webkit, git,
>> etc.) with a merge where n is greater than about 10, I'll be
>> surprised.
>
> git rev-list --parents --merges master |
>  grep " .* .* .* .* .* .* .* .* .* .* "
>
> returns quite a few hits in the Linux repository. Most notable is
> fa623d1b0222adbe8f822e53c08003b9679a410c; spoiler: it has 30 parents.
>
> -- Hannes

Sorry, I didn't make it very clear what n represented. This is in the
context of detecting directory renames, and in this discussion n
represents the number of distinct directories that files were renamed
into from a single original directory on a given side of the merge. So
your example of number of parents of a commit isn't directly relevant
to this case (also, even along your tangent, merge-recursive is only
invoked when the number of parents is precisely two). However, I find
your nugget rather interesting, even if unrelated. I had known of
merges with more than 10 parents, but somehow was unaware that the
limit of 16 parents had been lifted. And digging through the history,
it was apparently removed quite a while ago. I love the commit message
that lifted the limit too:

"There is no really good reason to have a merge with more than 16
parents, but we have a history of giving our users rope."

:-)

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 12/31] merge-recursive: move the get_renames() function
       [not found]     ` <CABPp-BFDgDDa_fPSFJQUSzR1k5-ix0SWrviUPFu+SCoyWfG5cQ@mail.gmail.com>
@ 2018-02-05 18:57       ` Stefan Beller
  0 siblings, 0 replies; 63+ messages in thread
From: Stefan Beller @ 2018-02-05 18:57 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

> Move this function so it can re-use some others (without either
> moving all of them or adding an annoying split between function
> declarations and definitions).  Cheat slightly by adding a blank line
> for readability, and in order to silence checkpatch.pl.
>
> Does that sound better?

Yeah, that sounds perfect (no red flags on reading)

Thanks!

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 19/31] merge-recursive: add get_directory_renames()
  2018-02-03 22:32     ` Elijah Newren
  2018-02-04  2:04       ` Elijah Newren
@ 2018-02-05 19:39       ` Stefan Beller
  1 sibling, 0 replies; 63+ messages in thread
From: Stefan Beller @ 2018-02-05 19:39 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

>     /*
>      * For
>      *    "a/b/c/d/e/foo.c" -> "a/b/some/thing/else/e/foo.c"
>      * the "e/foo.c" part is the same, we just want to know that
>      *    "a/b/c/d" was renamed to "a/b/some/thing/else"
>      * so, for this example, this function returns "a/b/c/d" in
>      * *old_dir and "a/b/some/thing/else" in *new_dir.
>      *
>      * Also, if the basename of the file changed, we don't care.  We
>      * want to know which portion of the directory, if any, changed.
>      */
>
> Is that better?

yes, that helps me understanding pretty much.

>
>>> +        *
>>> +        * Also, if the basename of the file changed, we don't care.  We
>>> +        * want to know which portion of the directory, if any, changed.
>>> +        */
>>> +       end_of_old = strrchr(old_path, '/');
>>> +       end_of_new = strrchr(new_path, '/');
>>> +
>>> +       if (end_of_old == NULL || end_of_new == NULL)
>>> +               return;
>>
>> return early as the files are in the top level, and apparently we do
>> not support top level renaming?
>>
>> What about a commit like 81b50f3ce4 (Move 'builtin-*' into
>> a 'builtin/' subdirectory, 2010-02-22) ?
>>
>> Well that specific commit left many files outside the new builtin dir,
>> but conceptually we could see a directory rename of
>>
>>     /* => /src/*
>
> We had some internal usecases for want to support a "rename" of the
> toplevel directory into a subdirectory of itself.  However, attempting
> to support that opened much too big a can of worms for me.  We'd open
> up some big surprises somewhere.


ok, I was just trying to understand the code.
(As said before, I was not asking for having this implemented.)

> In particular, note that not supporting a "rename" of the toplevel
> directory is a special case of not supporting a "rename" of any
> directory to a subdirectory below itself, which in turn is a very
> special case of excluding partial directory renames.  I addressed this
> in the cover letter of my original submission, as follows:
>
> """
> Further, there's a basic question about when directory rename detection
> should be applied at all.  I have a simple rule:
>
>   3) If a given directory still exists on both sides of a merge, we do
>      not consider it to have been renamed.
>
> Rule 3 may sound obvious at first, but it will probably arise as a
> question for some users -- what if someone "mostly" moved a directory but
> still left some files around, or, equivalently (from the perspective of the
> three-way merge that merge-recursive performs), fully renamed a directory
> in one commmit and then recreated that directory in a later commit adding
> some new files and then tried to merge?  See the big comment in section 4
> of the new t6043 for further discussion of this rule.
> """
>
> Patch 04/31 is the one that adds that big comment with further discussion.
>
> Maybe there's a way to support toplevel renames, but I think it'd make
> this series longer and more complicated...and may cause more edge
> cases that confuse users.

Thanks for reminding!

>
>>> +       while (*--end_of_new == *--end_of_old &&
>>> +              end_of_old != old_path &&
>>> +              end_of_new != new_path)
>>> +               ; /* Do nothing; all in the while loop */
>>
>> We have to compare manually as we'd want to find
>> the first non-equal and there doesn't seem to be a good
>> library function for that.
>>
>> Assuming many repos are UTF8 (including in their paths),
>> how does this work with display characters longer than one char?
>> It should be fine as we cut at the slash?
>
> Oh, UTF-8.  Ugh.
> Can UTF-8 characters, other than '/', have a byte whose value matches
> (unsigned char)('/')?  If so, then I'll need to figure out how to do
> utf-8 character parsing.  Anyone have pointers?

Oh right we are always cutting at '/', which hex 2F, so we cannot split
a codepoint in half accidentally (finding the slash inside a codepoint).

And renaming a directory from one utf8 codepoint to another, which
have the same prefix is not an issue at this layer, too.
(Think of abc -> abd just all of abc/d in one code point, but there
but even for multi code points/ASCII we repeat the prefix when
printing the rename)

>> So the first loop is about counting the number of files in each directory
>> that are renamed and the later while loop is about mapping them?
>
> Close; would adding the following comment at the top of the function help?
>
>     /*
>      * Typically, we think of a directory rename as all files from a
>      * certain directory being moved to a target directory.  However,
>      * what if someone first moved two files from the original
>      * directory in one commit, and then renamed the directory
>      * somewhere else in a later commit?  At merge time, we just know
>      * that files from the original directory went to two different
>      * places, and that the bulk of them ended up in the same place.
>      * We want each directory rename to follow the bulk of the files
>      * from that directory.  This function exists to find where the
>      * bulk of the files went.
>      *
>      * The first loop below simply iterates through the list of
>      * renames, adding up all the rename source->target locations with
>      * a count.
>      *
>      * The second loop compares the count for each renamed directory
>      * and declares the "winning" target location.
>      */
>
> Is there any part that remains unclear with that comment?  (Also, is
> that comment too long?)

Sounds good to me. Thanks!


>
>>> +               /* Strings were xstrndup'ed before inserting into string-list,
>>> +                * so ask string_list to remove the entries for us.
>>> +                */
>>
>> comment style.
>
> Thanks.
>
>>> +               entry->possible_new_dirs.strdup_strings = 1;
>>
>> Why do we need to set strdup_strings here (so late in the
>> game, we are about to clear it?) Could we set it earlier?
>>
>> Or rather have the string list duplicate the strings instead of
>> get_renamed_dir_portion ?
>
> We didn't strdup the original string (a file path) as-is, we
> strndup'ed a subset of the original string (just the relevant portion
> of the directory).  Since we already had to allocate a special string
> for that, making the string list duplicate the strings would have
> caused an extra unnecessary allocation and required us to free the
> memory allocated by get_renamed_dir_portion manually.  So, we do need
> this here.
>
> Does this comment make it clearer?:
>
>         /*
>          * The relevant directory sub-portion of the original full
>          * filepaths were xstrndup'ed before inserting into
>          * possible_new_dirs, and instead of manually iterating the
>          * list and free'ing each, just lie and tell
>          * possible_new_dirs that it did the strdup'ing so that it
>          * will free them for us.
>          */
>         entry->possible_new_dirs.strdup_strings = 1;
>         string_list_clear(&entry->possible_new_dirs, 1);

Phrased this way, it makes sense.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames
  2018-02-03 21:34     ` Elijah Newren
  2018-02-04  8:54       ` Johannes Sixt
@ 2018-02-05 19:44       ` Stefan Beller
  2018-02-05 21:27         ` Elijah Newren
  1 sibling, 1 reply; 63+ messages in thread
From: Stefan Beller @ 2018-02-05 19:44 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

>> Having a stringlist of potentially new dirs sounds like the algorithm is
>> at least n^2, but how do I know? I'll read on.
>
> Yes, I suppose it's technically n^2, but n is expected to be O(1).
> While one can trivially construct a case making n arbitrarily large,
> statistically for real world repositories, I expected the mode of n to
> be 1 and the mean to be less than 2.  My original idea was to use a
> hash for possible_new_dirs, but since hashes are so painful in C and n
> should be very small anyway, I didn't bother.  If anyone can find an
> example of a real world open source repository (linux, webkit, git,
> etc.) with a merge where n is greater than about 10, I'll be
> surprised.
>
> Does that address your concern, or does it sound like I'm kicking the
> can down the road?  If it's the latter, we can switch it out.

I think that is fine for now; the real world usage matters more
than the big O notation. But maybe you want to hint at the possibility of
speedup (in the commit message or in code?), once someone produces
a slow case and digs up the code.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 20/31] merge-recursive: check for directory level conflicts
  2018-01-30 23:25 ` [PATCH v7 20/31] merge-recursive: check for directory level conflicts Elijah Newren
@ 2018-02-05 20:00   ` Stefan Beller
  2018-02-05 21:12     ` Elijah Newren
  0 siblings, 1 reply; 63+ messages in thread
From: Stefan Beller @ 2018-02-05 20:00 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> Before trying to apply directory renames to paths within the given
> directories, we want to make sure that there aren't conflicts at the
> directory level.  There will be additional checks at the individual
> file level too, which will be added later.


> +static int tree_has_path(struct tree *tree, const char *path)
> +{
> +       unsigned char hashy[20];

What is 20? ;)
I think you want to use GIT_MAX_RAWSZ instead.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames
  2018-02-04  8:54       ` Johannes Sixt
  2018-02-05 14:56         ` Elijah Newren
@ 2018-02-05 20:01         ` Junio C Hamano
  1 sibling, 0 replies; 63+ messages in thread
From: Junio C Hamano @ 2018-02-05 20:01 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Elijah Newren, Stefan Beller, git, SZEDER Gábor,
	Jonathan Nieder, Jeff King

Johannes Sixt <j6t@kdbg.org> writes:

> Am 03.02.2018 um 22:34 schrieb Elijah Newren:
>>  If anyone can find an
>> example of a real world open source repository (linux, webkit, git,
>> etc.) with a merge where n is greater than about 10, I'll be
>> surprised.
>
> git rev-list --parents --merges master |
>  grep " .* .* .* .* .* .* .* .* .* .* "

Twinkle twinkle little stars? ;-)

rev-list can take --min-parents=10 option for a thing like this.  In
fact, --merges is implemented in terms of that option.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 21/31] merge-recursive: add a new hashmap for storing file collisions
  2018-01-30 23:25 ` [PATCH v7 21/31] merge-recursive: add a new hashmap for storing file collisions Elijah Newren
@ 2018-02-05 20:02   ` Stefan Beller
  0 siblings, 0 replies; 63+ messages in thread
From: Stefan Beller @ 2018-02-05 20:02 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> Directory renames with the ability to merge directories opens up the
> possibility of add/add/add/.../add conflicts, if each of the N
> directories being merged into one target directory all had a file with
> the same name.  We need a way to check for and report on such
> collisions; this hashmap will be used for this purpose.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---

The same comment on compiling applies here.
Nice to review, but it may not compile due to unused functions
that a strict compiler may not be happy about.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 27/31] merge-recursive: fix overwriting dirty files involved in renames
  2018-01-30 23:25 ` [PATCH v7 27/31] merge-recursive: fix overwriting dirty files involved in renames Elijah Newren
@ 2018-02-05 20:52   ` Stefan Beller
  2018-02-05 21:26     ` Elijah Newren
  0 siblings, 1 reply; 63+ messages in thread
From: Stefan Beller @ 2018-02-05 20:52 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> This fixes an issue that existed before my directory rename detection
> patches that affects both normal renames and renames implied by
> directory rename detection.  Additional codepaths that only affect
> overwriting of directy files that are involved in directory rename
> detection will be added in a subsequent commit.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>

So this fixes bugs in the current rename detection with
overwriting untracked files? Then this is an additional
selling point of this series, maybe worth covering in the
cover letter!

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 20/31] merge-recursive: check for directory level conflicts
  2018-02-05 20:00   ` Stefan Beller
@ 2018-02-05 21:12     ` Elijah Newren
  0 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-02-05 21:12 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Mon, Feb 5, 2018 at 12:00 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
>> Before trying to apply directory renames to paths within the given
>> directories, we want to make sure that there aren't conflicts at the
>> directory level.  There will be additional checks at the individual
>> file level too, which will be added later.
>
>
>> +static int tree_has_path(struct tree *tree, const char *path)
>> +{
>> +       unsigned char hashy[20];
>
> What is 20? ;)
> I think you want to use GIT_MAX_RAWSZ instead.

Indeed, thanks!

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 27/31] merge-recursive: fix overwriting dirty files involved in renames
  2018-02-05 20:52   ` Stefan Beller
@ 2018-02-05 21:26     ` Elijah Newren
  0 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-02-05 21:26 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Mon, Feb 5, 2018 at 12:52 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
>> This fixes an issue that existed before my directory rename detection
>> patches that affects both normal renames and renames implied by
>> directory rename detection.  Additional codepaths that only affect
>> overwriting of directy files that are involved in directory rename

Ugh, "dirty" not "directy".  I must have gotten my fingers trained to
type "directory" too much.  I'll fix that up.

>> detection will be added in a subsequent commit.
>>
>> Signed-off-by: Elijah Newren <newren@gmail.com>
>
> So this fixes bugs in the current rename detection with
> overwriting untracked files? Then this is an additional
> selling point of this series, maybe worth covering in the
> cover letter!

Yes, with a nitpick: the existing issue it fixes is with dirty files
(by which I mean uncommitted changes to tracked files) involved in
renames rather than being an issue with untracked files.

I did mention this fix in my original cover letter[1], but it would
have been really easy to miss because it was a really long cover
letter, and the mention came at the very end.  Quoting from it:

"""
These last three deal with untracked and dirty file overwriting
headaches.  The middle patch in particular, isn't just a fix for
directory rename detection but fixes a bug in current versions of git
in overwriting dirty files that are involved in a rename.  That patch
could be backported and submitted independent of this series, but the
final patch depends heavily on it.
"""

[1] https://public-inbox.org/git/20171110190550.27059-1-newren@gmail.com/

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames
  2018-02-05 19:44       ` Stefan Beller
@ 2018-02-05 21:27         ` Elijah Newren
  0 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-02-05 21:27 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Mon, Feb 5, 2018 at 11:44 AM, Stefan Beller <sbeller@google.com> wrote:
>>> Having a stringlist of potentially new dirs sounds like the algorithm is
>>> at least n^2, but how do I know? I'll read on.
>>
>> Yes, I suppose it's technically n^2, but n is expected to be O(1).
>> While one can trivially construct a case making n arbitrarily large,
>> statistically for real world repositories, I expected the mode of n to
>> be 1 and the mean to be less than 2.  My original idea was to use a
>> hash for possible_new_dirs, but since hashes are so painful in C and n
>> should be very small anyway, I didn't bother.  If anyone can find an
>> example of a real world open source repository (linux, webkit, git,
>> etc.) with a merge where n is greater than about 10, I'll be
>> surprised.
>>
>> Does that address your concern, or does it sound like I'm kicking the
>> can down the road?  If it's the latter, we can switch it out.
>
> I think that is fine for now; the real world usage matters more
> than the big O notation. But maybe you want to hint at the possibility of
> speedup (in the commit message or in code?), once someone produces
> a slow case and digs up the code.

Sounds like a good idea; I'll add a comment about this issue to the
commit message.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 28/31] merge-recursive: fix remaining directory rename + dirty overwrite cases
  2018-01-30 23:25 ` [PATCH v7 28/31] merge-recursive: fix remaining directory rename + dirty overwrite cases Elijah Newren
@ 2018-02-05 21:52   ` Stefan Beller
  2018-02-05 22:18     ` Elijah Newren
  0 siblings, 1 reply; 63+ messages in thread
From: Stefan Beller @ 2018-02-05 21:52 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c                   | 26 +++++++++++++++++++++++---
>  t/t6043-merge-rename-directories.sh |  8 ++++----
>  2 files changed, 27 insertions(+), 7 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index fba1a0d207..62e4266d21 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c
> @@ -1320,11 +1320,23 @@ static int handle_file(struct merge_options *o,
>
>         add = filespec_from_entry(&other, dst_entry, stage ^ 1);
>         if (add) {
> +               int ren_src_was_dirty = was_dirty(o, rename->path);
>                 char *add_name = unique_path(o, rename->path, other_branch);
>                 if (update_file(o, 0, &add->oid, add->mode, add_name))
>                         return -1;
>
> -               remove_file(o, 0, rename->path, 0);
> +               if (ren_src_was_dirty) {
> +                       output(o, 1, _("Refusing to lose dirty file at %s"),
> +                              rename->path);
> +               }
> +               /*
> +                * Stupid double negatives in remove_file; it somehow manages
> +                * to repeatedly mess me up.  So, just for myself:
> +                *    1) update_wd iff !ren_src_was_dirty.
> +                *    2) no_wd iff !update_wd
> +                *    3) so, no_wd == !!ren_src_was_dirty == ren_src_was_dirty
> +                */

Not sure iff this comment is at the right place and is a good addition to
the code base.

However it hints at the underlying issue of a bad API that is provided
by remove_file ?

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 31/31] merge-recursive: ensure we write updates for directory-renamed file
  2018-01-30 23:25 ` [PATCH v7 31/31] merge-recursive: ensure we write updates for directory-renamed file Elijah Newren
@ 2018-02-05 21:58   ` Stefan Beller
  0 siblings, 0 replies; 63+ messages in thread
From: Stefan Beller @ 2018-02-05 21:58 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:
> When a file is present in HEAD before the merge and the other side of the
> merge does not modify that file, we try to avoid re-writing the file and
> making it stat-dirty.  However, when a file is present in HEAD before the
> merge and was in a directory that was renamed by the other side of the
> merge, we have to move the file to a new location and re-write it.
> Update the code that checks whether we can skip the update to also work in
> the presence of directory renames.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---

Now I did not just review the tests, but also the code of this series.
(though I think I was more detail oriented on Friday)
All patches that had no comments are fine with me,
I left some comments on others.

Thanks,
Stefan

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 28/31] merge-recursive: fix remaining directory rename + dirty overwrite cases
  2018-02-05 21:52   ` Stefan Beller
@ 2018-02-05 22:18     ` Elijah Newren
  0 siblings, 0 replies; 63+ messages in thread
From: Elijah Newren @ 2018-02-05 22:18 UTC (permalink / raw)
  To: Stefan Beller
  Cc: Junio C Hamano, git, SZEDER Gábor, Jonathan Nieder, Jeff King

On Mon, Feb 5, 2018 at 1:52 PM, Stefan Beller <sbeller@google.com> wrote:
> On Tue, Jan 30, 2018 at 3:25 PM, Elijah Newren <newren@gmail.com> wrote:

>> +               /*
>> +                * Stupid double negatives in remove_file; it somehow manages
>> +                * to repeatedly mess me up.  So, just for myself:
>> +                *    1) update_wd iff !ren_src_was_dirty.
>> +                *    2) no_wd iff !update_wd
>> +                *    3) so, no_wd == !!ren_src_was_dirty == ren_src_was_dirty
>> +                */
>
> Not sure iff this comment is at the right place and is a good addition to
> the code base.

Fair enough, and I should apologize for letting my frustration come
through there.  However, what if I replaced the first two lines of the
comment with:

"Because the double negatives somehow keep confusing me..."

so that it reads:
        /*
         * Because the double negatives somehow keep confusing me...
         *    1) update_wd iff !ren_src_was_dirty.
         *    2) no_wd iff !update_wd
         *    3) so, no_wd == !!ren_src_was_dirty == ren_src_was_dirty
         */

Even if my wording was suboptimal, the rest of the comment did seem
pretty important because I messed up the line after the comment
multiple times.  (You'd think that the odds of getting it right should
be 50/50 and that a simple inversion would fix it, so one could only
mess the line up once, but I'm apparently special).  And then I came
back to look at it later and was still confused.  For some reason, I
seem to need the longer explanation.

> However it hints at the underlying issue of a bad API that is provided
> by remove_file ?

I'd definitely agree with that.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: [PATCH v7 25/31] merge-recursive: apply necessary modifications for directory renames
  2018-01-30 23:25 ` [PATCH v7 25/31] merge-recursive: apply necessary modifications for directory renames Elijah Newren
@ 2018-02-16  1:14   ` SZEDER Gábor
  0 siblings, 0 replies; 63+ messages in thread
From: SZEDER Gábor @ 2018-02-16  1:14 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, Git mailing list, Stefan Beller, Jonathan Nieder,
	Jeff King

On Wed, Jan 31, 2018 at 12:25 AM, Elijah Newren <newren@gmail.com> wrote:
> This commit hooks together all the directory rename logic by making the
> necessary changes to the rename struct, it's dst_entry, and the
> diff_filepair under consideration.
>
> Signed-off-by: Elijah Newren <newren@gmail.com>
> ---
>  merge-recursive.c                   | 187 +++++++++++++++++++++++++++++++++++-
>  t/t6043-merge-rename-directories.sh |  50 +++++-----
>  2 files changed, 211 insertions(+), 26 deletions(-)
>
> diff --git a/merge-recursive.c b/merge-recursive.c
> index 38dc0eefaf..7c78dc2dc1 100644
> --- a/merge-recursive.c
> +++ b/merge-recursive.c

> @@ -641,6 +643,27 @@ static int update_stages(struct merge_options *opt, const char *path,
>         return 0;
>  }
>
> +static int update_stages_for_stage_data(struct merge_options *opt,
> +                                       const char *path,
> +                                       const struct stage_data *stage_data)
> +{
> +       struct diff_filespec o, a, b;
> +
> +       o.mode = stage_data->stages[1].mode;
> +       oidcpy(&o.oid, &stage_data->stages[1].oid);
> +
> +       a.mode = stage_data->stages[2].mode;
> +       oidcpy(&a.oid, &stage_data->stages[2].oid);
> +
> +       b.mode = stage_data->stages[3].mode;
> +       oidcpy(&b.oid, &stage_data->stages[3].oid);
> +
> +       return update_stages(opt, path,
> +                            is_null_sha1(o.oid.hash) ? NULL : &o,
> +                            is_null_sha1(a.oid.hash) ? NULL : &a,
> +                            is_null_sha1(b.oid.hash) ? NULL : &b);

Please use is_null_oid(&o.oid) etc. instead of is_null_sha1().

^ permalink raw reply	[flat|nested] 63+ messages in thread

end of thread, other threads:[~2018-02-16  1:14 UTC | newest]

Thread overview: 63+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-01-30 23:25 [PATCH v7 00/31] Add directory rename detection to git Elijah Newren
2018-01-30 23:25 ` [PATCH v7 01/31] directory rename detection: basic testcases Elijah Newren
2018-01-30 23:25 ` [PATCH v7 02/31] directory rename detection: directory splitting testcases Elijah Newren
2018-01-30 23:25 ` [PATCH v7 03/31] directory rename detection: testcases to avoid taking detection too far Elijah Newren
2018-01-30 23:25 ` [PATCH v7 04/31] directory rename detection: partially renamed directory testcase/discussion Elijah Newren
2018-01-30 23:25 ` [PATCH v7 05/31] directory rename detection: files/directories in the way of some renames Elijah Newren
2018-01-30 23:25 ` [PATCH v7 06/31] directory rename detection: testcases checking which side did the rename Elijah Newren
2018-01-30 23:25 ` [PATCH v7 07/31] directory rename detection: more involved edge/corner testcases Elijah Newren
2018-01-30 23:25 ` [PATCH v7 08/31] directory rename detection: testcases exploring possibly suboptimal merges Elijah Newren
2018-01-30 23:25 ` [PATCH v7 09/31] directory rename detection: miscellaneous testcases to complete coverage Elijah Newren
2018-01-30 23:25 ` [PATCH v7 10/31] directory rename detection: tests for handling overwriting untracked files Elijah Newren
2018-01-30 23:25 ` [PATCH v7 11/31] directory rename detection: tests for handling overwriting dirty files Elijah Newren
2018-01-30 23:25 ` [PATCH v7 12/31] merge-recursive: move the get_renames() function Elijah Newren
2018-02-02 23:27   ` Stefan Beller
     [not found]     ` <CABPp-BFDgDDa_fPSFJQUSzR1k5-ix0SWrviUPFu+SCoyWfG5cQ@mail.gmail.com>
2018-02-05 18:57       ` Stefan Beller
2018-01-30 23:25 ` [PATCH v7 13/31] merge-recursive: introduce new functions to handle rename logic Elijah Newren
2018-02-02 23:36   ` Stefan Beller
2018-01-30 23:25 ` [PATCH v7 14/31] merge-recursive: fix leaks of allocated renames and diff_filepairs Elijah Newren
2018-02-02 23:41   ` Stefan Beller
2018-01-30 23:25 ` [PATCH v7 15/31] merge-recursive: make !o->detect_rename codepath more obvious Elijah Newren
2018-02-02 23:48   ` Stefan Beller
2018-01-30 23:25 ` [PATCH v7 16/31] merge-recursive: split out code for determining diff_filepairs Elijah Newren
2018-02-03  0:06   ` Stefan Beller
2018-02-03  1:43     ` Elijah Newren
2018-01-30 23:25 ` [PATCH v7 17/31] merge-recursive: add a new hashmap for storing directory renames Elijah Newren
2018-02-03  0:26   ` Stefan Beller
2018-02-03 21:34     ` Elijah Newren
2018-02-04  8:54       ` Johannes Sixt
2018-02-05 14:56         ` Elijah Newren
2018-02-05 20:01         ` Junio C Hamano
2018-02-05 19:44       ` Stefan Beller
2018-02-05 21:27         ` Elijah Newren
2018-01-30 23:25 ` [PATCH v7 18/31] merge-recursive: make a helper function for cleanup for handle_renames Elijah Newren
2018-02-03  0:31   ` Stefan Beller
2018-01-30 23:25 ` [PATCH v7 19/31] merge-recursive: add get_directory_renames() Elijah Newren
2018-02-03  1:02   ` Stefan Beller
2018-02-03 22:32     ` Elijah Newren
2018-02-04  2:04       ` Elijah Newren
2018-02-04  4:42         ` Eric Sunshine
2018-02-04  4:44           ` Eric Sunshine
2018-02-05 19:39       ` Stefan Beller
2018-01-30 23:25 ` [PATCH v7 20/31] merge-recursive: check for directory level conflicts Elijah Newren
2018-02-05 20:00   ` Stefan Beller
2018-02-05 21:12     ` Elijah Newren
2018-01-30 23:25 ` [PATCH v7 21/31] merge-recursive: add a new hashmap for storing file collisions Elijah Newren
2018-02-05 20:02   ` Stefan Beller
2018-01-30 23:25 ` [PATCH v7 22/31] merge-recursive: add computation of collisions due to dir rename & merging Elijah Newren
2018-01-30 23:25 ` [PATCH v7 23/31] merge-recursive: check for file level conflicts then get new name Elijah Newren
2018-01-30 23:25 ` [PATCH v7 24/31] merge-recursive: when comparing files, don't include trees Elijah Newren
2018-01-30 23:25 ` [PATCH v7 25/31] merge-recursive: apply necessary modifications for directory renames Elijah Newren
2018-02-16  1:14   ` SZEDER Gábor
2018-01-30 23:25 ` [PATCH v7 26/31] merge-recursive: avoid clobbering untracked files with " Elijah Newren
2018-01-30 23:25 ` [PATCH v7 27/31] merge-recursive: fix overwriting dirty files involved in renames Elijah Newren
2018-02-05 20:52   ` Stefan Beller
2018-02-05 21:26     ` Elijah Newren
2018-01-30 23:25 ` [PATCH v7 28/31] merge-recursive: fix remaining directory rename + dirty overwrite cases Elijah Newren
2018-02-05 21:52   ` Stefan Beller
2018-02-05 22:18     ` Elijah Newren
2018-01-30 23:25 ` [PATCH v7 29/31] directory rename detection: new testcases showcasing a pair of bugs Elijah Newren
2018-01-30 23:25 ` [PATCH v7 30/31] merge-recursive: avoid spurious rename/rename conflict from dir renames Elijah Newren
2018-01-30 23:25 ` [PATCH v7 31/31] merge-recursive: ensure we write updates for directory-renamed file Elijah Newren
2018-02-05 21:58   ` Stefan Beller
2018-01-30 23:41 ` [PATCH v7 00/31] Add directory rename detection to git Elijah Newren

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).