* [PATCH 0/3] Fix dual rename into each other plus conflicting adds @ 2022-06-22 4:20 Elijah Newren via GitGitGadget 2022-06-22 4:20 ` [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget ` (4 more replies) 0 siblings, 5 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-06-22 4:20 UTC (permalink / raw) To: git; +Cc: Elijah Newren This series adds some testcases based on the tensorflow repository issue reported by Glen Choo at [1], demonstrating bugs in both the ort and recursive strategies. It also provides a fix for the ort strategy. [1] https://lore.kernel.org/git/kl6lee006mle.fsf@chooglen-macbookpro.roam.corp.google.com/ Elijah Newren (3): t6423: add tests of dual directory rename plus add/add conflict merge-ort: shuffle the computation and cleanup of potential collisions merge-ort: fix issue with dual rename and add/add conflict merge-ort.c | 60 ++++++++++------ t/t6423-merge-rename-directories.sh | 102 ++++++++++++++++++++++++++++ 2 files changed, 142 insertions(+), 20 deletions(-) base-commit: e54793a95afeea1e10de1e5ad7eab914e7416250 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1268%2Fnewren%2Ffix-dual-rename-into-each-other-plus-conflicting-adds-v1 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1268/newren/fix-dual-rename-into-each-other-plus-conflicting-adds-v1 Pull-Request: https://github.com/gitgitgadget/git/pull/1268 -- gitgitgadget ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-22 4:20 [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget @ 2022-06-22 4:20 ` Elijah Newren via GitGitGadget 2022-06-27 18:20 ` Jonathan Tan 2022-06-27 22:30 ` Calvin Wan 2022-06-22 4:20 ` [PATCH 2/3] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget ` (3 subsequent siblings) 4 siblings, 2 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-06-22 4:20 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> This is an attempt at minimalizing a testcase reported by Glen Choo with tensorflow where merge-ort would report an assertion failure: Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410 reversing the direction of the merge provides a different error: error: cache entry has null sha1: ... fatal: unable to write .git/index so we add testcases for both. With these new testcases, the recursive strategy differs in that it returns the latter error for both merge directions. These testcases are somehow a little different than Glen's original tensorflow testcase in that these ones trigger a bug with the recursive algorithm whereas his testcase didn't. I figure that means these testcases somehow manage to be more comprehensive. Reported-by: Glen Choo <chooglen@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> --- t/t6423-merge-rename-directories.sh | 102 ++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index 479db32cd62..296c04f8046 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5199,6 +5199,108 @@ test_expect_success '12k: Directory rename with sibling causes rename-to-self' ' ) ' +# Testcase 12l, Both sides rename a directory into the other side, both add +# a file with after directory renames are the same filename +# Commit O: sub1/file, sub2/other +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} +# +# In words: +# A: sub1/ -> sub3/, add sub2/new_add_add_file_1 +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 +# +# Expected: sub3/{file, newfile, sub2/other} +# CONFLICT (add/add): sub3/sub2/new_add_add_file + +test_setup_12l () { + test_create_repo 12l_$1 && + ( + cd 12l_$1 && + + mkdir -p sub1 sub2 + echo file >sub1/file && + echo other >sub2/other && + git add sub1 sub2 && + git commit -m "O" && + + git branch O && + git branch A && + git branch B && + + git checkout A && + git mv sub1 sub3 && + echo conflicting >sub2/new_add_add_file && + git add sub2 && + test_tick && + git add -u && + git commit -m "A" && + + git checkout B && + echo dissimilar >sub2/new_add_add_file && + echo brand >sub1/newfile && + git add sub1 sub2 && + git mv sub2 sub1 && + test_tick && + git commit -m "B" + ) +} + +test_expect_merge_algorithm failure failure '12l (B into A): Rename into each other + add/add conflict' ' + test_setup_12l BintoA && + ( + cd 12l_BintoA && + + git checkout -q A^0 && + + test_must_fail git -c merge.directoryRenames=true merge -s recursive B^0 && + + git ls-files -s >out && + test_line_count = 5 out && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :2:sub1/sub2/new_add_add_file \ + :3:sub1/sub2/new_add_add_file && + git rev-parse >expect \ + O:sub1/file B:sub1/newfile O:sub2/other \ + A:sub2/new_add_add_file \ + B:sub1/sub2/new_add_add_file && + test_cmp expect actual && + + git ls-files -o >actual && + test_write_lines actual expect out >expect && + test_cmp expect actual + ) +' + +test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' + test_setup_12l AintoB && + ( + cd 12l_AintoB && + + git checkout -q B^0 && + + test_must_fail git -c merge.directoryRenames=true merge -s recursive A^0 && + + git ls-files -s >out && + test_line_count = 5 out && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :2:sub1/sub2/new_add_add_file \ + :3:sub1/sub2/new_add_add_file && + git rev-parse >expect \ + O:sub1/file B:sub1/newfile O:sub2/other \ + B:sub1/sub2/new_add_add_file \ + A:sub2/new_add_add_file && + test_cmp expect actual && + + git ls-files -o >actual && + test_write_lines actual expect out >expect && + test_cmp expect actual + ) +' + ########################################################################### # SECTION 13: Checking informational and conflict messages # -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-22 4:20 ` [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget @ 2022-06-27 18:20 ` Jonathan Tan 2022-06-30 0:06 ` Elijah Newren 2022-06-27 22:30 ` Calvin Wan 1 sibling, 1 reply; 45+ messages in thread From: Jonathan Tan @ 2022-06-27 18:20 UTC (permalink / raw) To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > From: Elijah Newren <newren@gmail.com> > > This is an attempt at minimalizing a testcase reported by Glen Choo > with tensorflow where merge-ort would report an assertion failure: > > Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410 First of all, thanks for this fix - I've verified with Glen Choo's test cases and it does fix it. > diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh > index 479db32cd62..296c04f8046 100755 > --- a/t/t6423-merge-rename-directories.sh > +++ b/t/t6423-merge-rename-directories.sh > @@ -5199,6 +5199,108 @@ test_expect_success '12k: Directory rename with sibling causes rename-to-self' ' > ) > ' > > +# Testcase 12l, Both sides rename a directory into the other side, both add > +# a file with after directory renames are the same filename > +# Commit O: sub1/file, sub2/other > +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} > +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} > +# > +# In words: > +# A: sub1/ -> sub3/, add sub2/new_add_add_file_1 > +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 > +# > +# Expected: sub3/{file, newfile, sub2/other} > +# CONFLICT (add/add): sub3/sub2/new_add_add_file The "expected" might need to be updated. After all patches are applied, this is the tree I get (note that it's "sub1/sub2/new_add_add_file, not "sub3/sub2/new_add_add_file"): . |-- sub1 | `-- sub2 | `-- new_add_add_file `-- sub3 |-- file |-- newfile `-- sub2 `-- other Also, at first glance, "newfile" shouldn't belong in a minimal reproduction, but it somehow changes the output. When I apply this diff (to the state after all patches are applied): diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index 4286ae987c..9472fb7233 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5237,7 +5237,6 @@ test_setup_12l () { git checkout B && echo dissimilar >sub2/new_add_add_file && - echo brand >sub1/newfile && git add sub1 sub2 && git mv sub2 sub1 && test_tick && @@ -5255,14 +5254,14 @@ test_expect_merge_algorithm failure success '12l (B into A): Rename into each ot test_must_fail git -c merge.directoryRenames=true merge -s recursive B^0 && git ls-files -s >out && - test_line_count = 5 out && + test_line_count = 4 out && git rev-parse >actual \ - :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :0:sub3/file :0:sub3/sub2/other \ :2:sub1/sub2/new_add_add_file \ :3:sub1/sub2/new_add_add_file && git rev-parse >expect \ - O:sub1/file B:sub1/newfile O:sub2/other \ + O:sub1/file O:sub2/other \ A:sub2/new_add_add_file \ B:sub1/sub2/new_add_add_file && test_cmp expect actual && I get: . |-- sub1 | `-- sub2 | |-- new_add_add_file | `-- other `-- sub3 `-- file (Note the path to "other".) I haven't figured out what's going on, though. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-27 18:20 ` Jonathan Tan @ 2022-06-30 0:06 ` Elijah Newren 2022-06-30 22:32 ` Jonathan Tan 0 siblings, 1 reply; 45+ messages in thread From: Elijah Newren @ 2022-06-30 0:06 UTC (permalink / raw) To: Jonathan Tan; +Cc: Elijah Newren via GitGitGadget, Git Mailing List On Mon, Jun 27, 2022 at 11:20 AM Jonathan Tan <jonathantanmy@google.com> wrote: > > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > From: Elijah Newren <newren@gmail.com> > > > > This is an attempt at minimalizing a testcase reported by Glen Choo > > with tensorflow where merge-ort would report an assertion failure: > > > > Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410 > > First of all, thanks for this fix - I've verified with Glen Choo's test > cases and it does fix it. :-) > > diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh > > index 479db32cd62..296c04f8046 100755 > > --- a/t/t6423-merge-rename-directories.sh > > +++ b/t/t6423-merge-rename-directories.sh > > @@ -5199,6 +5199,108 @@ test_expect_success '12k: Directory rename with sibling causes rename-to-self' ' > > ) > > ' > > > > +# Testcase 12l, Both sides rename a directory into the other side, both add > > +# a file with after directory renames are the same filename > > +# Commit O: sub1/file, sub2/other > > +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} > > +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} > > +# > > +# In words: > > +# A: sub1/ -> sub3/, add sub2/new_add_add_file_1 > > +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 > > +# > > +# Expected: sub3/{file, newfile, sub2/other} > > +# CONFLICT (add/add): sub3/sub2/new_add_add_file > > The "expected" might need to be updated. Oops! Yes, I kept changing and editing the testcase and the comments...but didn't quite get all the updates I needed when I was revising. > After all patches are applied, > this is the tree I get (note that it's "sub1/sub2/new_add_add_file, not > "sub3/sub2/new_add_add_file"): > > . > |-- sub1 > | `-- sub2 > | `-- new_add_add_file > `-- sub3 > |-- file > |-- newfile > `-- sub2 > `-- other Yes, that's right. > Also, at first glance, "newfile" shouldn't belong in a minimal > reproduction Ah, I can see you've looked at this very closely. Thanks for digging in! I know it's counter-intuitive at first, but the file is necessary in order to get the sub1/ -> sub3/ rename. The reasoning is as follows: We don't need to detect a directory rename for a directory where the other side added no new files into that directory...because the whole point of directory renames is to move new files in a directory to the new location. Therefore, no new files in the directory on one side, means no need to detect a directory rename for it on the other side. For a deeper discussion of this, see commit c64432aacd (t6423: more involved rules for renaming directories into each other, 2020-10-15). >, but it somehow changes the output. When I apply this diff > (to the state after all patches are applied): > > diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh > index 4286ae987c..9472fb7233 100755 > --- a/t/t6423-merge-rename-directories.sh > +++ b/t/t6423-merge-rename-directories.sh > @@ -5237,7 +5237,6 @@ test_setup_12l () { > > git checkout B && > echo dissimilar >sub2/new_add_add_file && > - echo brand >sub1/newfile && > git add sub1 sub2 && > git mv sub2 sub1 && > test_tick && > @@ -5255,14 +5254,14 @@ test_expect_merge_algorithm failure success '12l (B into A): Rename into each ot > test_must_fail git -c merge.directoryRenames=true merge -s recursive B^0 && > > git ls-files -s >out && > - test_line_count = 5 out && > + test_line_count = 4 out && > > git rev-parse >actual \ > - :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ > + :0:sub3/file :0:sub3/sub2/other \ > :2:sub1/sub2/new_add_add_file \ > :3:sub1/sub2/new_add_add_file && > git rev-parse >expect \ > - O:sub1/file B:sub1/newfile O:sub2/other \ > + O:sub1/file O:sub2/other \ > A:sub2/new_add_add_file \ > B:sub1/sub2/new_add_add_file && > test_cmp expect actual && > > I get: > > . > |-- sub1 > | `-- sub2 > | |-- new_add_add_file > | `-- other > `-- sub3 > `-- file > > (Note the path to "other".) I haven't figured out what's going on, > though. Yeah, this is the result of having no directory rename due to having no new files that need to be moved by a directory rename. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-30 0:06 ` Elijah Newren @ 2022-06-30 22:32 ` Jonathan Tan 2022-07-01 2:57 ` Elijah Newren 0 siblings, 1 reply; 45+ messages in thread From: Jonathan Tan @ 2022-06-30 22:32 UTC (permalink / raw) To: Elijah Newren Cc: Jonathan Tan, Elijah Newren via GitGitGadget, Git Mailing List Elijah Newren <newren@gmail.com> writes: > Ah, I can see you've looked at this very closely. Thanks for digging > in! I know it's counter-intuitive at first, but the file is > necessary in order to get the sub1/ -> sub3/ rename. The reasoning is > as follows: We don't need to detect a directory rename for a directory > where the other side added no new files into that directory...because > the whole point of directory renames is to move new files in a > directory to the new location. Therefore, no new files in the > directory on one side, means no need to detect a directory rename for > it on the other side. For a deeper discussion of this, see commit > c64432aacd (t6423: more involved rules for renaming directories into > each other, 2020-10-15). Thanks! This makes sense. Might be worth including as a comment (explaining why "newfile" is there) in the test. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-30 22:32 ` Jonathan Tan @ 2022-07-01 2:57 ` Elijah Newren 0 siblings, 0 replies; 45+ messages in thread From: Elijah Newren @ 2022-07-01 2:57 UTC (permalink / raw) To: Jonathan Tan; +Cc: Elijah Newren via GitGitGadget, Git Mailing List On Thu, Jun 30, 2022 at 3:32 PM Jonathan Tan <jonathantanmy@google.com> wrote: > > Elijah Newren <newren@gmail.com> writes: > > Ah, I can see you've looked at this very closely. Thanks for digging > > in! I know it's counter-intuitive at first, but the file is > > necessary in order to get the sub1/ -> sub3/ rename. The reasoning is > > as follows: We don't need to detect a directory rename for a directory > > where the other side added no new files into that directory...because > > the whole point of directory renames is to move new files in a > > directory to the new location. Therefore, no new files in the > > directory on one side, means no need to detect a directory rename for > > it on the other side. For a deeper discussion of this, see commit > > c64432aacd (t6423: more involved rules for renaming directories into > > each other, 2020-10-15). > > Thanks! This makes sense. Might be worth including as a comment > (explaining why "newfile" is there) in the test. Sure, will do. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-22 4:20 ` [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-06-27 18:20 ` Jonathan Tan @ 2022-06-27 22:30 ` Calvin Wan 2022-06-30 0:07 ` Elijah Newren 1 sibling, 1 reply; 45+ messages in thread From: Calvin Wan @ 2022-06-27 22:30 UTC (permalink / raw) To: Elijah Newren via GitGitGadget; +Cc: Calvin Wan, git, Elijah Newren "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > +# Testcase 12l, Both sides rename a directory into the other side, both add > +# a file with after directory renames are the same filename > +# Commit O: sub1/file, sub2/other > +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} > +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} > +# > +# In words: > +# A: sub1/ -> sub3/, add sub2/new_add_add_file_1 > +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 > +# > +# Expected: sub3/{file, newfile, sub2/other} > +# CONFLICT (add/add): sub3/sub2/new_add_add_file Grammatically, I could not understand "both add a file with after directory renames are the same filename" I also found the same issue with `Expected` that Jonathan mentions. I ran the command separately and got CONFLICT (add/add): Merge conflict in sub1/sub2/new_add_add_file ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-27 22:30 ` Calvin Wan @ 2022-06-30 0:07 ` Elijah Newren 0 siblings, 0 replies; 45+ messages in thread From: Elijah Newren @ 2022-06-30 0:07 UTC (permalink / raw) To: Calvin Wan; +Cc: Elijah Newren via GitGitGadget, Git Mailing List On Mon, Jun 27, 2022 at 3:30 PM Calvin Wan <calvinwan@google.com> wrote: > > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > > +# Testcase 12l, Both sides rename a directory into the other side, both add > > +# a file with after directory renames are the same filename > > +# Commit O: sub1/file, sub2/other > > +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} > > +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} > > +# > > +# In words: > > +# A: sub1/ -> sub3/, add sub2/new_add_add_file_1 > > +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 > > +# > > +# Expected: sub3/{file, newfile, sub2/other} > > +# CONFLICT (add/add): sub3/sub2/new_add_add_file > > Grammatically, I could not understand > > "both add a file with after directory renames are the same filename" Oops, that should be "with" -> "which". > I also found the same issue with `Expected` that Jonathan mentions. I ran the > command separately and got > > CONFLICT (add/add): Merge conflict in sub1/sub2/new_add_add_file Yeah, repeatedly revising stuff is fine as long as you remember to update all the parts. (I was swapping which directories were named "sub1", "sub2" and "sub3" to see if lexicographic ordering might be related to why this simpler test triggered a bug in the "recursive" strategy but Glen's bigger testcase didn't.). Anyway, I just missed updating one of the locations while doing that revision; sorry about that. Will fix. ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 2/3] merge-ort: shuffle the computation and cleanup of potential collisions 2022-06-22 4:20 [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-06-22 4:20 ` [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget @ 2022-06-22 4:20 ` Elijah Newren via GitGitGadget 2022-06-27 18:48 ` Jonathan Tan 2022-06-22 4:20 ` [PATCH 3/3] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget ` (2 subsequent siblings) 4 siblings, 1 reply; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-06-22 4:20 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> Run compute_collisions() for renames on both sides of history before any calls to collect_renames(), and do not free the computed collisions until after both calls to collect_renames(). This is just a code reorganization at this point that doesn't make sense on its own, but will permit us to use the computed collision info from both sides within each call to collect_renames() in a subsequent commit. Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-ort.c | 56 ++++++++++++++++++++++++++++++++++------------------- 1 file changed, 36 insertions(+), 20 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 8545354dafd..fa6667de18c 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2259,6 +2259,27 @@ static void compute_collisions(struct strmap *collisions, } } +static void free_collisions(struct strmap *collisions) +{ + struct hashmap_iter iter; + struct strmap_entry *entry; + + /* Free each value in the collisions map */ + strmap_for_each_entry(collisions, &iter, entry) { + struct collision_info *info = entry->value; + string_list_clear(&info->source_files, 0); + } + /* + * In compute_collisions(), we set collisions.strdup_strings to 0 + * so that we wouldn't have to make another copy of the new_path + * allocated by apply_dir_rename(). But now that we've used them + * and have no other references to these strings, it is time to + * deallocate them. + */ + free_strmap_strings(collisions); + strmap_clear(collisions, 1); +} + static char *check_for_directory_rename(struct merge_options *opt, const char *path, unsigned side_index, @@ -2314,7 +2335,8 @@ static char *check_for_directory_rename(struct merge_options *opt, } new_path = handle_path_level_conflicts(opt, path, side_index, - rename_info, collisions); + rename_info, + &collisions[side_index]); *clean_merge &= (new_path != NULL); return new_path; @@ -3023,18 +3045,15 @@ static int detect_regular_renames(struct merge_options *opt, static int collect_renames(struct merge_options *opt, struct diff_queue_struct *result, unsigned side_index, + struct strmap *collisions, struct strmap *dir_renames_for_side, struct strmap *rename_exclusions) { int i, clean = 1; - struct strmap collisions; struct diff_queue_struct *side_pairs; - struct hashmap_iter iter; - struct strmap_entry *entry; struct rename_info *renames = &opt->priv->renames; side_pairs = &renames->pairs[side_index]; - compute_collisions(&collisions, dir_renames_for_side, side_pairs); for (i = 0; i < side_pairs->nr; ++i) { struct diff_filepair *p = side_pairs->queue[i]; @@ -3050,7 +3069,7 @@ static int collect_renames(struct merge_options *opt, side_index, dir_renames_for_side, rename_exclusions, - &collisions, + collisions, &clean); possibly_cache_new_pair(renames, p, side_index, new_path); @@ -3076,20 +3095,6 @@ static int collect_renames(struct merge_options *opt, result->queue[result->nr++] = p; } - /* Free each value in the collisions map */ - strmap_for_each_entry(&collisions, &iter, entry) { - struct collision_info *info = entry->value; - string_list_clear(&info->source_files, 0); - } - /* - * In compute_collisions(), we set collisions.strdup_strings to 0 - * so that we wouldn't have to make another copy of the new_path - * allocated by apply_dir_rename(). But now that we've used them - * and have no other references to these strings, it is time to - * deallocate them. - */ - free_strmap_strings(&collisions); - strmap_clear(&collisions, 1); return clean; } @@ -3100,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, { struct diff_queue_struct combined = { 0 }; struct rename_info *renames = &opt->priv->renames; + struct strmap collisions[3]; int need_dir_renames, s, i, clean = 1; unsigned detection_run = 0; @@ -3149,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, ALLOC_GROW(combined.queue, renames->pairs[1].nr + renames->pairs[2].nr, combined.alloc); + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { + int other_side = 3 - i; + compute_collisions(&collisions[i], + &renames->dir_renames[other_side], + &renames->pairs[i]); + } clean &= collect_renames(opt, &combined, MERGE_SIDE1, + collisions, &renames->dir_renames[2], &renames->dir_renames[1]); clean &= collect_renames(opt, &combined, MERGE_SIDE2, + collisions, &renames->dir_renames[1], &renames->dir_renames[2]); + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) + free_collisions(&collisions[i]); STABLE_QSORT(combined.queue, combined.nr, compare_pairs); trace2_region_leave("merge", "directory renames", opt->repo); -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 2/3] merge-ort: shuffle the computation and cleanup of potential collisions 2022-06-22 4:20 ` [PATCH 2/3] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget @ 2022-06-27 18:48 ` Jonathan Tan 2022-06-27 21:04 ` Calvin Wan 0 siblings, 1 reply; 45+ messages in thread From: Jonathan Tan @ 2022-06-27 18:48 UTC (permalink / raw) To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > @@ -2314,7 +2335,8 @@ static char *check_for_directory_rename(struct merge_options *opt, > } > > new_path = handle_path_level_conflicts(opt, path, side_index, > - rename_info, collisions); > + rename_info, > + &collisions[side_index]); Is this a fix of a latent bug? handle_path_level_conflicts() is not changed in this patch. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 2/3] merge-ort: shuffle the computation and cleanup of potential collisions 2022-06-27 18:48 ` Jonathan Tan @ 2022-06-27 21:04 ` Calvin Wan 2022-06-30 0:05 ` Elijah Newren 0 siblings, 1 reply; 45+ messages in thread From: Calvin Wan @ 2022-06-27 21:04 UTC (permalink / raw) To: Jonathan Tan Cc: Calvin Wan, Elijah Newren via GitGitGadget, git, Elijah Newren Jonathan Tan <jonathantanmy@google.com> writes: > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > > @@ -2314,7 +2335,8 @@ static char *check_for_directory_rename(struct merge_options *opt, > > } > > > > new_path = handle_path_level_conflicts(opt, path, side_index, > > - rename_info, collisions); > > + rename_info, > > + &collisions[side_index]); > > Is this a fix of a latent bug? handle_path_level_conflicts() is not > changed in this patch. > I don't think so. IIUC this is what's happening given the callstack: detect_and_process_renames() - Now defines `struct strmap collisions[3];` and computes all three collisions here - Passes collisions into collect_renames() collect_renames() - Originally defined as `struct strmap collisions;` and computed collisions in here - Now takes collisions as an argument - Passes collisions into check_for_directory_rename() check_for_directory_rename() - Collisions isn't used in this function at all except to pass into handle_path_level_conflicts handle_path_level_conflicts() - Expecting pointer to singular collisions, not an array so side_index is now required ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 2/3] merge-ort: shuffle the computation and cleanup of potential collisions 2022-06-27 21:04 ` Calvin Wan @ 2022-06-30 0:05 ` Elijah Newren 0 siblings, 0 replies; 45+ messages in thread From: Elijah Newren @ 2022-06-30 0:05 UTC (permalink / raw) To: Calvin Wan; +Cc: Jonathan Tan, Elijah Newren via GitGitGadget, Git Mailing List On Mon, Jun 27, 2022 at 2:04 PM Calvin Wan <calvinwan@google.com> wrote: > > Jonathan Tan <jonathantanmy@google.com> writes: > > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > @@ -2314,7 +2335,8 @@ static char *check_for_directory_rename(struct merge_options *opt, > > > } > > > > > > new_path = handle_path_level_conflicts(opt, path, side_index, > > > - rename_info, collisions); > > > + rename_info, > > > + &collisions[side_index]); > > > > Is this a fix of a latent bug? handle_path_level_conflicts() is not > > changed in this patch. > > > > I don't think so. IIUC this is what's happening given the callstack: > > detect_and_process_renames() > - Now defines `struct strmap collisions[3];` and computes all > three collisions here > - Passes collisions into collect_renames() > collect_renames() > - Originally defined as `struct strmap collisions;` and computed > collisions in here > - Now takes collisions as an argument > - Passes collisions into check_for_directory_rename() > check_for_directory_rename() > - Collisions isn't used in this function at all except to pass into > handle_path_level_conflicts > handle_path_level_conflicts() > - Expecting pointer to singular collisions, not an array so side_index > is now required Sweet, thanks for answering for me. This is exactly right. ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH 3/3] merge-ort: fix issue with dual rename and add/add conflict 2022-06-22 4:20 [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-06-22 4:20 ` [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-06-22 4:20 ` [PATCH 2/3] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget @ 2022-06-22 4:20 ` Elijah Newren via GitGitGadget 2022-06-27 18:47 ` Jonathan Tan 2022-06-22 4:36 ` [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren 2022-06-30 6:57 ` [PATCH v2 " Elijah Newren via GitGitGadget 4 siblings, 1 reply; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-06-22 4:20 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> There is code in both merge-recursive and merge-ort for avoiding doubly transitive renames (i.e. one side renames directory A/ -> B/, and the other side renames directory B/ -> C/), because this combination would otherwise make a mess for new files added to A/ on the first side and wondering which directory they end up in -- especially if there were even more renames such as the first side renaming C/ -> D/. In such cases, it just turns "off" directory rename detection for the higher order transitive cases. The testcases added in t6423 a couple commits ago are slightly different but similar in principle. They involve a similar case of paired renaming but instead of A/ -> B/ and B/ -> C/, the second side renames a leading directory of B/ to C/. And both sides add a new file somewhere under the directory that the other side will rename. While the new files added start within different directories and thus could logically end up within different directories, it is weird for a file on one side to end up where the other one started and not move along with it. So, let's just turn off directory rename detection in this case as well. Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-ort.c | 4 ++++ t/t6423-merge-rename-directories.sh | 4 ++-- 2 files changed, 6 insertions(+), 2 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index fa6667de18c..5bcb9a4980b 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2292,9 +2292,13 @@ static char *check_for_directory_rename(struct merge_options *opt, struct strmap_entry *rename_info; struct strmap_entry *otherinfo = NULL; const char *new_dir; + int other_side = 3 - side_index; + /* Cases where there is no new path, so we return NULL */ if (strmap_empty(dir_renames)) return new_path; + if (strmap_get(&collisions[other_side], path)) + return new_path; rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) return new_path; diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index 296c04f8046..4286ae987c4 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5245,7 +5245,7 @@ test_setup_12l () { ) } -test_expect_merge_algorithm failure failure '12l (B into A): Rename into each other + add/add conflict' ' +test_expect_merge_algorithm failure success '12l (B into A): Rename into each other + add/add conflict' ' test_setup_12l BintoA && ( cd 12l_BintoA && @@ -5273,7 +5273,7 @@ test_expect_merge_algorithm failure failure '12l (B into A): Rename into each ot ) ' -test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' +test_expect_merge_algorithm failure success '12l (A into B): Rename into each other + add/add conflict' ' test_setup_12l AintoB && ( cd 12l_AintoB && -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH 3/3] merge-ort: fix issue with dual rename and add/add conflict 2022-06-22 4:20 ` [PATCH 3/3] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget @ 2022-06-27 18:47 ` Jonathan Tan 2022-06-30 0:05 ` Elijah Newren 0 siblings, 1 reply; 45+ messages in thread From: Jonathan Tan @ 2022-06-27 18:47 UTC (permalink / raw) To: Elijah Newren via GitGitGadget; +Cc: Jonathan Tan, git, Elijah Newren "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > The testcases added in t6423 a couple commits ago are slightly different > but similar in principle. They involve a similar case of paired > renaming but instead of A/ -> B/ and B/ -> C/, the second side renames > a leading directory of B/ to C/. Hmm...one side moved sub1 -> sub3 and the other moved sub2 from the root to under sub1, right? So it's more like A/ -> B/ and C/ -> A/C/. > And both sides add a new file > somewhere under the directory that the other side will rename. Rename or move, I think. > While > the new files added start within different directories and thus could > logically end up within different directories, it is weird for a file > on one side to end up where the other one started and not move along > with it. So, let's just turn off directory rename detection in this > case as well. Makes sense. > diff --git a/merge-ort.c b/merge-ort.c > index fa6667de18c..5bcb9a4980b 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -2292,9 +2292,13 @@ static char *check_for_directory_rename(struct merge_options *opt, > struct strmap_entry *rename_info; > struct strmap_entry *otherinfo = NULL; > const char *new_dir; > + int other_side = 3 - side_index; > > + /* Cases where there is no new path, so we return NULL */ What do you mean by "no new path"? > if (strmap_empty(dir_renames)) > return new_path; > + if (strmap_get(&collisions[other_side], path)) > + return new_path; So as far as I understand, collisions here, for a given side, is a map. The map's keys are all the filenames of added and renamed files (I'm assuming that's what 'A' and 'R' are) from that side after any directory moves on the other side are applied. So, for example, if we add "foo/a" on side A and rename "foo/" to "bar/" on side B, then side A's collision map would have a key "bar/a". So I'm not sure if "collision" is the right name here, but the existing code already uses it so I'll leave it be. It makes sense that this situation (of side A having "bar/a" because side B renamed "foo/" to "bar/", and at the same time, side B adds its own "bar/a") is weird, as stated in the commit message, so I don't mind disabling checking for directory rename in this case. However, in theory, I don't see how disabling this check would fix anything, since the bug seems to be about two different files on different sides being renamed to the same filename through some convoluted means. (Unless this is the only convoluted means to do it, and disabling it means that the bug wouldn't occur.) ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 3/3] merge-ort: fix issue with dual rename and add/add conflict 2022-06-27 18:47 ` Jonathan Tan @ 2022-06-30 0:05 ` Elijah Newren 2022-07-06 17:25 ` Jonathan Tan 0 siblings, 1 reply; 45+ messages in thread From: Elijah Newren @ 2022-06-30 0:05 UTC (permalink / raw) To: Jonathan Tan; +Cc: Elijah Newren via GitGitGadget, Git Mailing List On Mon, Jun 27, 2022 at 11:47 AM Jonathan Tan <jonathantanmy@google.com> wrote: > > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > > The testcases added in t6423 a couple commits ago are slightly different > > but similar in principle. They involve a similar case of paired > > renaming but instead of A/ -> B/ and B/ -> C/, the second side renames > > a leading directory of B/ to C/. > > Hmm...one side moved sub1 -> sub3 and the other moved sub2 from the root > to under sub1, right? So it's more like A/ -> B/ and C/ -> A/C/. Hmm, maybe I should have been more explicit in my mapping: A = sub2 B = sub1/sub2 leading directory of B = sub1 C = sub3 > > And both sides add a new file > > somewhere under the directory that the other side will rename. > > Rename or move, I think. I'm confused; I don't understand what this comment means. Renames tend to be created using "git mv", before or after making content changes, so to me a file being renamed or a file being moved to a different location are synonymous. What distinction are you making between renames and moves? > > While > > the new files added start within different directories and thus could > > logically end up within different directories, it is weird for a file > > on one side to end up where the other one started and not move along > > with it. So, let's just turn off directory rename detection in this > > case as well. > > Makes sense. > > > diff --git a/merge-ort.c b/merge-ort.c > > index fa6667de18c..5bcb9a4980b 100644 > > --- a/merge-ort.c > > +++ b/merge-ort.c > > @@ -2292,9 +2292,13 @@ static char *check_for_directory_rename(struct merge_options *opt, > > struct strmap_entry *rename_info; > > struct strmap_entry *otherinfo = NULL; > > const char *new_dir; > > + int other_side = 3 - side_index; > > > > + /* Cases where there is no new path, so we return NULL */ > > What do you mean by "no new path"? Hmm, perhaps I should change this to: /* Cases where we don't have or don't want a directory rename for this path, so we return NULL */ The purpose of this function is to check whether a given path would be modified by directory renaming to get a new path. So, "no new path" means we don't have an applicable directory rename or don't want to use it. > > if (strmap_empty(dir_renames)) > > return new_path; > > + if (strmap_get(&collisions[other_side], path)) > > + return new_path; > > So as far as I understand, collisions here, for a given side, is a map. > The map's keys are all the filenames of added and renamed files (I'm > assuming that's what 'A' and 'R' are) from that side after any directory > moves on the other side are applied. So, for example, if we add "foo/a" > on side A and rename "foo/" to "bar/" on side B, then side A's collision > map would have a key "bar/a". So I'm not sure if "collision" is the > right name here, but the existing code already uses it so I'll leave it > be. Let's take your example a bit further, to discuss the original kind of usecase that "collisions" was written for. In addition to adding "foo/a" on side A, we also add "foo2/a" and "foo3/a". And then in addition to renaming "foo/" to "bar/" on side B, we also rename "foo2/" to "bar/" and "foo3/" to "bar/", thus merging all of foo/, foo2/, and foo3/ into a single directory named bar/. If the files in foo/, foo2/, and foo3/ on side B were all unique, you can see how there'd be no problem merging these directories together. The problem only comes at merge time when you attempt to apply the directory renames from side B to the new files on side A. That's when you get collisions, in this case three files that would be named bar/a. Checking for such collisions was the purpose of the "collisions" metadata, so I think the name is fitting. The only problem is that we're reusing that data now for a slightly different purpose, though I think it's still "collision-y". > It makes sense that this situation (of side A having "bar/a" because > side B renamed "foo/" to "bar/", and at the same time, side B adds its > own "bar/a") is weird, as stated in the commit message, so I don't mind > disabling checking for directory rename in this case. However, in > theory, I don't see how disabling this check would fix anything, since > the bug seems to be about two different files on different sides being > renamed to the same filename through some convoluted means. (Unless this > is the only convoluted means to do it, and disabling it means that the > bug wouldn't occur.) Hmm...let me see if I can explain this a different way. The short version of the issue here is that if a directory rename wants to rename NEWFILE -> ALTERNATE_NEWFILE, but there is another directory rename on the other side of history that wants to rename ANOTHER_FILE -> NEWFILE, then we run into problems and have to turn off the NEWFILE -> ALTERNATE_NEWFILE. This check here is all about this case. To see why this is the problematic setup... The primary data structure in merge-ort is opt->priv->paths, a strmap which maps: (path involved in the merge) -> (conflict_info). (Technically, it could have a merged_info instead of a conflict_info if the file was trivially resolvable early on but since merged_info is the first entry in a conflict_info, logically it can still be thought of as a conflict_info just with less data.). Now a conflict_info stores information about the OIDs and modes for all three sides of the merge (i.e. both sides of the merge and the base). Whenever a rename is processed, we have to update this map, because the rename makes the conflict_info now apply to a different path. In the simple cases, the conflict_info from the source path needs to be merged with the conflict_info for the target path, and the source path's conflict_info needs to be marked as null (literally setting the .is_null field to 1). rename/rename and such can make this a bit more complicated. Normally, directory renames would actually be a simpler case for this merging of conflict_info structs rather than a more complicated case. There cannot be a directory rename if the directory exists on both sides, so we don't need to worry about there already being a file on the other side whose conflict_info we need to merge with the source conflict_info. So, the code just added an assertion that there wasn't one. The problem is, that _another_ directory rename for the other side of history can move a file into the way of where our directory rename wants to operate on. Let's jump into the example testcase I added to make this more concrete: # Commit O: sub1/file, sub2/other # Commit A: sub3/file, sub2/{other, new_add_add_file_1} # Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} # # In words: # A: sub1/ -> sub3/, add sub2/new_add_add_file_1 # B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 Here, the sub2/sub1/sub2/ rename on sideB will rename A's sub2/new_add_add_file to sub1/sub2/new_add_add_file, which is at the same location as B's sub1/sub2/new_add_add_file (and which A's sub1/ -> sub3/ directory rename would normally operate on). Given our opt->priv->paths data structure, if we wanted to let both directory renames take effect, then the order would matter: * If the sub1/ -> sub3/ directory rename is applied first, then: * B's sub1/sub2/new_add_add_file gets renamed to sub3/sub2/new_add_add_file * sub1/sub2/new_add_add_file is marked as .is_null=1 * A's sub2/new_add_add_file gets renamed to sub1/sub2/new_add_add_file (which was already marked as null) This set of steps seems to trigger the "error: cache entry has null sha1" fatal error I mentioned earlier. In contrast, if we take the other order: * If the sub2/ -> sub1/sub2/ rename is applied first, then: * A's sub2/new_add_add_file gets renamed to sub1/sub2/new_add_add_file * the conflict_info for the two sub1/sub2/new_add_add_file's are now merged * the sub1/ -> sub3/ directory rename is applied to move this file to sub3/sub2/new_add_add_file This second order may not sound so bad. And, in fact, you can get this behavior simply by relaxing (or commenting out) the assertion Glen reported hitting. However, that results in making the merge have a fatal error when you reverse its direction (i.e. when swapping HEAD and MERGE_HEAD), and seems somewhat confusing to me given that A's sub2/new_add_add_file was renamed twice, going against the general "avoid-mutiply-transitive-renames" rule employed elsewhere in directory rename detection. To avoid this order dependence, and the weird multiple-renaming of a single path, we just want to turn off directory renames when the source of a directory rename on one side is the target of a directory rename on the other. That's what this patch does. Does that help? Or did I make it more confusing? ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 3/3] merge-ort: fix issue with dual rename and add/add conflict 2022-06-30 0:05 ` Elijah Newren @ 2022-07-06 17:25 ` Jonathan Tan 0 siblings, 0 replies; 45+ messages in thread From: Jonathan Tan @ 2022-07-06 17:25 UTC (permalink / raw) To: Elijah Newren Cc: Jonathan Tan, Elijah Newren via GitGitGadget, Git Mailing List Sorry for getting back to this so late. I don't have any issues with the patches, but just to close the loop: Elijah Newren <newren@gmail.com> writes: > On Mon, Jun 27, 2022 at 11:47 AM Jonathan Tan <jonathantanmy@google.com> wrote: > > > > "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > > > The testcases added in t6423 a couple commits ago are slightly different > > > but similar in principle. They involve a similar case of paired > > > renaming but instead of A/ -> B/ and B/ -> C/, the second side renames > > > a leading directory of B/ to C/. > > > > Hmm...one side moved sub1 -> sub3 and the other moved sub2 from the root > > to under sub1, right? So it's more like A/ -> B/ and C/ -> A/C/. > > Hmm, maybe I should have been more explicit in my mapping: > A = sub2 > B = sub1/sub2 > leading directory of B = sub1 > C = sub3 Substituting into A/ -> B/ and "a leading directory of B/ to C/", we get: sub2 -> sub1/sub2 and sub1 -> sub3 which is indeed what is happening. I see, thanks. > > > And both sides add a new file > > > somewhere under the directory that the other side will rename. > > > > Rename or move, I think. > > I'm confused; I don't understand what this comment means. Renames > tend to be created using "git mv", before or after making content > changes, so to me a file being renamed or a file being moved to a > different location are synonymous. What distinction are you making > between renames and moves? I was thinking that a rename means that the directory still has the same parent directory, whereas a move means that the directory keeps its basename but has a different parent directory. Maybe it's just the way I think about things, but the important thing here is that a directory was moved so that its parent directory is a directory that is different and that already exists, and I think that this meaning is lost when we say "rename". But it might just be me. > Hmm, perhaps I should change this to: > > /* Cases where we don't have or don't want a directory rename for this > path, so we return NULL */ > > The purpose of this function is to check whether a given path would be > modified by directory renaming to get a new path. So, "no new path" > means we don't have an applicable directory rename or don't want to > use it. I see - the new text makes sense. > > > if (strmap_empty(dir_renames)) > > > return new_path; > > > + if (strmap_get(&collisions[other_side], path)) > > > + return new_path; > > > > So as far as I understand, collisions here, for a given side, is a map. > > The map's keys are all the filenames of added and renamed files (I'm > > assuming that's what 'A' and 'R' are) from that side after any directory > > moves on the other side are applied. So, for example, if we add "foo/a" > > on side A and rename "foo/" to "bar/" on side B, then side A's collision > > map would have a key "bar/a". So I'm not sure if "collision" is the > > right name here, but the existing code already uses it so I'll leave it > > be. > > Let's take your example a bit further, to discuss the original kind of > usecase that "collisions" was written for. In addition to adding > "foo/a" on side A, we also add "foo2/a" and "foo3/a". And then in > addition to renaming "foo/" to "bar/" on side B, we also rename > "foo2/" to "bar/" and "foo3/" to "bar/", thus merging all of foo/, > foo2/, and foo3/ into a single directory named bar/. If the files in > foo/, foo2/, and foo3/ on side B were all unique, you can see how > there'd be no problem merging these directories together. The problem > only comes at merge time when you attempt to apply the directory > renames from side B to the new files on side A. That's when you get > collisions, in this case three files that would be named bar/a. > > Checking for such collisions was the purpose of the "collisions" > metadata, so I think the name is fitting. The only problem is that > we're reusing that data now for a slightly different purpose, though I > think it's still "collision-y". That makes sense, thanks. > > It makes sense that this situation (of side A having "bar/a" because > > side B renamed "foo/" to "bar/", and at the same time, side B adds its > > own "bar/a") is weird, as stated in the commit message, so I don't mind > > disabling checking for directory rename in this case. However, in > > theory, I don't see how disabling this check would fix anything, since > > the bug seems to be about two different files on different sides being > > renamed to the same filename through some convoluted means. (Unless this > > is the only convoluted means to do it, and disabling it means that the > > bug wouldn't occur.) [snip] > Hmm...let me see if I can explain this a different way. > > The short version of the issue here is that if a directory rename > wants to rename NEWFILE -> ALTERNATE_NEWFILE, but there is another > directory rename on the other side of history that wants to rename > ANOTHER_FILE -> NEWFILE, then we run into problems and have to turn > off the NEWFILE -> ALTERNATE_NEWFILE. This check here is all about > this case. > > To see why this is the problematic setup... > > Now a conflict_info > stores information about the OIDs and modes for all three sides of the > merge (i.e. both sides of the merge and the base). Whenever a rename > is processed, we have to update this map, because the rename makes the > conflict_info now apply to a different path. In the simple cases, the > conflict_info from the source path needs to be merged with the > conflict_info for the target path, and the source path's conflict_info > needs to be marked as null (literally setting the .is_null field to > 1). rename/rename and such can make this a bit more complicated. Ah, I think I was missing this part. The intention is that processing one side in this way (and thus modifying its conflict_info) would not affect the processing of any other sides, but there is a bug that makes it not so. (And the intention is not, say, when processing all sides, to write the output in a new data structure so that the result is the same no matter the order.) So I think the answer to my question is: no, we do not want to be able to rename two different files on different sides to the same filename through any convoluted means, and if that happens, it's a bug. [snip illustration of what happens in either merge order] > To avoid this order dependence, and the weird multiple-renaming of a > single path, we just want to turn off directory renames when the > source of a directory rename on one side is the target of a directory > rename on the other. That's what this patch does. > > > Does that help? Or did I make it more confusing? I think that helped, thanks. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH 0/3] Fix dual rename into each other plus conflicting adds 2022-06-22 4:20 [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget ` (2 preceding siblings ...) 2022-06-22 4:20 ` [PATCH 3/3] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget @ 2022-06-22 4:36 ` Elijah Newren 2022-06-30 6:57 ` [PATCH v2 " Elijah Newren via GitGitGadget 4 siblings, 0 replies; 45+ messages in thread From: Elijah Newren @ 2022-06-22 4:36 UTC (permalink / raw) To: Elijah Newren via GitGitGadget; +Cc: Git Mailing List On Tue, Jun 21, 2022 at 9:21 PM Elijah Newren via GitGitGadget <gitgitgadget@gmail.com> wrote: > > This series adds some testcases based on the tensorflow repository issue > reported by Glen Choo at [1], demonstrating bugs in both the ort and > recursive strategies. It also provides a fix for the ort strategy. Also, just to be clear, this is not a regression since 2.36. It's been with the ort strategy since it was introduced, and with the recursive strategy for a number of years. ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v2 0/3] Fix dual rename into each other plus conflicting adds 2022-06-22 4:20 [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget ` (3 preceding siblings ...) 2022-06-22 4:36 ` [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren @ 2022-06-30 6:57 ` Elijah Newren via GitGitGadget 2022-06-30 6:57 ` [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget ` (3 more replies) 4 siblings, 4 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-06-30 6:57 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Elijah Newren This series adds some testcases based on the tensorflow repository issue reported by Glen Choo at [1], demonstrating bugs in both the ort and recursive strategies. It also provides a fix for the ort strategy. Changes since v1: * Fixed some wording issues in comments, and added a bit more details to one of the commit messages [1] https://lore.kernel.org/git/kl6lee006mle.fsf@chooglen-macbookpro.roam.corp.google.com/ Elijah Newren (3): t6423: add tests of dual directory rename plus add/add conflict merge-ort: shuffle the computation and cleanup of potential collisions merge-ort: fix issue with dual rename and add/add conflict merge-ort.c | 63 +++++++++++------ t/t6423-merge-rename-directories.sh | 102 ++++++++++++++++++++++++++++ 2 files changed, 145 insertions(+), 20 deletions(-) base-commit: e54793a95afeea1e10de1e5ad7eab914e7416250 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1268%2Fnewren%2Ffix-dual-rename-into-each-other-plus-conflicting-adds-v2 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1268/newren/fix-dual-rename-into-each-other-plus-conflicting-adds-v2 Pull-Request: https://github.com/gitgitgadget/git/pull/1268 Range-diff vs v1: 1: 69d62041843 ! 1: bf4c03d01d5 t6423: add tests of dual directory rename plus add/add conflict @@ t/t6423-merge-rename-directories.sh: test_expect_success '12k: Directory rename ' +# Testcase 12l, Both sides rename a directory into the other side, both add -+# a file with after directory renames are the same filename ++# a file which after directory renames are the same filename +# Commit O: sub1/file, sub2/other +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} @@ t/t6423-merge-rename-directories.sh: test_expect_success '12k: Directory rename +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 +# +# Expected: sub3/{file, newfile, sub2/other} -+# CONFLICT (add/add): sub3/sub2/new_add_add_file ++# CONFLICT (add/add): sub1/sub2/new_add_add_file + +test_setup_12l () { + test_create_repo 12l_$1 && 2: d8c13e56209 = 2: cfa38f01481 merge-ort: shuffle the computation and cleanup of potential collisions 3: bb2badccb71 ! 3: da3ae38e390 merge-ort: fix issue with dual rename and add/add conflict @@ Commit message with it. So, let's just turn off directory rename detection in this case as well. + Another way to look at this is that if the source name involved in a + directory rename on one side is the target name of a directory rename + operation for a file from the other side, then we avoid the doubly + transitive rename. (More concretely, if a directory rename on side D + wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D + already had a file named NEW_NAME, and a directory rename on side E + wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the + directory rename detection for NEW_NAME to prevent the + NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add + conflict on NEW_NAME.) + Signed-off-by: Elijah Newren <newren@gmail.com> ## merge-ort.c ## @@ merge-ort.c: static char *check_for_directory_rename(struct merge_options *opt, const char *new_dir; + int other_side = 3 - side_index; -+ /* Cases where there is no new path, so we return NULL */ ++ /* ++ * Cases where we don't have or don't want a directory rename for ++ * this path, so we return NULL. ++ */ if (strmap_empty(dir_renames)) return new_path; + if (strmap_get(&collisions[other_side], path)) -- gitgitgadget ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-30 6:57 ` [PATCH v2 " Elijah Newren via GitGitGadget @ 2022-06-30 6:57 ` Elijah Newren via GitGitGadget 2022-06-30 10:21 ` Ævar Arnfjörð Bjarmason 2022-06-30 6:57 ` [PATCH v2 2/3] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget ` (2 subsequent siblings) 3 siblings, 1 reply; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-06-30 6:57 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> This is an attempt at minimalizing a testcase reported by Glen Choo with tensorflow where merge-ort would report an assertion failure: Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410 reversing the direction of the merge provides a different error: error: cache entry has null sha1: ... fatal: unable to write .git/index so we add testcases for both. With these new testcases, the recursive strategy differs in that it returns the latter error for both merge directions. These testcases are somehow a little different than Glen's original tensorflow testcase in that these ones trigger a bug with the recursive algorithm whereas his testcase didn't. I figure that means these testcases somehow manage to be more comprehensive. Reported-by: Glen Choo <chooglen@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> --- t/t6423-merge-rename-directories.sh | 102 ++++++++++++++++++++++++++++ 1 file changed, 102 insertions(+) diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index 479db32cd62..db13bb995f3 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5199,6 +5199,108 @@ test_expect_success '12k: Directory rename with sibling causes rename-to-self' ' ) ' +# Testcase 12l, Both sides rename a directory into the other side, both add +# a file which after directory renames are the same filename +# Commit O: sub1/file, sub2/other +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} +# +# In words: +# A: sub1/ -> sub3/, add sub2/new_add_add_file_1 +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 +# +# Expected: sub3/{file, newfile, sub2/other} +# CONFLICT (add/add): sub1/sub2/new_add_add_file + +test_setup_12l () { + test_create_repo 12l_$1 && + ( + cd 12l_$1 && + + mkdir -p sub1 sub2 + echo file >sub1/file && + echo other >sub2/other && + git add sub1 sub2 && + git commit -m "O" && + + git branch O && + git branch A && + git branch B && + + git checkout A && + git mv sub1 sub3 && + echo conflicting >sub2/new_add_add_file && + git add sub2 && + test_tick && + git add -u && + git commit -m "A" && + + git checkout B && + echo dissimilar >sub2/new_add_add_file && + echo brand >sub1/newfile && + git add sub1 sub2 && + git mv sub2 sub1 && + test_tick && + git commit -m "B" + ) +} + +test_expect_merge_algorithm failure failure '12l (B into A): Rename into each other + add/add conflict' ' + test_setup_12l BintoA && + ( + cd 12l_BintoA && + + git checkout -q A^0 && + + test_must_fail git -c merge.directoryRenames=true merge -s recursive B^0 && + + git ls-files -s >out && + test_line_count = 5 out && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :2:sub1/sub2/new_add_add_file \ + :3:sub1/sub2/new_add_add_file && + git rev-parse >expect \ + O:sub1/file B:sub1/newfile O:sub2/other \ + A:sub2/new_add_add_file \ + B:sub1/sub2/new_add_add_file && + test_cmp expect actual && + + git ls-files -o >actual && + test_write_lines actual expect out >expect && + test_cmp expect actual + ) +' + +test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' + test_setup_12l AintoB && + ( + cd 12l_AintoB && + + git checkout -q B^0 && + + test_must_fail git -c merge.directoryRenames=true merge -s recursive A^0 && + + git ls-files -s >out && + test_line_count = 5 out && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :2:sub1/sub2/new_add_add_file \ + :3:sub1/sub2/new_add_add_file && + git rev-parse >expect \ + O:sub1/file B:sub1/newfile O:sub2/other \ + B:sub1/sub2/new_add_add_file \ + A:sub2/new_add_add_file && + test_cmp expect actual && + + git ls-files -o >actual && + test_write_lines actual expect out >expect && + test_cmp expect actual + ) +' + ########################################################################### # SECTION 13: Checking informational and conflict messages # -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-30 6:57 ` [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget @ 2022-06-30 10:21 ` Ævar Arnfjörð Bjarmason 2022-07-01 2:57 ` Elijah Newren 0 siblings, 1 reply; 45+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-06-30 10:21 UTC (permalink / raw) To: Elijah Newren via GitGitGadget Cc: git, Jonathan Tan, Calvin Wan, Elijah Newren On Thu, Jun 30 2022, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > +test_setup_12l () { > + test_create_repo 12l_$1 && Can & should just be "git init", see f0d4d398e28 (test-lib: split up and deprecate test_create_repo(), 2021-05-10). > + ( > + cd 12l_$1 && > + > + mkdir -p sub1 sub2 The "-p" here isn't needed, you're not creating recursive directories. > + git ls-files -s >out && > + test_line_count = 5 out && Can't this just use test_stdout_line_count? Except... > + > + git rev-parse >actual \ > + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ > + :2:sub1/sub2/new_add_add_file \ > + :3:sub1/sub2/new_add_add_file && > + git rev-parse >expect \ > + O:sub1/file B:sub1/newfile O:sub2/other \ > + A:sub2/new_add_add_file \ > + B:sub1/sub2/new_add_add_file && > + test_cmp expect actual && > + > + git ls-files -o >actual && > + test_write_lines actual expect out >expect && > + test_cmp expect actual This test seems a bit odd, here we're just checking that we've created scratch files for the internal use of our test, why is it important that we have an "out" file, when the only reason we have it is because we needed it for checking the "ls-files" line count above? > + ) > +' > + > +test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' > + test_setup_12l AintoB && > + ( > + cd 12l_AintoB && > + > + git checkout -q B^0 && > + > + test_must_fail git -c merge.directoryRenames=true merge -s recursive A^0 && > + > + git ls-files -s >out && > + test_line_count = 5 out && ditto. > + git rev-parse >actual \ > + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ > + :2:sub1/sub2/new_add_add_file \ > + :3:sub1/sub2/new_add_add_file && > + git rev-parse >expect \ > + O:sub1/file B:sub1/newfile O:sub2/other \ > + B:sub1/sub2/new_add_add_file \ > + A:sub2/new_add_add_file && > + test_cmp expect actual && > + > + git ls-files -o >actual && > + test_write_lines actual expect out >expect && ditto ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-06-30 10:21 ` Ævar Arnfjörð Bjarmason @ 2022-07-01 2:57 ` Elijah Newren 2022-07-01 9:29 ` Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 45+ messages in thread From: Elijah Newren @ 2022-07-01 2:57 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jonathan Tan, Calvin Wan On Thu, Jun 30, 2022 at 3:26 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > On Thu, Jun 30 2022, Elijah Newren via GitGitGadget wrote: > > > From: Elijah Newren <newren@gmail.com> > > > +test_setup_12l () { > > + test_create_repo 12l_$1 && > > Can & should just be "git init", see f0d4d398e28 (test-lib: split up and > deprecate test_create_repo(), 2021-05-10). I've switched to "git init" and even encouraged others to do the same in other test scripts, but since literally every other test in this file uses test_create_repo, I think they should all be converted at once and just be consistent here. But, so we can stop having this conversation, after this series lands, I'll send one in to update the various merge testfiles that make heavy use of test_create_repo and convert them over. > > + ( > > + cd 12l_$1 && > > + > > + mkdir -p sub1 sub2 > > The "-p" here isn't needed, you're not creating recursive directories. Indeed; will fix. > > + git ls-files -s >out && > > + test_line_count = 5 out && > > Can't this just use test_stdout_line_count? Except... Ooh, new function from late last year that I was unaware of. Yeah, I'm happy to start using it. > > + > > + git rev-parse >actual \ > > + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ > > + :2:sub1/sub2/new_add_add_file \ > > + :3:sub1/sub2/new_add_add_file && > > + git rev-parse >expect \ > > + O:sub1/file B:sub1/newfile O:sub2/other \ > > + A:sub2/new_add_add_file \ > > + B:sub1/sub2/new_add_add_file && > > + test_cmp expect actual && > > + > > + git ls-files -o >actual && > > + test_write_lines actual expect out >expect && > > + test_cmp expect actual > > This test seems a bit odd, here we're just checking that we've created > scratch files for the internal use of our test, why is it important that > we have an "out" file, when the only reason we have it is because we > needed it for checking the "ls-files" line count above? Nah, you've misunderstood the purpose of the check. It isn't "make sure that these untracked files are present among whatever else might also be present", it's "make sure the merge step did not introduce *any* untracked files" (something the recursive backend periodically did, and they weren't cruft untracked files but stored actual merge results). There wasn't a nice easy check for that, the closest was to translate the requirement to "make sure the only untracked files are the ones explicitly added by this test script", which is the check you see here. I don't actually care about "actual", "expect", or "out", I just care that there aren't any _other_ untracked files. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict 2022-07-01 2:57 ` Elijah Newren @ 2022-07-01 9:29 ` Ævar Arnfjörð Bjarmason 0 siblings, 0 replies; 45+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-07-01 9:29 UTC (permalink / raw) To: Elijah Newren Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jonathan Tan, Calvin Wan On Thu, Jun 30 2022, Elijah Newren wrote: > On Thu, Jun 30, 2022 at 3:26 AM Ævar Arnfjörð Bjarmason > <avarab@gmail.com> wrote: >> >> On Thu, Jun 30 2022, Elijah Newren via GitGitGadget wrote: >> >> > From: Elijah Newren <newren@gmail.com> >> >> > +test_setup_12l () { >> > + test_create_repo 12l_$1 && >> >> Can & should just be "git init", see f0d4d398e28 (test-lib: split up and >> deprecate test_create_repo(), 2021-05-10). > > I've switched to "git init" and even encouraged others to do the same > in other test scripts, but since literally every other test in this > file uses test_create_repo, I think they should all be converted at > once and just be consistent here. But, so we can stop having this > conversation, after this series lands, I'll send one in to update the > various merge testfiles that make heavy use of test_create_repo and > convert them over. Sorry, genuinely I didn't mean to mention it again, just saw it scroll past & wondered if it was intentional. I'm fine with keeping it... >> > + ( >> > + cd 12l_$1 && >> > + >> > + mkdir -p sub1 sub2 >> >> The "-p" here isn't needed, you're not creating recursive directories. > > Indeed; will fix. Thanks! >> > + git ls-files -s >out && >> > + test_line_count = 5 out && >> >> Can't this just use test_stdout_line_count? Except... > > Ooh, new function from late last year that I was unaware of. Yeah, > I'm happy to start using it. > >> > + >> > + git rev-parse >actual \ >> > + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ >> > + :2:sub1/sub2/new_add_add_file \ >> > + :3:sub1/sub2/new_add_add_file && >> > + git rev-parse >expect \ >> > + O:sub1/file B:sub1/newfile O:sub2/other \ >> > + A:sub2/new_add_add_file \ >> > + B:sub1/sub2/new_add_add_file && >> > + test_cmp expect actual && >> > + >> > + git ls-files -o >actual && >> > + test_write_lines actual expect out >expect && >> > + test_cmp expect actual >> >> This test seems a bit odd, here we're just checking that we've created >> scratch files for the internal use of our test, why is it important that >> we have an "out" file, when the only reason we have it is because we >> needed it for checking the "ls-files" line count above? > > Nah, you've misunderstood the purpose of the check. It isn't "make > sure that these untracked files are present among whatever else might > also be present", it's "make sure the merge step did not introduce > *any* untracked files" (something the recursive backend periodically > did, and they weren't cruft untracked files but stored actual merge > results). Ah, thanks! > There wasn't a nice easy check for that, the closest was to > translate the requirement to "make sure the only untracked files are > the ones explicitly added by this test script", which is the check you > see here. I don't actually care about "actual", "expect", or "out", I > just care that there aren't any _other_ untracked files. I'm fine with keeping this as-is, but FWIW perhaps this pattern is more explicit about the intent: test_expect_success 'do merge stuff' ' ... && rm -f expect actual && git ls-files -o ':!out' >out && test_must_be_empty out ' Or piping it to ".git/out", to avoid the path exclude, but like this is also fine:) Thanks. ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v2 2/3] merge-ort: shuffle the computation and cleanup of potential collisions 2022-06-30 6:57 ` [PATCH v2 " Elijah Newren via GitGitGadget 2022-06-30 6:57 ` [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget @ 2022-06-30 6:57 ` Elijah Newren via GitGitGadget 2022-06-30 10:28 ` Ævar Arnfjörð Bjarmason 2022-06-30 6:57 ` [PATCH v2 3/3] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 3 siblings, 1 reply; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-06-30 6:57 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> Run compute_collisions() for renames on both sides of history before any calls to collect_renames(), and do not free the computed collisions until after both calls to collect_renames(). This is just a code reorganization at this point that doesn't make sense on its own, but will permit us to use the computed collision info from both sides within each call to collect_renames() in a subsequent commit. Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-ort.c | 56 ++++++++++++++++++++++++++++++++++------------------- 1 file changed, 36 insertions(+), 20 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 8545354dafd..fa6667de18c 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2259,6 +2259,27 @@ static void compute_collisions(struct strmap *collisions, } } +static void free_collisions(struct strmap *collisions) +{ + struct hashmap_iter iter; + struct strmap_entry *entry; + + /* Free each value in the collisions map */ + strmap_for_each_entry(collisions, &iter, entry) { + struct collision_info *info = entry->value; + string_list_clear(&info->source_files, 0); + } + /* + * In compute_collisions(), we set collisions.strdup_strings to 0 + * so that we wouldn't have to make another copy of the new_path + * allocated by apply_dir_rename(). But now that we've used them + * and have no other references to these strings, it is time to + * deallocate them. + */ + free_strmap_strings(collisions); + strmap_clear(collisions, 1); +} + static char *check_for_directory_rename(struct merge_options *opt, const char *path, unsigned side_index, @@ -2314,7 +2335,8 @@ static char *check_for_directory_rename(struct merge_options *opt, } new_path = handle_path_level_conflicts(opt, path, side_index, - rename_info, collisions); + rename_info, + &collisions[side_index]); *clean_merge &= (new_path != NULL); return new_path; @@ -3023,18 +3045,15 @@ static int detect_regular_renames(struct merge_options *opt, static int collect_renames(struct merge_options *opt, struct diff_queue_struct *result, unsigned side_index, + struct strmap *collisions, struct strmap *dir_renames_for_side, struct strmap *rename_exclusions) { int i, clean = 1; - struct strmap collisions; struct diff_queue_struct *side_pairs; - struct hashmap_iter iter; - struct strmap_entry *entry; struct rename_info *renames = &opt->priv->renames; side_pairs = &renames->pairs[side_index]; - compute_collisions(&collisions, dir_renames_for_side, side_pairs); for (i = 0; i < side_pairs->nr; ++i) { struct diff_filepair *p = side_pairs->queue[i]; @@ -3050,7 +3069,7 @@ static int collect_renames(struct merge_options *opt, side_index, dir_renames_for_side, rename_exclusions, - &collisions, + collisions, &clean); possibly_cache_new_pair(renames, p, side_index, new_path); @@ -3076,20 +3095,6 @@ static int collect_renames(struct merge_options *opt, result->queue[result->nr++] = p; } - /* Free each value in the collisions map */ - strmap_for_each_entry(&collisions, &iter, entry) { - struct collision_info *info = entry->value; - string_list_clear(&info->source_files, 0); - } - /* - * In compute_collisions(), we set collisions.strdup_strings to 0 - * so that we wouldn't have to make another copy of the new_path - * allocated by apply_dir_rename(). But now that we've used them - * and have no other references to these strings, it is time to - * deallocate them. - */ - free_strmap_strings(&collisions); - strmap_clear(&collisions, 1); return clean; } @@ -3100,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, { struct diff_queue_struct combined = { 0 }; struct rename_info *renames = &opt->priv->renames; + struct strmap collisions[3]; int need_dir_renames, s, i, clean = 1; unsigned detection_run = 0; @@ -3149,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, ALLOC_GROW(combined.queue, renames->pairs[1].nr + renames->pairs[2].nr, combined.alloc); + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { + int other_side = 3 - i; + compute_collisions(&collisions[i], + &renames->dir_renames[other_side], + &renames->pairs[i]); + } clean &= collect_renames(opt, &combined, MERGE_SIDE1, + collisions, &renames->dir_renames[2], &renames->dir_renames[1]); clean &= collect_renames(opt, &combined, MERGE_SIDE2, + collisions, &renames->dir_renames[1], &renames->dir_renames[2]); + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) + free_collisions(&collisions[i]); STABLE_QSORT(combined.queue, combined.nr, compare_pairs); trace2_region_leave("merge", "directory renames", opt->repo); -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v2 2/3] merge-ort: shuffle the computation and cleanup of potential collisions 2022-06-30 6:57 ` [PATCH v2 2/3] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget @ 2022-06-30 10:28 ` Ævar Arnfjörð Bjarmason 2022-07-01 3:02 ` Elijah Newren 0 siblings, 1 reply; 45+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-06-30 10:28 UTC (permalink / raw) To: Elijah Newren via GitGitGadget Cc: git, Jonathan Tan, Calvin Wan, Elijah Newren On Thu, Jun 30 2022, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > > Run compute_collisions() for renames on both sides of history before > any calls to collect_renames(), and do not free the computed collisions > until after both calls to collect_renames(). This is just a code > reorganization at this point that doesn't make sense on its own, but > will permit us to use the computed collision info from both sides > within each call to collect_renames() in a subsequent commit. I think this would be easier to follow if split in two, since... > +static void free_collisions(struct strmap *collisions) > +{ > + struct hashmap_iter iter; > + struct strmap_entry *entry; > + > + /* Free each value in the collisions map */ > + strmap_for_each_entry(collisions, &iter, entry) { > + struct collision_info *info = entry->value; > + string_list_clear(&info->source_files, 0); > + } > + /* > + * In compute_collisions(), we set collisions.strdup_strings to 0 > + * so that we wouldn't have to make another copy of the new_path > + * allocated by apply_dir_rename(). But now that we've used them > + * and have no other references to these strings, it is time to > + * deallocate them. > + */ > + free_strmap_strings(collisions); > + strmap_clear(collisions, 1); > +} ...a large part of it... > > - /* Free each value in the collisions map */ > - strmap_for_each_entry(&collisions, &iter, entry) { > - struct collision_info *info = entry->value; > - string_list_clear(&info->source_files, 0); > - } > - /* > - * In compute_collisions(), we set collisions.strdup_strings to 0 > - * so that we wouldn't have to make another copy of the new_path > - * allocated by apply_dir_rename(). But now that we've used them > - * and have no other references to these strings, it is time to > - * deallocate them. > - */ > - free_strmap_strings(&collisions); > - strmap_clear(&collisions, 1); ..is moving existing code into a utility function... > return clean; > } > > @@ -3100,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, > { > struct diff_queue_struct combined = { 0 }; > struct rename_info *renames = &opt->priv->renames; > + struct strmap collisions[3]; > int need_dir_renames, s, i, clean = 1; > unsigned detection_run = 0; > > @@ -3149,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, > ALLOC_GROW(combined.queue, > renames->pairs[1].nr + renames->pairs[2].nr, > combined.alloc); > + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { > + int other_side = 3 - i; > + compute_collisions(&collisions[i], > + &renames->dir_renames[other_side], > + &renames->pairs[i]); > + } > clean &= collect_renames(opt, &combined, MERGE_SIDE1, > + collisions, > &renames->dir_renames[2], > &renames->dir_renames[1]); > clean &= collect_renames(opt, &combined, MERGE_SIDE2, > + collisions, > &renames->dir_renames[1], > &renames->dir_renames[2]); > + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) > + free_collisions(&collisions[i]); > STABLE_QSORT(combined.queue, combined.nr, compare_pairs); > trace2_region_leave("merge", "directory renames", opt->repo); ...which when looking at it makes the functional change harder to follow than it otherwise would be. ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v2 2/3] merge-ort: shuffle the computation and cleanup of potential collisions 2022-06-30 10:28 ` Ævar Arnfjörð Bjarmason @ 2022-07-01 3:02 ` Elijah Newren 0 siblings, 0 replies; 45+ messages in thread From: Elijah Newren @ 2022-07-01 3:02 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jonathan Tan, Calvin Wan On Thu, Jun 30, 2022 at 3:29 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > On Thu, Jun 30 2022, Elijah Newren via GitGitGadget wrote: > > > From: Elijah Newren <newren@gmail.com> > > > > Run compute_collisions() for renames on both sides of history before > > any calls to collect_renames(), and do not free the computed collisions > > until after both calls to collect_renames(). This is just a code > > reorganization at this point that doesn't make sense on its own, but > > will permit us to use the computed collision info from both sides > > within each call to collect_renames() in a subsequent commit. > > I think this would be easier to follow if split in two, since... > > > +static void free_collisions(struct strmap *collisions) > > +{ > > + struct hashmap_iter iter; > > + struct strmap_entry *entry; > > + > > + /* Free each value in the collisions map */ > > + strmap_for_each_entry(collisions, &iter, entry) { > > + struct collision_info *info = entry->value; > > + string_list_clear(&info->source_files, 0); > > + } > > + /* > > + * In compute_collisions(), we set collisions.strdup_strings to 0 > > + * so that we wouldn't have to make another copy of the new_path > > + * allocated by apply_dir_rename(). But now that we've used them > > + * and have no other references to these strings, it is time to > > + * deallocate them. > > + */ > > + free_strmap_strings(collisions); > > + strmap_clear(collisions, 1); > > +} > > ...a large part of it... > > > > - /* Free each value in the collisions map */ > > - strmap_for_each_entry(&collisions, &iter, entry) { > > - struct collision_info *info = entry->value; > > - string_list_clear(&info->source_files, 0); > > - } > > - /* > > - * In compute_collisions(), we set collisions.strdup_strings to 0 > > - * so that we wouldn't have to make another copy of the new_path > > - * allocated by apply_dir_rename(). But now that we've used them > > - * and have no other references to these strings, it is time to > > - * deallocate them. > > - */ > > - free_strmap_strings(&collisions); > > - strmap_clear(&collisions, 1); > > ..is moving existing code into a utility function... > > > return clean; > > } > > > > @@ -3100,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, > > { > > struct diff_queue_struct combined = { 0 }; > > struct rename_info *renames = &opt->priv->renames; > > + struct strmap collisions[3]; > > int need_dir_renames, s, i, clean = 1; > > unsigned detection_run = 0; > > > > @@ -3149,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, > > ALLOC_GROW(combined.queue, > > renames->pairs[1].nr + renames->pairs[2].nr, > > combined.alloc); > > + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { > > + int other_side = 3 - i; > > + compute_collisions(&collisions[i], > > + &renames->dir_renames[other_side], > > + &renames->pairs[i]); > > + } > > clean &= collect_renames(opt, &combined, MERGE_SIDE1, > > + collisions, > > &renames->dir_renames[2], > > &renames->dir_renames[1]); > > clean &= collect_renames(opt, &combined, MERGE_SIDE2, > > + collisions, > > &renames->dir_renames[1], > > &renames->dir_renames[2]); > > + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) > > + free_collisions(&collisions[i]); > > STABLE_QSORT(combined.queue, combined.nr, compare_pairs); > > trace2_region_leave("merge", "directory renames", opt->repo); > > ...which when looking at it makes the functional change harder to follow > than it otherwise would be. Good point; will split. ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v2 3/3] merge-ort: fix issue with dual rename and add/add conflict 2022-06-30 6:57 ` [PATCH v2 " Elijah Newren via GitGitGadget 2022-06-30 6:57 ` [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-06-30 6:57 ` [PATCH v2 2/3] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget @ 2022-06-30 6:57 ` Elijah Newren via GitGitGadget 2022-06-30 10:31 ` Ævar Arnfjörð Bjarmason 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 3 siblings, 1 reply; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-06-30 6:57 UTC (permalink / raw) To: git; +Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> There is code in both merge-recursive and merge-ort for avoiding doubly transitive renames (i.e. one side renames directory A/ -> B/, and the other side renames directory B/ -> C/), because this combination would otherwise make a mess for new files added to A/ on the first side and wondering which directory they end up in -- especially if there were even more renames such as the first side renaming C/ -> D/. In such cases, it just turns "off" directory rename detection for the higher order transitive cases. The testcases added in t6423 a couple commits ago are slightly different but similar in principle. They involve a similar case of paired renaming but instead of A/ -> B/ and B/ -> C/, the second side renames a leading directory of B/ to C/. And both sides add a new file somewhere under the directory that the other side will rename. While the new files added start within different directories and thus could logically end up within different directories, it is weird for a file on one side to end up where the other one started and not move along with it. So, let's just turn off directory rename detection in this case as well. Another way to look at this is that if the source name involved in a directory rename on one side is the target name of a directory rename operation for a file from the other side, then we avoid the doubly transitive rename. (More concretely, if a directory rename on side D wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D already had a file named NEW_NAME, and a directory rename on side E wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the directory rename detection for NEW_NAME to prevent the NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add conflict on NEW_NAME.) Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-ort.c | 7 +++++++ t/t6423-merge-rename-directories.sh | 4 ++-- 2 files changed, 9 insertions(+), 2 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index fa6667de18c..17db4c30e5b 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2292,9 +2292,16 @@ static char *check_for_directory_rename(struct merge_options *opt, struct strmap_entry *rename_info; struct strmap_entry *otherinfo = NULL; const char *new_dir; + int other_side = 3 - side_index; + /* + * Cases where we don't have or don't want a directory rename for + * this path, so we return NULL. + */ if (strmap_empty(dir_renames)) return new_path; + if (strmap_get(&collisions[other_side], path)) + return new_path; rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) return new_path; diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index db13bb995f3..1300a01206a 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5245,7 +5245,7 @@ test_setup_12l () { ) } -test_expect_merge_algorithm failure failure '12l (B into A): Rename into each other + add/add conflict' ' +test_expect_merge_algorithm failure success '12l (B into A): Rename into each other + add/add conflict' ' test_setup_12l BintoA && ( cd 12l_BintoA && @@ -5273,7 +5273,7 @@ test_expect_merge_algorithm failure failure '12l (B into A): Rename into each ot ) ' -test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' +test_expect_merge_algorithm failure success '12l (A into B): Rename into each other + add/add conflict' ' test_setup_12l AintoB && ( cd 12l_AintoB && -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v2 3/3] merge-ort: fix issue with dual rename and add/add conflict 2022-06-30 6:57 ` [PATCH v2 3/3] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget @ 2022-06-30 10:31 ` Ævar Arnfjörð Bjarmason 2022-07-01 3:03 ` Elijah Newren 0 siblings, 1 reply; 45+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-06-30 10:31 UTC (permalink / raw) To: Elijah Newren via GitGitGadget Cc: git, Jonathan Tan, Calvin Wan, Elijah Newren On Thu, Jun 30 2022, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > > There is code in both merge-recursive and merge-ort for avoiding doubly > transitive renames (i.e. one side renames directory A/ -> B/, and the > other side renames directory B/ -> C/), because this combination would > otherwise make a mess for new files added to A/ on the first side and > wondering which directory they end up in -- especially if there were > even more renames such as the first side renaming C/ -> D/. In such > cases, it just turns "off" directory rename detection for the higher > order transitive cases. > > The testcases added in t6423 a couple commits ago are slightly different > but similar in principle. They involve a similar case of paired > renaming but instead of A/ -> B/ and B/ -> C/, the second side renames > a leading directory of B/ to C/. And both sides add a new file > somewhere under the directory that the other side will rename. While > the new files added start within different directories and thus could > logically end up within different directories, it is weird for a file > on one side to end up where the other one started and not move along > with it. So, let's just turn off directory rename detection in this > case as well. > > Another way to look at this is that if the source name involved in a > directory rename on one side is the target name of a directory rename > operation for a file from the other side, then we avoid the doubly > transitive rename. (More concretely, if a directory rename on side D > wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D > already had a file named NEW_NAME, and a directory rename on side E > wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the > directory rename detection for NEW_NAME to prevent the > NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add > conflict on NEW_NAME.) > > Signed-off-by: Elijah Newren <newren@gmail.com> > --- > merge-ort.c | 7 +++++++ > t/t6423-merge-rename-directories.sh | 4 ++-- > 2 files changed, 9 insertions(+), 2 deletions(-) > > diff --git a/merge-ort.c b/merge-ort.c > index fa6667de18c..17db4c30e5b 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -2292,9 +2292,16 @@ static char *check_for_directory_rename(struct merge_options *opt, > struct strmap_entry *rename_info; > struct strmap_entry *otherinfo = NULL; > const char *new_dir; > + int other_side = 3 - side_index; > > + /* > + * Cases where we don't have or don't want a directory rename for > + * this path, so we return NULL. > + */ > if (strmap_empty(dir_renames)) > return new_path; > + if (strmap_get(&collisions[other_side], path)) > + return new_path; I realize from looking at merge-recursive.c that this is carrying forward some legacy debt, but I find this code and the need for a comment more complex than it needs to. On top of master this will work just as well: diff --git a/merge-ort.c b/merge-ort.c index b5015b9afd4..f5a02b1ff6f 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2268,16 +2268,16 @@ static char *check_for_directory_rename(struct merge_options *opt, struct strmap *collisions, int *clean_merge) { - char *new_path = NULL; + char *new_path; struct strmap_entry *rename_info; struct strmap_entry *otherinfo = NULL; const char *new_dir; if (strmap_empty(dir_renames)) - return new_path; + return NULL; rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) - return new_path; + return NULL; /* old_dir = rename_info->key; */ new_dir = rename_info->value; I.e. we're really just making the reader squint to see that we're actually returning NULL here, it's only later that we have an actual "new path". Wouldn't sticking that earlier in this series be an improvement? The reason you need to explain "so we return NULL" is because we're carrying forward this oddity, but if we just don't initialize it and return NULL instead... If you want to keep this pretty much as-is I think you should add a \n before the (not seen in your context) "old_dir" comment seen in the context here above, to make it visually clear that your new comment is referring to these chains of returns. That could also be made clearer with (again, on top of master, and could be combined with the above): diff --git a/merge-ort.c b/merge-ort.c index b5015b9afd4..a418f81a3eb 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2278,8 +2278,6 @@ static char *check_for_directory_rename(struct merge_options *opt, rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) return new_path; - /* old_dir = rename_info->key; */ - new_dir = rename_info->value; /* * This next part is a little weird. We do not want to do an @@ -2305,6 +2303,7 @@ static char *check_for_directory_rename(struct merge_options *opt, * As it turns out, this also prevents N-way transient rename * confusion; See testcases 9c and 9d of t6043. */ + new_dir = rename_info->value; /* old_dir = rename_info->key; */ otherinfo = strmap_get_entry(dir_rename_exclusions, new_dir); if (otherinfo) { path_msg(opt, rename_info->key, 1, ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: [PATCH v2 3/3] merge-ort: fix issue with dual rename and add/add conflict 2022-06-30 10:31 ` Ævar Arnfjörð Bjarmason @ 2022-07-01 3:03 ` Elijah Newren 0 siblings, 0 replies; 45+ messages in thread From: Elijah Newren @ 2022-07-01 3:03 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jonathan Tan, Calvin Wan On Thu, Jun 30, 2022 at 3:41 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > On Thu, Jun 30 2022, Elijah Newren via GitGitGadget wrote: > > > From: Elijah Newren <newren@gmail.com> > > > > There is code in both merge-recursive and merge-ort for avoiding doubly > > transitive renames (i.e. one side renames directory A/ -> B/, and the > > other side renames directory B/ -> C/), because this combination would > > otherwise make a mess for new files added to A/ on the first side and > > wondering which directory they end up in -- especially if there were > > even more renames such as the first side renaming C/ -> D/. In such > > cases, it just turns "off" directory rename detection for the higher > > order transitive cases. > > > > The testcases added in t6423 a couple commits ago are slightly different > > but similar in principle. They involve a similar case of paired > > renaming but instead of A/ -> B/ and B/ -> C/, the second side renames > > a leading directory of B/ to C/. And both sides add a new file > > somewhere under the directory that the other side will rename. While > > the new files added start within different directories and thus could > > logically end up within different directories, it is weird for a file > > on one side to end up where the other one started and not move along > > with it. So, let's just turn off directory rename detection in this > > case as well. > > > > Another way to look at this is that if the source name involved in a > > directory rename on one side is the target name of a directory rename > > operation for a file from the other side, then we avoid the doubly > > transitive rename. (More concretely, if a directory rename on side D > > wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D > > already had a file named NEW_NAME, and a directory rename on side E > > wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the > > directory rename detection for NEW_NAME to prevent the > > NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add > > conflict on NEW_NAME.) > > > > Signed-off-by: Elijah Newren <newren@gmail.com> > > --- > > merge-ort.c | 7 +++++++ > > t/t6423-merge-rename-directories.sh | 4 ++-- > > 2 files changed, 9 insertions(+), 2 deletions(-) > > > > diff --git a/merge-ort.c b/merge-ort.c > > index fa6667de18c..17db4c30e5b 100644 > > --- a/merge-ort.c > > +++ b/merge-ort.c > > @@ -2292,9 +2292,16 @@ static char *check_for_directory_rename(struct merge_options *opt, > > struct strmap_entry *rename_info; > > struct strmap_entry *otherinfo = NULL; > > const char *new_dir; > > + int other_side = 3 - side_index; > > > > + /* > > + * Cases where we don't have or don't want a directory rename for > > + * this path, so we return NULL. > > + */ > > if (strmap_empty(dir_renames)) > > return new_path; > > + if (strmap_get(&collisions[other_side], path)) > > + return new_path; > > I realize from looking at merge-recursive.c that this is carrying > forward some legacy debt ...which was introduced by me as well... > , but I find this code and the need for a > comment more complex than it needs to. On top of master this will work > just as well: > > diff --git a/merge-ort.c b/merge-ort.c > index b5015b9afd4..f5a02b1ff6f 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -2268,16 +2268,16 @@ static char *check_for_directory_rename(struct merge_options *opt, > struct strmap *collisions, > int *clean_merge) > { > - char *new_path = NULL; > + char *new_path; > struct strmap_entry *rename_info; > struct strmap_entry *otherinfo = NULL; > const char *new_dir; > > if (strmap_empty(dir_renames)) > - return new_path; > + return NULL; > rename_info = check_dir_renamed(path, dir_renames); > if (!rename_info) > - return new_path; > + return NULL; > /* old_dir = rename_info->key; */ > new_dir = rename_info->value; You know, while making this patch series I was asking myself, "Why didn't I just explicitly return NULL here?" I don't know why I did it this way, but it now seems slightly odd to me too. I decided to not fix it up and just provide a more targeted fix, but since it tripped you up, I'm happy to add this as a preparatory cleanup. > I.e. we're really just making the reader squint to see that we're > actually returning NULL here, it's only later that we have an actual > "new path". > > Wouldn't sticking that earlier in this series be an improvement? The > reason you need to explain "so we return NULL" is because we're carrying > forward this oddity, but if we just don't initialize it and return NULL > instead... I still think the comment makes sense to add, other than the "so we return NULL" bit, even after this change. But yeah, we can strike the "so we return NULL" part of it after this cleanup. > If you want to keep this pretty much as-is I think you should add a \n > before the (not seen in your context) "old_dir" comment seen in the > context here above, to make it visually clear that your new comment is > referring to these chains of returns. That could also be made clearer > with (again, on top of master, and could be combined with the above): > > diff --git a/merge-ort.c b/merge-ort.c > index b5015b9afd4..a418f81a3eb 100644 > --- a/merge-ort.c > +++ b/merge-ort.c > @@ -2278,8 +2278,6 @@ static char *check_for_directory_rename(struct merge_options *opt, > rename_info = check_dir_renamed(path, dir_renames); > if (!rename_info) > return new_path; > - /* old_dir = rename_info->key; */ > - new_dir = rename_info->value; > > /* > * This next part is a little weird. We do not want to do an > @@ -2305,6 +2303,7 @@ static char *check_for_directory_rename(struct merge_options *opt, > * As it turns out, this also prevents N-way transient rename > * confusion; See testcases 9c and 9d of t6043. > */ > + new_dir = rename_info->value; /* old_dir = rename_info->key; */ > otherinfo = strmap_get_entry(dir_rename_exclusions, new_dir); > if (otherinfo) { > path_msg(opt, rename_info->key, 1, Sure, seems fine. ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds 2022-06-30 6:57 ` [PATCH v2 " Elijah Newren via GitGitGadget ` (2 preceding siblings ...) 2022-06-30 6:57 ` [PATCH v2 3/3] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget @ 2022-07-01 5:23 ` Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 1/5] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget ` (5 more replies) 3 siblings, 6 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-01 5:23 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren This series adds some testcases based on the tensorflow repository issue reported by Glen Choo at [1], demonstrating bugs in both the ort and recursive strategies. It also provides a fix for the ort strategy. Changes since v2: * Added a couple preparatory cleanup patches * Added a comment about why sub1/newfile is important to the new testcases * A couple other minor code cleanups Changes since v1: * Fixed some wording issues in comments, and added a bit more details to one of the commit messages [1] https://lore.kernel.org/git/kl6lee006mle.fsf@chooglen-macbookpro.roam.corp.google.com/ Elijah Newren (5): t6423: add tests of dual directory rename plus add/add conflict merge-ort: small cleanups of check_for_directory_rename merge-ort: make a separate function for freeing struct collisions merge-ort: shuffle the computation and cleanup of potential collisions merge-ort: fix issue with dual rename and add/add conflict merge-ort.c | 74 +++++++++++++------- t/t6423-merge-rename-directories.sh | 105 ++++++++++++++++++++++++++++ 2 files changed, 153 insertions(+), 26 deletions(-) base-commit: e54793a95afeea1e10de1e5ad7eab914e7416250 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1268%2Fnewren%2Ffix-dual-rename-into-each-other-plus-conflicting-adds-v3 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1268/newren/fix-dual-rename-into-each-other-plus-conflicting-adds-v3 Pull-Request: https://github.com/gitgitgadget/git/pull/1268 Range-diff vs v2: 1: bf4c03d01d5 ! 1: a16a1c4d947 t6423: add tests of dual directory rename plus add/add conflict @@ t/t6423-merge-rename-directories.sh: test_expect_success '12k: Directory rename +# +# Expected: sub3/{file, newfile, sub2/other} +# CONFLICT (add/add): sub1/sub2/new_add_add_file ++# ++# Note that sub1/newfile is not extraneous. Directory renames are only ++# detected if they are needed, and they are only needed if the old directory ++# had a new file added on the opposite side of history. So sub1/newfile ++# is needed for there to be a sub1/ -> sub3/ rename. + +test_setup_12l () { + test_create_repo 12l_$1 && + ( + cd 12l_$1 && + -+ mkdir -p sub1 sub2 ++ mkdir sub1 sub2 + echo file >sub1/file && + echo other >sub2/other && + git add sub1 sub2 && @@ t/t6423-merge-rename-directories.sh: test_expect_success '12k: Directory rename + + test_must_fail git -c merge.directoryRenames=true merge -s recursive B^0 && + -+ git ls-files -s >out && -+ test_line_count = 5 out && ++ test_stdout_line_count = 5 git ls-files -s && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ @@ t/t6423-merge-rename-directories.sh: test_expect_success '12k: Directory rename + test_cmp expect actual && + + git ls-files -o >actual && -+ test_write_lines actual expect out >expect && ++ test_write_lines actual expect >expect && + test_cmp expect actual + ) +' @@ t/t6423-merge-rename-directories.sh: test_expect_success '12k: Directory rename + + test_must_fail git -c merge.directoryRenames=true merge -s recursive A^0 && + -+ git ls-files -s >out && -+ test_line_count = 5 out && ++ test_stdout_line_count = 5 git ls-files -s && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ @@ t/t6423-merge-rename-directories.sh: test_expect_success '12k: Directory rename + test_cmp expect actual && + + git ls-files -o >actual && -+ test_write_lines actual expect out >expect && ++ test_write_lines actual expect >expect && + test_cmp expect actual + ) +' -: ----------- > 2: 297fef60b19 merge-ort: small cleanups of check_for_directory_rename -: ----------- > 3: f5f87acbbd2 merge-ort: make a separate function for freeing struct collisions 2: cfa38f01481 ! 4: d3eac3d0bf6 merge-ort: shuffle the computation and cleanup of potential collisions @@ Commit message Signed-off-by: Elijah Newren <newren@gmail.com> ## merge-ort.c ## -@@ merge-ort.c: static void compute_collisions(struct strmap *collisions, - } - } - -+static void free_collisions(struct strmap *collisions) -+{ -+ struct hashmap_iter iter; -+ struct strmap_entry *entry; -+ -+ /* Free each value in the collisions map */ -+ strmap_for_each_entry(collisions, &iter, entry) { -+ struct collision_info *info = entry->value; -+ string_list_clear(&info->source_files, 0); -+ } -+ /* -+ * In compute_collisions(), we set collisions.strdup_strings to 0 -+ * so that we wouldn't have to make another copy of the new_path -+ * allocated by apply_dir_rename(). But now that we've used them -+ * and have no other references to these strings, it is time to -+ * deallocate them. -+ */ -+ free_strmap_strings(collisions); -+ strmap_clear(collisions, 1); -+} -+ - static char *check_for_directory_rename(struct merge_options *opt, - const char *path, - unsigned side_index, @@ merge-ort.c: static char *check_for_directory_rename(struct merge_options *opt, } @@ merge-ort.c: static int detect_regular_renames(struct merge_options *opt, int i, clean = 1; - struct strmap collisions; struct diff_queue_struct *side_pairs; -- struct hashmap_iter iter; -- struct strmap_entry *entry; struct rename_info *renames = &opt->priv->renames; side_pairs = &renames->pairs[side_index]; @@ merge-ort.c: static int collect_renames(struct merge_options *opt, result->queue[result->nr++] = p; } -- /* Free each value in the collisions map */ -- strmap_for_each_entry(&collisions, &iter, entry) { -- struct collision_info *info = entry->value; -- string_list_clear(&info->source_files, 0); -- } -- /* -- * In compute_collisions(), we set collisions.strdup_strings to 0 -- * so that we wouldn't have to make another copy of the new_path -- * allocated by apply_dir_rename(). But now that we've used them -- * and have no other references to these strings, it is time to -- * deallocate them. -- */ -- free_strmap_strings(&collisions); -- strmap_clear(&collisions, 1); +- free_collisions(&collisions); return clean; } 3: da3ae38e390 ! 5: 121761e26e2 merge-ort: fix issue with dual rename and add/add conflict @@ Commit message ## merge-ort.c ## @@ merge-ort.c: static char *check_for_directory_rename(struct merge_options *opt, struct strmap_entry *rename_info; - struct strmap_entry *otherinfo = NULL; + struct strmap_entry *otherinfo; const char *new_dir; + int other_side = 3 - side_index; +- /* Cases where we don't have a directory rename for this path */ + /* + * Cases where we don't have or don't want a directory rename for -+ * this path, so we return NULL. ++ * this path. + */ if (strmap_empty(dir_renames)) - return new_path; + return NULL; + if (strmap_get(&collisions[other_side], path)) -+ return new_path; ++ return NULL; rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) - return new_path; + return NULL; ## t/t6423-merge-rename-directories.sh ## @@ t/t6423-merge-rename-directories.sh: test_setup_12l () { -- gitgitgadget ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 1/5] t6423: add tests of dual directory rename plus add/add conflict 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget @ 2022-07-01 5:23 ` Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 2/5] merge-ort: small cleanups of check_for_directory_rename Elijah Newren via GitGitGadget ` (4 subsequent siblings) 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-01 5:23 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> This is an attempt at minimalizing a testcase reported by Glen Choo with tensorflow where merge-ort would report an assertion failure: Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410 reversing the direction of the merge provides a different error: error: cache entry has null sha1: ... fatal: unable to write .git/index so we add testcases for both. With these new testcases, the recursive strategy differs in that it returns the latter error for both merge directions. These testcases are somehow a little different than Glen's original tensorflow testcase in that these ones trigger a bug with the recursive algorithm whereas his testcase didn't. I figure that means these testcases somehow manage to be more comprehensive. Reported-by: Glen Choo <chooglen@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> --- t/t6423-merge-rename-directories.sh | 105 ++++++++++++++++++++++++++++ 1 file changed, 105 insertions(+) diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index 479db32cd62..ed5586de28c 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5199,6 +5199,111 @@ test_expect_success '12k: Directory rename with sibling causes rename-to-self' ' ) ' +# Testcase 12l, Both sides rename a directory into the other side, both add +# a file which after directory renames are the same filename +# Commit O: sub1/file, sub2/other +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} +# +# In words: +# A: sub1/ -> sub3/, add sub2/new_add_add_file_1 +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 +# +# Expected: sub3/{file, newfile, sub2/other} +# CONFLICT (add/add): sub1/sub2/new_add_add_file +# +# Note that sub1/newfile is not extraneous. Directory renames are only +# detected if they are needed, and they are only needed if the old directory +# had a new file added on the opposite side of history. So sub1/newfile +# is needed for there to be a sub1/ -> sub3/ rename. + +test_setup_12l () { + test_create_repo 12l_$1 && + ( + cd 12l_$1 && + + mkdir sub1 sub2 + echo file >sub1/file && + echo other >sub2/other && + git add sub1 sub2 && + git commit -m "O" && + + git branch O && + git branch A && + git branch B && + + git checkout A && + git mv sub1 sub3 && + echo conflicting >sub2/new_add_add_file && + git add sub2 && + test_tick && + git add -u && + git commit -m "A" && + + git checkout B && + echo dissimilar >sub2/new_add_add_file && + echo brand >sub1/newfile && + git add sub1 sub2 && + git mv sub2 sub1 && + test_tick && + git commit -m "B" + ) +} + +test_expect_merge_algorithm failure failure '12l (B into A): Rename into each other + add/add conflict' ' + test_setup_12l BintoA && + ( + cd 12l_BintoA && + + git checkout -q A^0 && + + test_must_fail git -c merge.directoryRenames=true merge -s recursive B^0 && + + test_stdout_line_count = 5 git ls-files -s && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :2:sub1/sub2/new_add_add_file \ + :3:sub1/sub2/new_add_add_file && + git rev-parse >expect \ + O:sub1/file B:sub1/newfile O:sub2/other \ + A:sub2/new_add_add_file \ + B:sub1/sub2/new_add_add_file && + test_cmp expect actual && + + git ls-files -o >actual && + test_write_lines actual expect >expect && + test_cmp expect actual + ) +' + +test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' + test_setup_12l AintoB && + ( + cd 12l_AintoB && + + git checkout -q B^0 && + + test_must_fail git -c merge.directoryRenames=true merge -s recursive A^0 && + + test_stdout_line_count = 5 git ls-files -s && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :2:sub1/sub2/new_add_add_file \ + :3:sub1/sub2/new_add_add_file && + git rev-parse >expect \ + O:sub1/file B:sub1/newfile O:sub2/other \ + B:sub1/sub2/new_add_add_file \ + A:sub2/new_add_add_file && + test_cmp expect actual && + + git ls-files -o >actual && + test_write_lines actual expect >expect && + test_cmp expect actual + ) +' + ########################################################################### # SECTION 13: Checking informational and conflict messages # -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v3 2/5] merge-ort: small cleanups of check_for_directory_rename 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 1/5] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget @ 2022-07-01 5:23 ` Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 3/5] merge-ort: make a separate function for freeing struct collisions Elijah Newren via GitGitGadget ` (3 subsequent siblings) 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-01 5:23 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@palantir.com> No functional changes, just some preparatory cleanups. Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Elijah Newren <newren@palantir.com> --- merge-ort.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 8545354dafd..ff037cca8d2 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2267,18 +2267,17 @@ static char *check_for_directory_rename(struct merge_options *opt, struct strmap *collisions, int *clean_merge) { - char *new_path = NULL; + char *new_path; struct strmap_entry *rename_info; - struct strmap_entry *otherinfo = NULL; + struct strmap_entry *otherinfo; const char *new_dir; + /* Cases where we don't have a directory rename for this path */ if (strmap_empty(dir_renames)) - return new_path; + return NULL; rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) - return new_path; - /* old_dir = rename_info->key; */ - new_dir = rename_info->value; + return NULL; /* * This next part is a little weird. We do not want to do an @@ -2304,6 +2303,7 @@ static char *check_for_directory_rename(struct merge_options *opt, * As it turns out, this also prevents N-way transient rename * confusion; See testcases 9c and 9d of t6043. */ + new_dir = rename_info->value; /* old_dir = rename_info->key; */ otherinfo = strmap_get_entry(dir_rename_exclusions, new_dir); if (otherinfo) { path_msg(opt, rename_info->key, 1, -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v3 3/5] merge-ort: make a separate function for freeing struct collisions 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 1/5] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 2/5] merge-ort: small cleanups of check_for_directory_rename Elijah Newren via GitGitGadget @ 2022-07-01 5:23 ` Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget ` (2 subsequent siblings) 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-01 5:23 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@palantir.com> This commit makes no functional changes, it's just some code movement in preparation for later changes. Signed-off-by: Elijah Newren <newren@palantir.com> --- merge-ort.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index ff037cca8d2..1514dd173c0 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2259,6 +2259,27 @@ static void compute_collisions(struct strmap *collisions, } } +static void free_collisions(struct strmap *collisions) +{ + struct hashmap_iter iter; + struct strmap_entry *entry; + + /* Free each value in the collisions map */ + strmap_for_each_entry(collisions, &iter, entry) { + struct collision_info *info = entry->value; + string_list_clear(&info->source_files, 0); + } + /* + * In compute_collisions(), we set collisions.strdup_strings to 0 + * so that we wouldn't have to make another copy of the new_path + * allocated by apply_dir_rename(). But now that we've used them + * and have no other references to these strings, it is time to + * deallocate them. + */ + free_strmap_strings(collisions); + strmap_clear(collisions, 1); +} + static char *check_for_directory_rename(struct merge_options *opt, const char *path, unsigned side_index, @@ -3029,8 +3050,6 @@ static int collect_renames(struct merge_options *opt, int i, clean = 1; struct strmap collisions; struct diff_queue_struct *side_pairs; - struct hashmap_iter iter; - struct strmap_entry *entry; struct rename_info *renames = &opt->priv->renames; side_pairs = &renames->pairs[side_index]; @@ -3076,20 +3095,7 @@ static int collect_renames(struct merge_options *opt, result->queue[result->nr++] = p; } - /* Free each value in the collisions map */ - strmap_for_each_entry(&collisions, &iter, entry) { - struct collision_info *info = entry->value; - string_list_clear(&info->source_files, 0); - } - /* - * In compute_collisions(), we set collisions.strdup_strings to 0 - * so that we wouldn't have to make another copy of the new_path - * allocated by apply_dir_rename(). But now that we've used them - * and have no other references to these strings, it is time to - * deallocate them. - */ - free_strmap_strings(&collisions); - strmap_clear(&collisions, 1); + free_collisions(&collisions); return clean; } -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget ` (2 preceding siblings ...) 2022-07-01 5:23 ` [PATCH v3 3/5] merge-ort: make a separate function for freeing struct collisions Elijah Newren via GitGitGadget @ 2022-07-01 5:23 ` Elijah Newren via GitGitGadget 2022-07-01 9:16 ` Ævar Arnfjörð Bjarmason 2022-07-01 5:23 ` [PATCH v3 5/5] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 5 siblings, 1 reply; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-01 5:23 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> Run compute_collisions() for renames on both sides of history before any calls to collect_renames(), and do not free the computed collisions until after both calls to collect_renames(). This is just a code reorganization at this point that doesn't make sense on its own, but will permit us to use the computed collision info from both sides within each call to collect_renames() in a subsequent commit. Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-ort.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 1514dd173c0..b496f0e3803 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2335,7 +2335,8 @@ static char *check_for_directory_rename(struct merge_options *opt, } new_path = handle_path_level_conflicts(opt, path, side_index, - rename_info, collisions); + rename_info, + &collisions[side_index]); *clean_merge &= (new_path != NULL); return new_path; @@ -3044,16 +3045,15 @@ static int detect_regular_renames(struct merge_options *opt, static int collect_renames(struct merge_options *opt, struct diff_queue_struct *result, unsigned side_index, + struct strmap *collisions, struct strmap *dir_renames_for_side, struct strmap *rename_exclusions) { int i, clean = 1; - struct strmap collisions; struct diff_queue_struct *side_pairs; struct rename_info *renames = &opt->priv->renames; side_pairs = &renames->pairs[side_index]; - compute_collisions(&collisions, dir_renames_for_side, side_pairs); for (i = 0; i < side_pairs->nr; ++i) { struct diff_filepair *p = side_pairs->queue[i]; @@ -3069,7 +3069,7 @@ static int collect_renames(struct merge_options *opt, side_index, dir_renames_for_side, rename_exclusions, - &collisions, + collisions, &clean); possibly_cache_new_pair(renames, p, side_index, new_path); @@ -3095,7 +3095,6 @@ static int collect_renames(struct merge_options *opt, result->queue[result->nr++] = p; } - free_collisions(&collisions); return clean; } @@ -3106,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, { struct diff_queue_struct combined = { 0 }; struct rename_info *renames = &opt->priv->renames; + struct strmap collisions[3]; int need_dir_renames, s, i, clean = 1; unsigned detection_run = 0; @@ -3155,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, ALLOC_GROW(combined.queue, renames->pairs[1].nr + renames->pairs[2].nr, combined.alloc); + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { + int other_side = 3 - i; + compute_collisions(&collisions[i], + &renames->dir_renames[other_side], + &renames->pairs[i]); + } clean &= collect_renames(opt, &combined, MERGE_SIDE1, + collisions, &renames->dir_renames[2], &renames->dir_renames[1]); clean &= collect_renames(opt, &combined, MERGE_SIDE2, + collisions, &renames->dir_renames[1], &renames->dir_renames[2]); + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) + free_collisions(&collisions[i]); STABLE_QSORT(combined.queue, combined.nr, compare_pairs); trace2_region_leave("merge", "directory renames", opt->repo); -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions 2022-07-01 5:23 ` [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget @ 2022-07-01 9:16 ` Ævar Arnfjörð Bjarmason 2022-07-25 12:00 ` C99 "for (int ..." form on "master" (was: [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions) Ævar Arnfjörð Bjarmason 0 siblings, 1 reply; 45+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-07-01 9:16 UTC (permalink / raw) To: Elijah Newren via GitGitGadget Cc: git, Jonathan Tan, Calvin Wan, Elijah Newren, Junio C Hamano On Fri, Jul 01 2022, Elijah Newren via GitGitGadget wrote: > From: Elijah Newren <newren@gmail.com> > > Run compute_collisions() for renames on both sides of history before > any calls to collect_renames(), and do not free the computed collisions > until after both calls to collect_renames(). This is just a code > reorganization at this point that doesn't make sense on its own, but > will permit us to use the computed collision info from both sides > within each call to collect_renames() in a subsequent commit. > > Signed-off-by: Elijah Newren <newren@gmail.com> This is much easier to read & review with the prep patches, thanks a lot. B.t.w. on the "legacy code" comment wrt merge-{ort,recursive}.c : I didn't look in that case, but I've seen that you've copied various older code from merge-recursive.c to merge-ort.c (which makes sense) in the past, but I didn't check the origin in that case. Sorry :) > @@ -3106,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, > { > struct diff_queue_struct combined = { 0 }; > struct rename_info *renames = &opt->priv->renames; > + struct strmap collisions[3]; > int need_dir_renames, s, i, clean = 1; > unsigned detection_run = 0; > > @@ -3155,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, > ALLOC_GROW(combined.queue, > renames->pairs[1].nr + renames->pairs[2].nr, > combined.alloc); > + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { The "int i" here will need to be pre-declared earlier, per: 6563706568b (CodingGuidelines: give deadline for "for (int i = 0; ...", 2022-03-30) I also don't mind us just saying "we've waited enough". Junio? > + int other_side = 3 - i; > + compute_collisions(&collisions[i], > + &renames->dir_renames[other_side], > + &renames->pairs[i]); > + } > clean &= collect_renames(opt, &combined, MERGE_SIDE1, > + collisions, > &renames->dir_renames[2], > &renames->dir_renames[1]); > clean &= collect_renames(opt, &combined, MERGE_SIDE2, > + collisions, > &renames->dir_renames[1], > &renames->dir_renames[2]); > + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) > + free_collisions(&collisions[i]); > STABLE_QSORT(combined.queue, combined.nr, compare_pairs); > trace2_region_leave("merge", "directory renames", opt->repo); ^ permalink raw reply [flat|nested] 45+ messages in thread
* C99 "for (int ..." form on "master" (was: [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions) 2022-07-01 9:16 ` Ævar Arnfjörð Bjarmason @ 2022-07-25 12:00 ` Ævar Arnfjörð Bjarmason 2022-07-26 2:14 ` Elijah Newren 0 siblings, 1 reply; 45+ messages in thread From: Ævar Arnfjörð Bjarmason @ 2022-07-25 12:00 UTC (permalink / raw) To: Elijah Newren via GitGitGadget Cc: git, Jonathan Tan, Calvin Wan, Elijah Newren, Junio C Hamano, Johannes Schindelin On Fri, Jul 01 2022, Ævar Arnfjörð Bjarmason wrote: > On Fri, Jul 01 2022, Elijah Newren via GitGitGadget wrote: > [...] >> @@ -3106,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, >> { >> struct diff_queue_struct combined = { 0 }; >> struct rename_info *renames = &opt->priv->renames; >> + struct strmap collisions[3]; >> int need_dir_renames, s, i, clean = 1; >> unsigned detection_run = 0; >> >> @@ -3155,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, >> ALLOC_GROW(combined.queue, >> renames->pairs[1].nr + renames->pairs[2].nr, >> combined.alloc); >> + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { > > The "int i" here will need to be pre-declared earlier, per: 6563706568b > (CodingGuidelines: give deadline for "for (int i = 0; ...", 2022-03-30) > > I also don't mind us just saying "we've waited enough". Junio? This case got fixed, but per the changed $subject others have snuck through. Since be733e12001 (Merge branch 'en/merge-tree', 2022-07-14) we've had these forms on "master", see 6debb7527b0 (merge-ort: store messages in a list, not in a single strbuf, 2022-06-18) and cb2607759e2 (merge-ort: store more specific conflict information, 2022-06-18). We could "fix" those, but per the above I think it's just as valid to just move up the deadline & say that 2.38.0 will have a hard dependency on this C99 feature... ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: C99 "for (int ..." form on "master" (was: [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions) 2022-07-25 12:00 ` C99 "for (int ..." form on "master" (was: [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions) Ævar Arnfjörð Bjarmason @ 2022-07-26 2:14 ` Elijah Newren 2022-07-26 4:48 ` C99 "for (int ..." form on "master" Junio C Hamano 0 siblings, 1 reply; 45+ messages in thread From: Elijah Newren @ 2022-07-26 2:14 UTC (permalink / raw) To: Ævar Arnfjörð Bjarmason Cc: Elijah Newren via GitGitGadget, Git Mailing List, Jonathan Tan, Calvin Wan, Junio C Hamano, Johannes Schindelin On Mon, Jul 25, 2022 at 5:04 AM Ævar Arnfjörð Bjarmason <avarab@gmail.com> wrote: > > On Fri, Jul 01 2022, Ævar Arnfjörð Bjarmason wrote: > > > On Fri, Jul 01 2022, Elijah Newren via GitGitGadget wrote: > > [...] > >> @@ -3106,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, > >> { > >> struct diff_queue_struct combined = { 0 }; > >> struct rename_info *renames = &opt->priv->renames; > >> + struct strmap collisions[3]; > >> int need_dir_renames, s, i, clean = 1; > >> unsigned detection_run = 0; > >> > >> @@ -3155,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, > >> ALLOC_GROW(combined.queue, > >> renames->pairs[1].nr + renames->pairs[2].nr, > >> combined.alloc); > >> + for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { > > > > The "int i" here will need to be pre-declared earlier, per: 6563706568b > > (CodingGuidelines: give deadline for "for (int i = 0; ...", 2022-03-30) > > > > I also don't mind us just saying "we've waited enough". Junio? > > This case got fixed, but per the changed $subject others have snuck > through. > > Since be733e12001 (Merge branch 'en/merge-tree', 2022-07-14) we've had > these forms on "master", see 6debb7527b0 (merge-ort: store messages in a > list, not in a single strbuf, 2022-06-18) and cb2607759e2 (merge-ort: > store more specific conflict information, 2022-06-18). > > We could "fix" those, but per the above I think it's just as valid to > just move up the deadline & say that 2.38.0 will have a hard dependency > on this C99 feature... Thanks for catching this and bringing it up; sorry for missing it earlier. 6563706568b says we should "revisit this *around* November 2022" (emphasis added). 2.38 will be released in October 2022. So, maybe it's fine as-is. But if others prefer these be fixed over moving up the deadline, I'll go ahead and submit a patch. I guess we just need a call. Junio? ^ permalink raw reply [flat|nested] 45+ messages in thread
* Re: C99 "for (int ..." form on "master" 2022-07-26 2:14 ` Elijah Newren @ 2022-07-26 4:48 ` Junio C Hamano 0 siblings, 0 replies; 45+ messages in thread From: Junio C Hamano @ 2022-07-26 4:48 UTC (permalink / raw) To: Elijah Newren Cc: Ævar Arnfjörð Bjarmason, Elijah Newren via GitGitGadget, Git Mailing List, Jonathan Tan, Calvin Wan, Johannes Schindelin Elijah Newren <newren@gmail.com> writes: > Thanks for catching this and bringing it up; sorry for missing it earlier. > > 6563706568b says we should "revisit this *around* November 2022" > (emphasis added). 2.38 will be released in October 2022. So, maybe > it's fine as-is. > > But if others prefer these be fixed over moving up the deadline, I'll > go ahead and submit a patch. > > I guess we just need a call. Junio? I do not see a need to do anything special at this moment. When somebody reports that their favorite compiler does not yet grok the syntax, we can ask them to ensure that there are only two places that need "fixing" for them to compile. It would have been ideal if we kept the "weather balloon" to the only one known place, but the second one would not be a disaster. We would re-evaluate the situation when such a report comes, but even if we decide to stay away from the syntax a bit longer, we know exactly where the only two places we need to revert are. One thing we should absolutely avoid is to open the floodgate and deliberately add more of them, just for the sake of adding more of them. Thanks. P.S. Tomorrow is that "gitster goes offline every other Tuesday" day. ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v3 5/5] merge-ort: fix issue with dual rename and add/add conflict 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget ` (3 preceding siblings ...) 2022-07-01 5:23 ` [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget @ 2022-07-01 5:23 ` Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-01 5:23 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> There is code in both merge-recursive and merge-ort for avoiding doubly transitive renames (i.e. one side renames directory A/ -> B/, and the other side renames directory B/ -> C/), because this combination would otherwise make a mess for new files added to A/ on the first side and wondering which directory they end up in -- especially if there were even more renames such as the first side renaming C/ -> D/. In such cases, it just turns "off" directory rename detection for the higher order transitive cases. The testcases added in t6423 a couple commits ago are slightly different but similar in principle. They involve a similar case of paired renaming but instead of A/ -> B/ and B/ -> C/, the second side renames a leading directory of B/ to C/. And both sides add a new file somewhere under the directory that the other side will rename. While the new files added start within different directories and thus could logically end up within different directories, it is weird for a file on one side to end up where the other one started and not move along with it. So, let's just turn off directory rename detection in this case as well. Another way to look at this is that if the source name involved in a directory rename on one side is the target name of a directory rename operation for a file from the other side, then we avoid the doubly transitive rename. (More concretely, if a directory rename on side D wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D already had a file named NEW_NAME, and a directory rename on side E wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the directory rename detection for NEW_NAME to prevent the NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add conflict on NEW_NAME.) Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-ort.c | 8 +++++++- t/t6423-merge-rename-directories.sh | 4 ++-- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index b496f0e3803..daa0e4496f8 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2292,10 +2292,16 @@ static char *check_for_directory_rename(struct merge_options *opt, struct strmap_entry *rename_info; struct strmap_entry *otherinfo; const char *new_dir; + int other_side = 3 - side_index; - /* Cases where we don't have a directory rename for this path */ + /* + * Cases where we don't have or don't want a directory rename for + * this path. + */ if (strmap_empty(dir_renames)) return NULL; + if (strmap_get(&collisions[other_side], path)) + return NULL; rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) return NULL; diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index ed5586de28c..99baf77cbfd 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5250,7 +5250,7 @@ test_setup_12l () { ) } -test_expect_merge_algorithm failure failure '12l (B into A): Rename into each other + add/add conflict' ' +test_expect_merge_algorithm failure success '12l (B into A): Rename into each other + add/add conflict' ' test_setup_12l BintoA && ( cd 12l_BintoA && @@ -5277,7 +5277,7 @@ test_expect_merge_algorithm failure failure '12l (B into A): Rename into each ot ) ' -test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' +test_expect_merge_algorithm failure success '12l (A into B): Rename into each other + add/add conflict' ' test_setup_12l AintoB && ( cd 12l_AintoB && -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget ` (4 preceding siblings ...) 2022-07-01 5:23 ` [PATCH v3 5/5] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget @ 2022-07-05 1:33 ` Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 1/5] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget ` (5 more replies) 5 siblings, 6 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-05 1:33 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren This series adds some testcases based on the tensorflow repository issue reported by Glen Choo at [1], demonstrating bugs in both the ort and recursive strategies. It also provides a fix for the ort strategy. Changes since v3: * Remove use of for-initializer Changes since v2: * Added a couple preparatory cleanup patches * Added a comment about why sub1/newfile is important to the new testcases * A couple other minor code cleanups Changes since v1: * Fixed some wording issues in comments, and added a bit more details to one of the commit messages [1] https://lore.kernel.org/git/kl6lee006mle.fsf@chooglen-macbookpro.roam.corp.google.com/ Elijah Newren (5): t6423: add tests of dual directory rename plus add/add conflict merge-ort: small cleanups of check_for_directory_rename merge-ort: make a separate function for freeing struct collisions merge-ort: shuffle the computation and cleanup of potential collisions merge-ort: fix issue with dual rename and add/add conflict merge-ort.c | 74 +++++++++++++------- t/t6423-merge-rename-directories.sh | 105 ++++++++++++++++++++++++++++ 2 files changed, 153 insertions(+), 26 deletions(-) base-commit: e54793a95afeea1e10de1e5ad7eab914e7416250 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1268%2Fnewren%2Ffix-dual-rename-into-each-other-plus-conflicting-adds-v4 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1268/newren/fix-dual-rename-into-each-other-plus-conflicting-adds-v4 Pull-Request: https://github.com/gitgitgadget/git/pull/1268 Range-diff vs v3: 1: a16a1c4d947 = 1: a16a1c4d947 t6423: add tests of dual directory rename plus add/add conflict 2: 297fef60b19 = 2: 297fef60b19 merge-ort: small cleanups of check_for_directory_rename 3: f5f87acbbd2 = 3: f5f87acbbd2 merge-ort: make a separate function for freeing struct collisions 4: d3eac3d0bf6 ! 4: 9d813116112 merge-ort: shuffle the computation and cleanup of potential collisions @@ merge-ort.c: static int detect_and_process_renames(struct merge_options *opt, ALLOC_GROW(combined.queue, renames->pairs[1].nr + renames->pairs[2].nr, combined.alloc); -+ for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { ++ for (i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { + int other_side = 3 - i; + compute_collisions(&collisions[i], + &renames->dir_renames[other_side], @@ merge-ort.c: static int detect_and_process_renames(struct merge_options *opt, + collisions, &renames->dir_renames[1], &renames->dir_renames[2]); -+ for (int i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) ++ for (i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) + free_collisions(&collisions[i]); STABLE_QSORT(combined.queue, combined.nr, compare_pairs); trace2_region_leave("merge", "directory renames", opt->repo); 5: 121761e26e2 = 5: 993ac405408 merge-ort: fix issue with dual rename and add/add conflict -- gitgitgadget ^ permalink raw reply [flat|nested] 45+ messages in thread
* [PATCH v4 1/5] t6423: add tests of dual directory rename plus add/add conflict 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget @ 2022-07-05 1:33 ` Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 2/5] merge-ort: small cleanups of check_for_directory_rename Elijah Newren via GitGitGadget ` (4 subsequent siblings) 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-05 1:33 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> This is an attempt at minimalizing a testcase reported by Glen Choo with tensorflow where merge-ort would report an assertion failure: Assertion failed: (ci->filemask == 2 || ci->filemask == 4), function apply_directory_rename_modifications, file merge-ort.c, line 2410 reversing the direction of the merge provides a different error: error: cache entry has null sha1: ... fatal: unable to write .git/index so we add testcases for both. With these new testcases, the recursive strategy differs in that it returns the latter error for both merge directions. These testcases are somehow a little different than Glen's original tensorflow testcase in that these ones trigger a bug with the recursive algorithm whereas his testcase didn't. I figure that means these testcases somehow manage to be more comprehensive. Reported-by: Glen Choo <chooglen@google.com> Signed-off-by: Elijah Newren <newren@gmail.com> --- t/t6423-merge-rename-directories.sh | 105 ++++++++++++++++++++++++++++ 1 file changed, 105 insertions(+) diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index 479db32cd62..ed5586de28c 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5199,6 +5199,111 @@ test_expect_success '12k: Directory rename with sibling causes rename-to-self' ' ) ' +# Testcase 12l, Both sides rename a directory into the other side, both add +# a file which after directory renames are the same filename +# Commit O: sub1/file, sub2/other +# Commit A: sub3/file, sub2/{other, new_add_add_file_1} +# Commit B: sub1/{file, newfile}, sub1/sub2/{other, new_add_add_file_2} +# +# In words: +# A: sub1/ -> sub3/, add sub2/new_add_add_file_1 +# B: sub2/ -> sub1/sub2, add sub1/newfile, add sub1/sub2/new_add_add_file_2 +# +# Expected: sub3/{file, newfile, sub2/other} +# CONFLICT (add/add): sub1/sub2/new_add_add_file +# +# Note that sub1/newfile is not extraneous. Directory renames are only +# detected if they are needed, and they are only needed if the old directory +# had a new file added on the opposite side of history. So sub1/newfile +# is needed for there to be a sub1/ -> sub3/ rename. + +test_setup_12l () { + test_create_repo 12l_$1 && + ( + cd 12l_$1 && + + mkdir sub1 sub2 + echo file >sub1/file && + echo other >sub2/other && + git add sub1 sub2 && + git commit -m "O" && + + git branch O && + git branch A && + git branch B && + + git checkout A && + git mv sub1 sub3 && + echo conflicting >sub2/new_add_add_file && + git add sub2 && + test_tick && + git add -u && + git commit -m "A" && + + git checkout B && + echo dissimilar >sub2/new_add_add_file && + echo brand >sub1/newfile && + git add sub1 sub2 && + git mv sub2 sub1 && + test_tick && + git commit -m "B" + ) +} + +test_expect_merge_algorithm failure failure '12l (B into A): Rename into each other + add/add conflict' ' + test_setup_12l BintoA && + ( + cd 12l_BintoA && + + git checkout -q A^0 && + + test_must_fail git -c merge.directoryRenames=true merge -s recursive B^0 && + + test_stdout_line_count = 5 git ls-files -s && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :2:sub1/sub2/new_add_add_file \ + :3:sub1/sub2/new_add_add_file && + git rev-parse >expect \ + O:sub1/file B:sub1/newfile O:sub2/other \ + A:sub2/new_add_add_file \ + B:sub1/sub2/new_add_add_file && + test_cmp expect actual && + + git ls-files -o >actual && + test_write_lines actual expect >expect && + test_cmp expect actual + ) +' + +test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' + test_setup_12l AintoB && + ( + cd 12l_AintoB && + + git checkout -q B^0 && + + test_must_fail git -c merge.directoryRenames=true merge -s recursive A^0 && + + test_stdout_line_count = 5 git ls-files -s && + + git rev-parse >actual \ + :0:sub3/file :0:sub3/newfile :0:sub3/sub2/other \ + :2:sub1/sub2/new_add_add_file \ + :3:sub1/sub2/new_add_add_file && + git rev-parse >expect \ + O:sub1/file B:sub1/newfile O:sub2/other \ + B:sub1/sub2/new_add_add_file \ + A:sub2/new_add_add_file && + test_cmp expect actual && + + git ls-files -o >actual && + test_write_lines actual expect >expect && + test_cmp expect actual + ) +' + ########################################################################### # SECTION 13: Checking informational and conflict messages # -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v4 2/5] merge-ort: small cleanups of check_for_directory_rename 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 1/5] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget @ 2022-07-05 1:33 ` Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 3/5] merge-ort: make a separate function for freeing struct collisions Elijah Newren via GitGitGadget ` (3 subsequent siblings) 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-05 1:33 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@palantir.com> No functional changes, just some preparatory cleanups. Suggested-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Elijah Newren <newren@palantir.com> --- merge-ort.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 8545354dafd..ff037cca8d2 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2267,18 +2267,17 @@ static char *check_for_directory_rename(struct merge_options *opt, struct strmap *collisions, int *clean_merge) { - char *new_path = NULL; + char *new_path; struct strmap_entry *rename_info; - struct strmap_entry *otherinfo = NULL; + struct strmap_entry *otherinfo; const char *new_dir; + /* Cases where we don't have a directory rename for this path */ if (strmap_empty(dir_renames)) - return new_path; + return NULL; rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) - return new_path; - /* old_dir = rename_info->key; */ - new_dir = rename_info->value; + return NULL; /* * This next part is a little weird. We do not want to do an @@ -2304,6 +2303,7 @@ static char *check_for_directory_rename(struct merge_options *opt, * As it turns out, this also prevents N-way transient rename * confusion; See testcases 9c and 9d of t6043. */ + new_dir = rename_info->value; /* old_dir = rename_info->key; */ otherinfo = strmap_get_entry(dir_rename_exclusions, new_dir); if (otherinfo) { path_msg(opt, rename_info->key, 1, -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v4 3/5] merge-ort: make a separate function for freeing struct collisions 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 1/5] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 2/5] merge-ort: small cleanups of check_for_directory_rename Elijah Newren via GitGitGadget @ 2022-07-05 1:33 ` Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 4/5] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget ` (2 subsequent siblings) 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-05 1:33 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@palantir.com> This commit makes no functional changes, it's just some code movement in preparation for later changes. Signed-off-by: Elijah Newren <newren@palantir.com> --- merge-ort.c | 38 ++++++++++++++++++++++---------------- 1 file changed, 22 insertions(+), 16 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index ff037cca8d2..1514dd173c0 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2259,6 +2259,27 @@ static void compute_collisions(struct strmap *collisions, } } +static void free_collisions(struct strmap *collisions) +{ + struct hashmap_iter iter; + struct strmap_entry *entry; + + /* Free each value in the collisions map */ + strmap_for_each_entry(collisions, &iter, entry) { + struct collision_info *info = entry->value; + string_list_clear(&info->source_files, 0); + } + /* + * In compute_collisions(), we set collisions.strdup_strings to 0 + * so that we wouldn't have to make another copy of the new_path + * allocated by apply_dir_rename(). But now that we've used them + * and have no other references to these strings, it is time to + * deallocate them. + */ + free_strmap_strings(collisions); + strmap_clear(collisions, 1); +} + static char *check_for_directory_rename(struct merge_options *opt, const char *path, unsigned side_index, @@ -3029,8 +3050,6 @@ static int collect_renames(struct merge_options *opt, int i, clean = 1; struct strmap collisions; struct diff_queue_struct *side_pairs; - struct hashmap_iter iter; - struct strmap_entry *entry; struct rename_info *renames = &opt->priv->renames; side_pairs = &renames->pairs[side_index]; @@ -3076,20 +3095,7 @@ static int collect_renames(struct merge_options *opt, result->queue[result->nr++] = p; } - /* Free each value in the collisions map */ - strmap_for_each_entry(&collisions, &iter, entry) { - struct collision_info *info = entry->value; - string_list_clear(&info->source_files, 0); - } - /* - * In compute_collisions(), we set collisions.strdup_strings to 0 - * so that we wouldn't have to make another copy of the new_path - * allocated by apply_dir_rename(). But now that we've used them - * and have no other references to these strings, it is time to - * deallocate them. - */ - free_strmap_strings(&collisions); - strmap_clear(&collisions, 1); + free_collisions(&collisions); return clean; } -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v4 4/5] merge-ort: shuffle the computation and cleanup of potential collisions 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget ` (2 preceding siblings ...) 2022-07-05 1:33 ` [PATCH v4 3/5] merge-ort: make a separate function for freeing struct collisions Elijah Newren via GitGitGadget @ 2022-07-05 1:33 ` Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 5/5] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget 2022-07-06 16:52 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Junio C Hamano 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-05 1:33 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> Run compute_collisions() for renames on both sides of history before any calls to collect_renames(), and do not free the computed collisions until after both calls to collect_renames(). This is just a code reorganization at this point that doesn't make sense on its own, but will permit us to use the computed collision info from both sides within each call to collect_renames() in a subsequent commit. Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-ort.c | 20 +++++++++++++++----- 1 file changed, 15 insertions(+), 5 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 1514dd173c0..a37c1c19aca 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2335,7 +2335,8 @@ static char *check_for_directory_rename(struct merge_options *opt, } new_path = handle_path_level_conflicts(opt, path, side_index, - rename_info, collisions); + rename_info, + &collisions[side_index]); *clean_merge &= (new_path != NULL); return new_path; @@ -3044,16 +3045,15 @@ static int detect_regular_renames(struct merge_options *opt, static int collect_renames(struct merge_options *opt, struct diff_queue_struct *result, unsigned side_index, + struct strmap *collisions, struct strmap *dir_renames_for_side, struct strmap *rename_exclusions) { int i, clean = 1; - struct strmap collisions; struct diff_queue_struct *side_pairs; struct rename_info *renames = &opt->priv->renames; side_pairs = &renames->pairs[side_index]; - compute_collisions(&collisions, dir_renames_for_side, side_pairs); for (i = 0; i < side_pairs->nr; ++i) { struct diff_filepair *p = side_pairs->queue[i]; @@ -3069,7 +3069,7 @@ static int collect_renames(struct merge_options *opt, side_index, dir_renames_for_side, rename_exclusions, - &collisions, + collisions, &clean); possibly_cache_new_pair(renames, p, side_index, new_path); @@ -3095,7 +3095,6 @@ static int collect_renames(struct merge_options *opt, result->queue[result->nr++] = p; } - free_collisions(&collisions); return clean; } @@ -3106,6 +3105,7 @@ static int detect_and_process_renames(struct merge_options *opt, { struct diff_queue_struct combined = { 0 }; struct rename_info *renames = &opt->priv->renames; + struct strmap collisions[3]; int need_dir_renames, s, i, clean = 1; unsigned detection_run = 0; @@ -3155,12 +3155,22 @@ static int detect_and_process_renames(struct merge_options *opt, ALLOC_GROW(combined.queue, renames->pairs[1].nr + renames->pairs[2].nr, combined.alloc); + for (i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) { + int other_side = 3 - i; + compute_collisions(&collisions[i], + &renames->dir_renames[other_side], + &renames->pairs[i]); + } clean &= collect_renames(opt, &combined, MERGE_SIDE1, + collisions, &renames->dir_renames[2], &renames->dir_renames[1]); clean &= collect_renames(opt, &combined, MERGE_SIDE2, + collisions, &renames->dir_renames[1], &renames->dir_renames[2]); + for (i = MERGE_SIDE1; i <= MERGE_SIDE2; i++) + free_collisions(&collisions[i]); STABLE_QSORT(combined.queue, combined.nr, compare_pairs); trace2_region_leave("merge", "directory renames", opt->repo); -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* [PATCH v4 5/5] merge-ort: fix issue with dual rename and add/add conflict 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget ` (3 preceding siblings ...) 2022-07-05 1:33 ` [PATCH v4 4/5] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget @ 2022-07-05 1:33 ` Elijah Newren via GitGitGadget 2022-07-06 16:52 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Junio C Hamano 5 siblings, 0 replies; 45+ messages in thread From: Elijah Newren via GitGitGadget @ 2022-07-05 1:33 UTC (permalink / raw) To: git Cc: Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren From: Elijah Newren <newren@gmail.com> There is code in both merge-recursive and merge-ort for avoiding doubly transitive renames (i.e. one side renames directory A/ -> B/, and the other side renames directory B/ -> C/), because this combination would otherwise make a mess for new files added to A/ on the first side and wondering which directory they end up in -- especially if there were even more renames such as the first side renaming C/ -> D/. In such cases, it just turns "off" directory rename detection for the higher order transitive cases. The testcases added in t6423 a couple commits ago are slightly different but similar in principle. They involve a similar case of paired renaming but instead of A/ -> B/ and B/ -> C/, the second side renames a leading directory of B/ to C/. And both sides add a new file somewhere under the directory that the other side will rename. While the new files added start within different directories and thus could logically end up within different directories, it is weird for a file on one side to end up where the other one started and not move along with it. So, let's just turn off directory rename detection in this case as well. Another way to look at this is that if the source name involved in a directory rename on one side is the target name of a directory rename operation for a file from the other side, then we avoid the doubly transitive rename. (More concretely, if a directory rename on side D wants to rename a file on side E from OLD_NAME -> NEW_NAME, and side D already had a file named NEW_NAME, and a directory rename on side E wants to rename side D's NEW_NAME -> NEWER_NAME, then we turn off the directory rename detection for NEW_NAME to prevent the NEW_NAME -> NEWER_NAME rename, and instead end up with an add/add conflict on NEW_NAME.) Signed-off-by: Elijah Newren <newren@gmail.com> --- merge-ort.c | 8 +++++++- t/t6423-merge-rename-directories.sh | 4 ++-- 2 files changed, 9 insertions(+), 3 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index a37c1c19aca..3855f9de25e 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -2292,10 +2292,16 @@ static char *check_for_directory_rename(struct merge_options *opt, struct strmap_entry *rename_info; struct strmap_entry *otherinfo; const char *new_dir; + int other_side = 3 - side_index; - /* Cases where we don't have a directory rename for this path */ + /* + * Cases where we don't have or don't want a directory rename for + * this path. + */ if (strmap_empty(dir_renames)) return NULL; + if (strmap_get(&collisions[other_side], path)) + return NULL; rename_info = check_dir_renamed(path, dir_renames); if (!rename_info) return NULL; diff --git a/t/t6423-merge-rename-directories.sh b/t/t6423-merge-rename-directories.sh index ed5586de28c..99baf77cbfd 100755 --- a/t/t6423-merge-rename-directories.sh +++ b/t/t6423-merge-rename-directories.sh @@ -5250,7 +5250,7 @@ test_setup_12l () { ) } -test_expect_merge_algorithm failure failure '12l (B into A): Rename into each other + add/add conflict' ' +test_expect_merge_algorithm failure success '12l (B into A): Rename into each other + add/add conflict' ' test_setup_12l BintoA && ( cd 12l_BintoA && @@ -5277,7 +5277,7 @@ test_expect_merge_algorithm failure failure '12l (B into A): Rename into each ot ) ' -test_expect_merge_algorithm failure failure '12l (A into B): Rename into each other + add/add conflict' ' +test_expect_merge_algorithm failure success '12l (A into B): Rename into each other + add/add conflict' ' test_setup_12l AintoB && ( cd 12l_AintoB && -- gitgitgadget ^ permalink raw reply related [flat|nested] 45+ messages in thread
* Re: [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget ` (4 preceding siblings ...) 2022-07-05 1:33 ` [PATCH v4 5/5] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget @ 2022-07-06 16:52 ` Junio C Hamano 5 siblings, 0 replies; 45+ messages in thread From: Junio C Hamano @ 2022-07-06 16:52 UTC (permalink / raw) To: Elijah Newren via GitGitGadget Cc: git, Elijah Newren, Jonathan Tan, Calvin Wan, Ævar Arnfjörð Bjarmason "Elijah Newren via GitGitGadget" <gitgitgadget@gmail.com> writes: > This series adds some testcases based on the tensorflow repository issue > reported by Glen Choo at [1], demonstrating bugs in both the ort and > recursive strategies. It also provides a fix for the ort strategy. > > Changes since v3: > > * Remove use of for-initializer I missed them while queuing the previous round and updates look OK. I however personally find the resulting code irritating to read. The counter 'i', which never is used for two purposes at the same time, has multiple "hiding" declarations in this function, in addition to its top-level declaration. It forces the readers to think about which variable each reference of 'i' talks about, and more importantly, if the value in outer 'i' of its last use, after an inner 'i' is used, matters. Will queue. Thanks. ^ permalink raw reply [flat|nested] 45+ messages in thread
end of thread, other threads:[~2022-07-26 4:48 UTC | newest] Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2022-06-22 4:20 [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-06-22 4:20 ` [PATCH 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-06-27 18:20 ` Jonathan Tan 2022-06-30 0:06 ` Elijah Newren 2022-06-30 22:32 ` Jonathan Tan 2022-07-01 2:57 ` Elijah Newren 2022-06-27 22:30 ` Calvin Wan 2022-06-30 0:07 ` Elijah Newren 2022-06-22 4:20 ` [PATCH 2/3] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget 2022-06-27 18:48 ` Jonathan Tan 2022-06-27 21:04 ` Calvin Wan 2022-06-30 0:05 ` Elijah Newren 2022-06-22 4:20 ` [PATCH 3/3] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget 2022-06-27 18:47 ` Jonathan Tan 2022-06-30 0:05 ` Elijah Newren 2022-07-06 17:25 ` Jonathan Tan 2022-06-22 4:36 ` [PATCH 0/3] Fix dual rename into each other plus conflicting adds Elijah Newren 2022-06-30 6:57 ` [PATCH v2 " Elijah Newren via GitGitGadget 2022-06-30 6:57 ` [PATCH v2 1/3] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-06-30 10:21 ` Ævar Arnfjörð Bjarmason 2022-07-01 2:57 ` Elijah Newren 2022-07-01 9:29 ` Ævar Arnfjörð Bjarmason 2022-06-30 6:57 ` [PATCH v2 2/3] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget 2022-06-30 10:28 ` Ævar Arnfjörð Bjarmason 2022-07-01 3:02 ` Elijah Newren 2022-06-30 6:57 ` [PATCH v2 3/3] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget 2022-06-30 10:31 ` Ævar Arnfjörð Bjarmason 2022-07-01 3:03 ` Elijah Newren 2022-07-01 5:23 ` [PATCH v3 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 1/5] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 2/5] merge-ort: small cleanups of check_for_directory_rename Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 3/5] merge-ort: make a separate function for freeing struct collisions Elijah Newren via GitGitGadget 2022-07-01 5:23 ` [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget 2022-07-01 9:16 ` Ævar Arnfjörð Bjarmason 2022-07-25 12:00 ` C99 "for (int ..." form on "master" (was: [PATCH v3 4/5] merge-ort: shuffle the computation and cleanup of potential collisions) Ævar Arnfjörð Bjarmason 2022-07-26 2:14 ` Elijah Newren 2022-07-26 4:48 ` C99 "for (int ..." form on "master" Junio C Hamano 2022-07-01 5:23 ` [PATCH v3 5/5] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 1/5] t6423: add tests of dual directory rename plus add/add conflict Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 2/5] merge-ort: small cleanups of check_for_directory_rename Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 3/5] merge-ort: make a separate function for freeing struct collisions Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 4/5] merge-ort: shuffle the computation and cleanup of potential collisions Elijah Newren via GitGitGadget 2022-07-05 1:33 ` [PATCH v4 5/5] merge-ort: fix issue with dual rename and add/add conflict Elijah Newren via GitGitGadget 2022-07-06 16:52 ` [PATCH v4 0/5] Fix dual rename into each other plus conflicting adds Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).