All of lore.kernel.org
 help / color / mirror / Atom feed
* Bugs with detection of renames that are also overwrites
@ 2010-02-13 23:46 Anders Kaseorg
  2010-02-23 12:22 ` Jeff King
  0 siblings, 1 reply; 5+ messages in thread
From: Anders Kaseorg @ 2010-02-13 23:46 UTC (permalink / raw)
  To: git

One can create a rename that also overwrites an existing file, for example 
with ‘git mv -f’:

$ git init
$ seq 100 200 > a; seq 300 400 > b; git add a b
$ git commit -m foo; git tag foo
$ git mv -f a b
$ git commit -m bar; git tag bar

Git does not ordinarily detect this as a rename, even with -M.

$ git diff --stat -M foo bar
 a |  101 ----------------------------------
 b |  202 ++++++++++++++++++++++++++++++++++----------------------------------
 2 files changed, 101 insertions(+), 202 deletions(-)

But it can (sometimes*) be forced to detect the rename with -M -B.  

$ git diff --stat -M -B foo bar
 a => b |    0
 1 files changed, 0 insertions(+), 0 deletions(-)

However, the resulting patch incorrectly omits the deletion of the 
overwritten file!

$ git diff -M -B foo bar
diff --git a/a b/b
similarity index 100%
rename from a
rename to b

Therefore, the patch can’t be applied to its own source tree.

$ git checkout foo
$ git diff -M -B foo bar | git apply
error: b: already exists in working directory
$ git format-patch --stdout -M -B foo..bar | git am
Applying: bar
error: b: already exists in index
Patch failed at 0001 bar
When you have resolved this problem run "git am --resolved".
If you would prefer to skip this patch, instead run "git am --skip".
To restore the original branch and stop patching run "git am --abort".

If it matters, I’m using Git 1.7.0.

Also, a question: is it possible to get ‘git merge’ to recognize this 
rename?

Anders

(* I say “sometimes” because -B only detects the rewrite if it deems the 
renamed file to be sufficiently different than the overwritten file.  If 
you use 190 and 390 instead of 200 and 400 above, the rewrite and rename 
will not be detected, even though the rename would be detected if it was 
not an overwrite.  This also feels like a bug.)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bugs with detection of renames that are also overwrites
  2010-02-13 23:46 Bugs with detection of renames that are also overwrites Anders Kaseorg
@ 2010-02-23 12:22 ` Jeff King
  2010-02-23 12:53   ` Jon Seymour
  2010-02-23 19:08   ` Anders Kaseorg
  0 siblings, 2 replies; 5+ messages in thread
From: Jeff King @ 2010-02-23 12:22 UTC (permalink / raw)
  To: Anders Kaseorg; +Cc: git

On Sat, Feb 13, 2010 at 06:46:55PM -0500, Anders Kaseorg wrote:

> One can create a rename that also overwrites an existing file, for example 
> with ‘git mv -f’:
> 
> $ git init
> $ seq 100 200 > a; seq 300 400 > b; git add a b
> $ git commit -m foo; git tag foo
> $ git mv -f a b
> $ git commit -m bar; git tag bar
> 
> Git does not ordinarily detect this as a rename, even with -M.

Right. Git looks at only a subset of the files when looking for renames.
For straight renames, the set of possible sources is the list of deleted
files, and the possible destinations are the new files.

Finding copies ("-C") is similar, except that we also consider files
that were modified but not deleted. And --find-copies-harder will look
at _all_ files as sources, not just modified ones.

But what you are asking for is to expand the possible destination list
to include files that were modified but are not new. I don't think there
is currently a way to do that explicitly.

As you discovered, though, if either the source or destination file has
changed significantly, we should break them apart using "-B":

> But it can (sometimes*) be forced to detect the rename with -M -B.  
> 
> $ git diff --stat -M -B foo bar
>  a => b |    0
>  1 files changed, 0 insertions(+), 0 deletions(-)

In which case the rename engine sees the deletion and addition
separately (they just happen to have the same path name), and therefore
the path gets added to the source and destination lists.

> However, the resulting patch incorrectly omits the deletion of the 
> overwritten file!
> 
> $ git diff -M -B foo bar
> diff --git a/a b/b
> similarity index 100%
> rename from a
> rename to b
>
> Therefore, the patch can’t be applied to its own source tree.
> 
> $ git checkout foo
> $ git diff -M -B foo bar | git apply
> error: b: already exists in working directory

Hmm. I think this is a conflict between what the user wants to see and
what apply wants to see. From the user's perspective, thinking about the
diff representing a change, "b" was not really deleted. It was simply
overwritten. But from apply's perspective, the diff is a set of
instructions, and those instructions do not mention that "b" previously
existed and was overwritten. So it is right to be cautious and declare a
conflict.

I'm not sure just throwing a "delete" line in there would be the best
thing, though. The instructions for apply really need to be "if 'b' has
this sha1, then it is safe to delete and rename a into place. Otherwise
it is a conflict". And I'm not sure how we would represent that (I guess
with a full diff and an "index" line).

> Also, a question: is it possible to get ‘git merge’ to recognize this 
> rename?

No, I don't think there is a way currently. You would need to patch git
to set opts.break_opt in merge-recursive.c:get_renames, I think.

> (* I say “sometimes” because -B only detects the rewrite if it deems the 
> renamed file to be sufficiently different than the overwritten file.  If 
> you use 190 and 390 instead of 200 and 400 above, the rewrite and rename 
> will not be detected, even though the rename would be detected if it was 
> not an overwrite.  This also feels like a bug.)

I think you are getting into a philosophical discussion of what is a
rename, here. If "a" and "b" are very similar, "a" is removed, and "b"
has the same (or similar) content as "a", was it a rename from "a", or
was it simply that "b" was changed, possibly to incorporate the
duplicated items in "a"?

So I don't think it is a bug. But that isn't to say you can't come up
with a case where it would be nice, during a diff or a merge, to show
things that way. I mentioned at the beginning of the message that what
you wanted was to expand the list of destination possibilities. You
could have a "--find-overwrites" which puts all of the modified files
into the destination list. You would not want it on by default, though,
I think, as it would add a lot of computation time to find a somewhat
rare case.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bugs with detection of renames that are also overwrites
  2010-02-23 12:22 ` Jeff King
@ 2010-02-23 12:53   ` Jon Seymour
  2010-02-23 12:59     ` Jeff King
  2010-02-23 19:08   ` Anders Kaseorg
  1 sibling, 1 reply; 5+ messages in thread
From: Jon Seymour @ 2010-02-23 12:53 UTC (permalink / raw)
  To: Jeff King; +Cc: Anders Kaseorg, git

I came across a similar case recently while testing my rebasing ideas
which lead me to wonder - is there a way to suppress the rename and
overwrite detection? For my purposes, I don't care if the diff is
bulky, I just want it to apply reliably.

jon

On Tue, Feb 23, 2010 at 11:22 PM, Jeff King <peff@peff.net> wrote:
> On Sat, Feb 13, 2010 at 06:46:55PM -0500, Anders Kaseorg wrote:
>
>> One can create a rename that also overwrites an existing file, for example
>> with ‘git mv -f’:
>>
>> $ git init
>> $ seq 100 200 > a; seq 300 400 > b; git add a b
>> $ git commit -m foo; git tag foo
>> $ git mv -f a b
>> $ git commit -m bar; git tag bar
>>
>> Git does not ordinarily detect this as a rename, even with -M.
>
> Right. Git looks at only a subset of the files when looking for renames.
> For straight renames, the set of possible sources is the list of deleted
> files, and the possible destinations are the new files.
>
> Finding copies ("-C") is similar, except that we also consider files
> that were modified but not deleted. And --find-copies-harder will look
> at _all_ files as sources, not just modified ones.
>
> But what you are asking for is to expand the possible destination list
> to include files that were modified but are not new. I don't think there
> is currently a way to do that explicitly.
>
> As you discovered, though, if either the source or destination file has
> changed significantly, we should break them apart using "-B":
>
>> But it can (sometimes*) be forced to detect the rename with -M -B.
>>
>> $ git diff --stat -M -B foo bar
>>  a => b |    0
>>  1 files changed, 0 insertions(+), 0 deletions(-)
>
> In which case the rename engine sees the deletion and addition
> separately (they just happen to have the same path name), and therefore
> the path gets added to the source and destination lists.
>
>> However, the resulting patch incorrectly omits the deletion of the
>> overwritten file!
>>
>> $ git diff -M -B foo bar
>> diff --git a/a b/b
>> similarity index 100%
>> rename from a
>> rename to b
>>
>> Therefore, the patch can’t be applied to its own source tree.
>>
>> $ git checkout foo
>> $ git diff -M -B foo bar | git apply
>> error: b: already exists in working directory
>
> Hmm. I think this is a conflict between what the user wants to see and
> what apply wants to see. From the user's perspective, thinking about the
> diff representing a change, "b" was not really deleted. It was simply
> overwritten. But from apply's perspective, the diff is a set of
> instructions, and those instructions do not mention that "b" previously
> existed and was overwritten. So it is right to be cautious and declare a
> conflict.
>
> I'm not sure just throwing a "delete" line in there would be the best
> thing, though. The instructions for apply really need to be "if 'b' has
> this sha1, then it is safe to delete and rename a into place. Otherwise
> it is a conflict". And I'm not sure how we would represent that (I guess
> with a full diff and an "index" line).
>
>> Also, a question: is it possible to get ‘git merge’ to recognize this
>> rename?
>
> No, I don't think there is a way currently. You would need to patch git
> to set opts.break_opt in merge-recursive.c:get_renames, I think.
>
>> (* I say “sometimes” because -B only detects the rewrite if it deems the
>> renamed file to be sufficiently different than the overwritten file.  If
>> you use 190 and 390 instead of 200 and 400 above, the rewrite and rename
>> will not be detected, even though the rename would be detected if it was
>> not an overwrite.  This also feels like a bug.)
>
> I think you are getting into a philosophical discussion of what is a
> rename, here. If "a" and "b" are very similar, "a" is removed, and "b"
> has the same (or similar) content as "a", was it a rename from "a", or
> was it simply that "b" was changed, possibly to incorporate the
> duplicated items in "a"?
>
> So I don't think it is a bug. But that isn't to say you can't come up
> with a case where it would be nice, during a diff or a merge, to show
> things that way. I mentioned at the beginning of the message that what
> you wanted was to expand the list of destination possibilities. You
> could have a "--find-overwrites" which puts all of the modified files
> into the destination list. You would not want it on by default, though,
> I think, as it would add a lot of computation time to find a somewhat
> rare case.
>
> -Peff
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bugs with detection of renames that are also overwrites
  2010-02-23 12:53   ` Jon Seymour
@ 2010-02-23 12:59     ` Jeff King
  0 siblings, 0 replies; 5+ messages in thread
From: Jeff King @ 2010-02-23 12:59 UTC (permalink / raw)
  To: Jon Seymour; +Cc: Anders Kaseorg, git

On Tue, Feb 23, 2010 at 11:53:13PM +1100, Jon Seymour wrote:

> I came across a similar case recently while testing my rebasing ideas
> which lead me to wonder - is there a way to suppress the rename and
> overwrite detection? For my purposes, I don't care if the diff is
> bulky, I just want it to apply reliably.

For diff generation, it should be off by default (but you may have
configured diff.renames, in which case you can pass --no-renames to
suppress it).

For the merge-recursive driver, it is on by default. You can hack it off
by setting merge.renamelimit to something small like '1' (but don't use
'0', which means "no limit").  Rebase uses "git am -3" internally, which
in turn invokes merge-recursive.

-Peff

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Bugs with detection of renames that are also overwrites
  2010-02-23 12:22 ` Jeff King
  2010-02-23 12:53   ` Jon Seymour
@ 2010-02-23 19:08   ` Anders Kaseorg
  1 sibling, 0 replies; 5+ messages in thread
From: Anders Kaseorg @ 2010-02-23 19:08 UTC (permalink / raw)
  To: Jeff King; +Cc: git

On Tue, 23 Feb 2010, Jeff King wrote:
> > Therefore, the patch can’t be applied to its own source tree.
> > 
> > $ git checkout foo
> > $ git diff -M -B foo bar | git apply
> > error: b: already exists in working directory
> 
> Hmm. I think this is a conflict between what the user wants to see and
> what apply wants to see. From the user's perspective, thinking about the
> diff representing a change, "b" was not really deleted. It was simply
> overwritten. But from apply's perspective, the diff is a set of
> instructions, and those instructions do not mention that "b" previously
> existed and was overwritten. So it is right to be cautious and declare a
> conflict.

I agree; git apply has no choice given that input.

> I'm not sure just throwing a "delete" line in there would be the best
> thing, though. The instructions for apply really need to be "if 'b' has
> this sha1, then it is safe to delete and rename a into place. Otherwise
> it is a conflict". And I'm not sure how we would represent that (I guess
> with a full diff and an "index" line).

Yeah, a full diff is the only sane solution, just like it is for plain 
deletes.  Otherwise the patch could not be reverse-applied.  (If the user 
really doesn’t want to see the deletion, they can use --diff-filter.)

diff --git a/b b/b
deleted file mode 100644
index 3d47ea7..0000000
--- a/b
+++ /dev/null
@@ -1,101 +0,0 @@
-300
-301
-399
-400
diff --git a/a b/b
similarity index 100%
rename from a
rename to b

This patch is already handled correctly by git apply (and git apply -R), 
as long as the rename is listed after the delete.

Anders

^ permalink raw reply related	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2010-02-23 19:08 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-02-13 23:46 Bugs with detection of renames that are also overwrites Anders Kaseorg
2010-02-23 12:22 ` Jeff King
2010-02-23 12:53   ` Jon Seymour
2010-02-23 12:59     ` Jeff King
2010-02-23 19:08   ` Anders Kaseorg

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.