git.vger.kernel.org archive mirror
* git-diff-tree rename detection for single file
From: David Ho @ 2005-10-18 19:56 UTC
  To: git

Hi,

I have a small suggestion to make the diff of a renamed file a bit
more meaningful.  I have a file that is renamed-edited and committed.
git-diff-tree -M -p <commit> shows one result and git-diff-tree -M -p
<commit> <filename> shows another.  If they both show a rename
occurred then I think the single file git-diff-tree will be more
useful.

Another strange idea I have is to make git-diff-tree traverse the list of
commits (assuming the list is returned from git-rev-list) to trace
renames of the file to its origin.  Of course I don't know how useful
this is to most.

David

[davidho@penguin git-tutorial]$ git-diff-tree -r -M -p \
8c77fe87790276b4e0b2650d7c5799eb893ac3ed
8c77fe87790276b4e0b2650d7c5799eb893ac3ed
diff --git a/goodbye b/ciao
similarity index 83%
rename from goodbye
rename to ciao
index 0561cce..d8259f5 100644
--- a/goodbye
+++ b/ciao
@@ -4,3 +4,4 @@ Play, play, play
 Work, work, work
 Eat, eat, eat
 Drink, drink, drink
+Chew, chew, chew


[davidho@penguin git-tutorial]$ git-diff-tree -r -M -p \
8c77fe87790276b4e0b2650d7c5799eb893ac3ed ciao
8c77fe87790276b4e0b2650d7c5799eb893ac3ed
diff --git a/ciao b/ciao
new file mode 100644
index 0000000..d8259f5
--- /dev/null
+++ b/ciao
@@ -0,0 +1,7 @@
+Hello World
+It's a new day for git
+Play, play, play
+Work, work, work
+Eat, eat, eat
+Drink, drink, drink
+Chew, chew, chew

* Re: git-diff-tree rename detection for single file
From: Junio C Hamano @ 2005-10-18 20:50 UTC
  To: David Ho; +Cc: git

David Ho <davidkwho@gmail.com> writes:

> I have a small suggestion to make the diff of a renamed file a bit
> more meaningful.  I have a file that is renamed-edited and committed.
> git-diff-tree -M -p <commit> shows one result and git-diff-tree -M -p
> <commit> <filename> shows another.  If they both show a rename
> occurred then I think the single file git-diff-tree will be more
> useful.

Sorry, this was vetoed by Linus a long time ago.  The <filename>
restricts the paths being passed to the diff machinery upfront,
so once you say <filename>, the rename detection will see only
that path, with nothing else to compare it against to guess
which other file it is a copy of.
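
To reproduce the difference with a current git, something like the
following sketch may help.  The scratch repository, the identity
settings and the variable names are made up for illustration; the file
names follow the example earlier in the thread, and the spelling is
"git diff-tree" rather than "git-diff-tree".

```shell
#!/bin/sh
# Sketch only: builds a throwaway repository (hypothetical names throughout).
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'Hello World\nPlay, play, play\nWork, work, work\nEat, eat, eat\nDrink, drink, drink\n' > goodbye
git add goodbye
git commit -q -m "add goodbye"

# Rename and edit in the same commit, as in the original report.
git mv goodbye ciao
echo 'Chew, chew, chew' >> ciao
git commit -q -a -m "rename goodbye to ciao and edit"

# Whole tree: both paths reach rename detection, so a rename is reported.
full=$(git diff-tree -r -M --name-status --no-commit-id HEAD)

# Pathspec given: only "ciao" reaches the diff machinery, so it is
# reported as a newly added file.
limited=$(git diff-tree -r -M --name-status --no-commit-id HEAD -- ciao)

echo "full:    $full"
echo "limited: $limited"
```

The first result should be a rename record from goodbye to ciao, the
second a plain addition of ciao, which is the behaviour described above.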

* Re: git-diff-tree rename detection for single file
From: Junio C Hamano @ 2005-10-19  2:45 UTC
  To: David Ho; +Cc: git, Linus Torvalds

Junio C Hamano <junkio@cox.net> writes:

> David Ho <davidkwho@gmail.com> writes:
>
>> I have a small suggestion to make the diff of a renamed file a bit
>> more meaningful.  I have a file that is renamed-edited and committed.
>> git-diff-tree -M -p <commit> shows one result and git-diff-tree -M -p
>> <commit> <filename> shows another.  If they both show a rename
>> occurred then I think the single file git-diff-tree will be more
>> useful.
>
> Sorry, this was vetoed by Linus a long time ago.  The <filename>
> restricts the paths being passed to the diff machinery upfront,
> so once you say <filename>, the rename detection will see only
> that path and nothing else to compare and guess which other file
> that file in question is a copy of.

Having said that, I think we *could* introduce a new flag to
git-diff-* brothers, --late-pathspec, that makes them apply the
paths restriction on the output side instead.  For obvious
reasons, using this flag would not make any sense unless you are
using one of -M, -C, or --pickaxe-all.

A related thing I have long longed for is a rename following
"git-diff-tree --stdin".

    git-rev-list HEAD | git-diff-tree --stdin -M git-commit.sh

This command line, as everybody hopefully knows, is how "git
whatchanged" is implemented internally.  If git-diff-tree were
taught to follow the rename history, when it hits the boundary
that git-commit-script was renamed to git-commit.sh, it could
start acting as if the pathspec given were git-commit-script
from that point.  To see that rename it needs --late-pathspec;
the current pathspec filters the input so the above command line
would not even care what git-commit-script looked like when the
rename happened.  It would just tell you that git-commit.sh
appeared from nowhere.

If implemented naively, this rename-following would have funny
interactions when it hits a merge commit, so it is probably
harder than it sounds, but this would be a good way to do
annotate as well.
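
A naive version of that rename-following can be sketched outside of
diff-tree with a current git.  This is only an illustration: the
follow() helper, the scratch repository and the file names old-name
and new-name are invented here, the whole tree is diffed for every
commit, and merge commits are not handled at all.

```shell
#!/bin/sh
# Sketch only: throwaway repository with hypothetical file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

# History: create old-name, edit it, then rename it to new-name.
printf 'one\ntwo\nthree\nfour\nfive\n' > old-name
git add old-name
git commit -q -m "create old-name"
echo six >> old-name
git commit -q -a -m "edit old-name"
git mv old-name new-name
git commit -q -m "rename to new-name"

# Walk history newest-first.  For each commit, diff the whole tree with -M
# and keep only the record whose destination is the path being tracked.
# When that record is a rename, carry on with the source path.
follow() {
    path=$1
    for c in $(git rev-list HEAD); do
        rec=$(git diff-tree -r -M --root --name-status --no-commit-id "$c" |
              awk -F '\t' -v p="$path" '$NF == p')
        [ -n "$rec" ] || continue
        printf '%s %s\n' "$(git log -1 --format=%s "$c")" "$rec" | tr '\t' ' '
        case $rec in
            R*) path=$(printf '%s\n' "$rec" | cut -f2) ;;
        esac
    done
}

history=$(follow new-name)
printf '%s\n' "$history"
```

Filtering on the output side is what lets the loop see the rename
record at all; with the pathspec on the diff-tree command line the
rename commit would only show new-name being added.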

* Re: git-diff-tree rename detection for single file
From: Linus Torvalds @ 2005-10-19  3:12 UTC
  To: Junio C Hamano; +Cc: David Ho, git



On Tue, 18 Oct 2005, Junio C Hamano wrote:
> 
> Having said that, I think we *could* introduce a new flag to
> git-diff-* brothers, --late-pathspec

Gaah. Why? It's really not possible to do it efficiently inside 
git-diff-xyz, so any implementation would amount to something you can 
already do with some trivial scripting, basically:

	git-diff-tree -r -M | grep pathnamelist | git-diff-helper

Now, several reasons why it's much better to do this kind of 
"--late-pathspec" at a higher level (instead of inside the git-diff-xyz 
family):

 (a) git-diff-xyz is already some of the more complex core parts. It's not 
     likely a good idea to make them any more complex, unless there's some 
     very fundamental reason for it.

 (b) without pathname limits, git-diff-tree is very slow. Well, it's 
     actually very fast compared to something braindead like CVS, but if 
     you want to track a single file over a thousand releases, it's MUCH 
     MUCH faster to do the pathname limit at the beginning. Otherwise 
     you'll spend all your time reading and comparing big trees with tens 
     of thousands of entries.

 (c) with a higher-level thing, what you can do is have a TWO-phase thing: 
     use the fast pathname limiter in git-diff-tree to figure out when 
     that file changes in history, and then _only_ for those commits do 
     you go back and then do the much more expensive "git-diff-tree -r -M" 
     followed by the pathname-limiting post-processing.

See what I'm saying? You really can do the post-processing outside of 
git-diff-tree, and you will in fact be much better off if you do so.

The performance impact of pruning the pathnames _before_ diffing them was 
absolutely staggering. You couldn't reasonably do a "git-whatchanged -p" 
on the kernel for a single file if you didn't do it the way we do it now.
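
The two-phase idea in (c) might look roughly like this with a current
git.  It is a sketch under assumptions: phase 1 uses "git rev-list"
with a pathspec as the cheap limiter (a stand-in for the pathname
limiting described above), the awk filter plays the part of the grep
step, and the scratch repository and file names are hypothetical.

```shell
#!/bin/sh
# Sketch only: throwaway repository with hypothetical file names.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.invalid"

printf 'one\ntwo\nthree\nfour\nfive\n' > old-name
git add old-name
git commit -q -m "create old-name"
echo unrelated > other
git add other
git commit -q -m "touch an unrelated file"
git mv old-name new-name
git commit -q -m "rename to new-name"
echo six >> new-name
git commit -q -a -m "edit new-name"

path=new-name

# Phase 1: cheap.  The pathspec prunes the comparison up front, so only
# commits in which $path itself changed are listed.
candidates=$(git rev-list HEAD -- "$path")

# Phase 2: expensive, but only for those candidates.  Diff the whole tree
# with rename detection, then limit the output to $path afterwards.
report=$(for c in $candidates; do
    git diff-tree -r -M --root --name-status --no-commit-id "$c" |
        awk -F '\t' -v p="$path" '$NF == p'
done)

printf '%s\n' "$report"
```

Only the two commits that touch new-name get the full-tree diff; the
commit that creates old-name and the unrelated one are never examined
in phase 2.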

			Linus

* Re: git-diff-tree rename detection for single file
From: Junio C Hamano @ 2005-10-19  5:20 UTC
  To: Linus Torvalds; +Cc: David Ho, git

Linus Torvalds <torvalds@osdl.org> writes:

> See what I'm saying?
> ...
> The performance impact of pruning the pathnames _before_ diffing them was 
> absolutely staggering. You couldn't reasonably do a "git-whatchanged -p" 

Yes, I understood and agreed with that logic on May 27th.  That's
why the message you are responding to said we could add a new flag,
to give the user the choice of accepting a full-tree scan.

The "diff-tree --stdin that follows rename history" example
might want to have some change in either the core side of diff,
or in the rev-list.  It may help to have both.  I still haven't
thought through the issues.

BTW, I really liked your example that piped multiple diff-trees
together.  That is a neat trick.

* Re: git-diff-tree rename detection for single file
From: Nicolas Pitre @ 2005-10-19 16:04 UTC
  To: Junio C Hamano; +Cc: Linus Torvalds, David Ho, git

On Tue, 18 Oct 2005, Junio C Hamano wrote:

> BTW, I really liked your example that piped multiple diff-trees
> together.  That is a neat trick.

Yeah.  Those should really be picked up to augment the documentation 
section.


Nicolas
