All of lore.kernel.org
 help / color / mirror / Atom feed
* git log -M -- filename is not working?
@ 2010-05-07 18:07 Eugene Sajine
  2010-05-07 18:10 ` Jacob Helwig
  0 siblings, 1 reply; 26+ messages in thread
From: Eugene Sajine @ 2010-05-07 18:07 UTC (permalink / raw)
  To: git; +Cc: Eugene Sajine

Hi!

I tried it in both git 1.6.5.6 and git 1.7.0.5

The file has been moved:

From
Src/folder1/folder2/File.java

To src/main/java/folder1/folder2/File.java

I don’t think it is relevant but just in case: The author has
committed the new file first, then the commit was amended with the
deletion of the source file.
So we ended up with the rename correctly detected.

Now
$ git log –M -- src/main/java/folder1/folder2/File.java

Shows only one last commit and doesn’t show the history when the file
was in source folder. Same thing happens with

$ git log –C  -- src/main/java/folder1/folder2/File.java

What gives?

Thanks,
Eugene

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git log -M -- filename is not working?
  2010-05-07 18:07 git log -M -- filename is not working? Eugene Sajine
@ 2010-05-07 18:10 ` Jacob Helwig
  2010-05-07 18:31   ` Eugene Sajine
  0 siblings, 1 reply; 26+ messages in thread
From: Jacob Helwig @ 2010-05-07 18:10 UTC (permalink / raw)
  To: Eugene Sajine; +Cc: git

On Fri, May 7, 2010 at 11:07, Eugene Sajine <euguess@gmail.com> wrote:
> Hi!
>
> I tried it in both git 1.6.5.6 and git 1.7.0.5
>
> The file has been moved:
>
> From
> Src/folder1/folder2/File.java
>
> To src/main/java/folder1/folder2/File.java
>
> I don’t think it is relevant but just in case: The author has
> committed the new file first, then the commit was amended with the
> deletion of the source file.
> So we ended up with the rename correctly detected.
>
> Now
> $ git log –M -- src/main/java/folder1/folder2/File.java
>
> Shows only one last commit and doesn’t show the history when the file
> was in source folder. Same thing happens with
>
> $ git log –C  -- src/main/java/folder1/folder2/File.java
>
> What gives?
>
> Thanks,
> Eugene
>

You want the --follow flag, too.

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git log -M -- filename is not working?
  2010-05-07 18:10 ` Jacob Helwig
@ 2010-05-07 18:31   ` Eugene Sajine
  2010-05-07 18:37     ` Eli Barzilay
  0 siblings, 1 reply; 26+ messages in thread
From: Eugene Sajine @ 2010-05-07 18:31 UTC (permalink / raw)
  To: Jacob Helwig; +Cc: git

On Fri, May 7, 2010 at 2:10 PM, Jacob Helwig <jacob.helwig@gmail.com> wrote:
> On Fri, May 7, 2010 at 11:07, Eugene Sajine <euguess@gmail.com> wrote:
>> Hi!
>>
>> I tried it in both git 1.6.5.6 and git 1.7.0.5
>>
>> The file has been moved:
>>
>> From
>> Src/folder1/folder2/File.java
>>
>> To src/main/java/folder1/folder2/File.java
>>
>> I don’t think it is relevant but just in case: The author has
>> committed the new file first, then the commit was amended with the
>> deletion of the source file.
>> So we ended up with the rename correctly detected.
>>
>> Now
>> $ git log –M -- src/main/java/folder1/folder2/File.java
>>
>> Shows only one last commit and doesn’t show the history when the file
>> was in source folder. Same thing happens with
>>
>> $ git log –C  -- src/main/java/folder1/folder2/File.java
>>
>> What gives?
>>
>> Thanks,
>> Eugene
>>
>
> You want the --follow flag, too.
>

Thanks! I missed this guy. By the way it seems that --follow flag
works without -M or -C.

Are those deprecated or I'm missing the difference between three of them??

Thanks,
Eugene

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git log -M -- filename is not working?
  2010-05-07 18:31   ` Eugene Sajine
@ 2010-05-07 18:37     ` Eli Barzilay
  2010-05-07 20:25       ` Matthieu Moy
                         ` (2 more replies)
  0 siblings, 3 replies; 26+ messages in thread
From: Eli Barzilay @ 2010-05-07 18:37 UTC (permalink / raw)
  To: Eugene Sajine; +Cc: Jacob Helwig, git

On May  7, Eugene Sajine wrote:
> On Fri, May 7, 2010 at 2:10 PM, Jacob Helwig <jacob.helwig@gmail.com> wrote:
> > You want the --follow flag, too.
> 
> Thanks! I missed this guy. By the way it seems that --follow flag
> works without -M or -C.
> 
> Are those deprecated or I'm missing the difference between three of
> them??

BTW, I've had at least 4 people now who got confused by this.  Is
there any use for -M/-C without --follow?  In any case, it will be
very helpful if the -M/-C descriptions said "see also --follow".

Also, is there a way to set this as the default for `git log'?

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-07 18:37     ` Eli Barzilay
@ 2010-05-07 20:25       ` Matthieu Moy
  2010-05-07 20:28       ` Jakub Narebski
  2010-05-08  4:44       ` Jeff King
  2 siblings, 0 replies; 26+ messages in thread
From: Matthieu Moy @ 2010-05-07 20:25 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Eugene Sajine, Jacob Helwig, git

Eli Barzilay <eli@barzilay.org> writes:

> On May  7, Eugene Sajine wrote:
>> On Fri, May 7, 2010 at 2:10 PM, Jacob Helwig <jacob.helwig@gmail.com> wrote:
>> > You want the --follow flag, too.
>> 
>> Thanks! I missed this guy. By the way it seems that --follow flag
>> works without -M or -C.
>> 
>> Are those deprecated or I'm missing the difference between three of
>> them??
>
> BTW, I've had at least 4 people now who got confused by this.  Is
> there any use for -M/-C without --follow?  In any case, it will be
> very helpful if the -M/-C descriptions said "see also --follow".
>
> Also, is there a way to set this as the default for `git log'?

I guess -M and -C are here because log accepts all options that diff
accepts, in case you run "git log -p". Then, -M and -C control what
diff is displayed to the user.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: git log -M -- filename is not working?
  2010-05-07 18:37     ` Eli Barzilay
  2010-05-07 20:25       ` Matthieu Moy
@ 2010-05-07 20:28       ` Jakub Narebski
  2010-05-08  4:44       ` Jeff King
  2 siblings, 0 replies; 26+ messages in thread
From: Jakub Narebski @ 2010-05-07 20:28 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Eugene Sajine, Jacob Helwig, git

Eli Barzilay <eli@barzilay.org> writes:

> On May  7, Eugene Sajine wrote:
> > On Fri, May 7, 2010 at 2:10 PM, Jacob Helwig <jacob.helwig@gmail.com> wrote:
> > > You want the --follow flag, too.
> > 
> > Thanks! I missed this guy. By the way it seems that --follow flag
> > works without -M or -C.
> > 
> > Are those deprecated or I'm missing the difference between three of
> > them??
> 
> BTW, I've had at least 4 people now who got confused by this.  Is
> there any use for -M/-C without --follow?  In any case, it will be
> very helpful if the -M/-C descriptions said "see also --follow".

Yes, '-M/-C' is useful with "git diff" _without_ pathspec, including
e.g. "git show -C".

The problem with "git log -M -- <filename>" is that history simplification,
which is required for good performance, happens before diff mechanism has a
chance to perform rename detection.  Before there was '--follow' option to
git-log (which supports only the case of single file, and doesn't work that
well with more complicated history), you were forced to do

  $ git log -M -- <filename> <oldname> <oldername>...

> 
> Also, is there a way to set this as the default for `git log'?

I don't think so.  Note also that '--follow' works only with single file,
and does not work for example (currently) with directory pathspec.

-- 
Jakub Narebski
Poland
ShadeHawk on #git

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-07 18:37     ` Eli Barzilay
  2010-05-07 20:25       ` Matthieu Moy
  2010-05-07 20:28       ` Jakub Narebski
@ 2010-05-08  4:44       ` Jeff King
  2010-05-08  5:12         ` Eli Barzilay
  2 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2010-05-08  4:44 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Junio C Hamano, Eugene Sajine, Jacob Helwig, git

On Fri, May 07, 2010 at 02:37:09PM -0400, Eli Barzilay wrote:

> BTW, I've had at least 4 people now who got confused by this.  Is
> there any use for -M/-C without --follow?  In any case, it will be
> very helpful if the -M/-C descriptions said "see also --follow".

Yes, it detects renames when doing diffs.

Documentation patch is below.

> Also, is there a way to set this as the default for `git log'?

If you mean --follow, then no. Nor would you probably want to, because
the --follow mechanism (as currently implemented) is pretty restricted.
It can only take a single path currently.

-- >8 --
Subject: [PATCH] docs: clarify meaning of -M for git-log

As an option to the "diff" family, it is fairly obvious what
"detect renames" means. However, for revision traversal, the
"-M" option is just included in the long list of options,
with no indication that it is about showing renames in diffs
versus following renames. Let's make it more explicit.

Signed-off-by: Jeff King <peff@peff.net>
---
The only other revision traversal manpage that includes diff-options
seems to be format-patch. Should it get the same treatment?

 Documentation/diff-options.txt |    7 +++++++
 1 files changed, 7 insertions(+), 0 deletions(-)

diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index c9c6c2b..3070ddd 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -177,7 +177,14 @@ endif::git-format-patch[]
 	Break complete rewrite changes into pairs of delete and create.
 
 -M::
+ifndef::git-log[]
 	Detect renames.
+endif::git-log[]
+ifdef::git-log[]
+	If generating diffs, detect and report renames for each commit.
+	For following files across renames while traversing history, see
+	`--follow`.
+endif::git-log[]
 
 -C::
 	Detect copies as well as renames.  See also `--find-copies-harder`.
-- 
1.7.1.176.gcff095.dirty

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-08  4:44       ` Jeff King
@ 2010-05-08  5:12         ` Eli Barzilay
  2010-05-08  5:30           ` Jeff King
  2010-05-08  5:39           ` Junio C Hamano
  0 siblings, 2 replies; 26+ messages in thread
From: Eli Barzilay @ 2010-05-08  5:12 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Eugene Sajine, Jacob Helwig, git

On May  8, Jeff King wrote:
> On Fri, May 07, 2010 at 02:37:09PM -0400, Eli Barzilay wrote:
> 
> > BTW, I've had at least 4 people now who got confused by this.  Is
> > there any use for -M/-C without --follow?  In any case, it will be
> > very helpful if the -M/-C descriptions said "see also --follow".
> 
> Yes, it detects renames when doing diffs.

OK, so just to clear this up: -C and -M (and --find-copies-harder) are
for `diff', and --follow is for `log' with a single file (and each
will pass it on to the other)?


> Documentation patch is below.

Thanks!  (It would also be nice to mention it in -C, but not critical
since it's right after -M.)


> > Also, is there a way to set this as the default for `git log'?
> 
> If you mean --follow, then no. Nor would you probably want to,
> because the --follow mechanism (as currently implemented) is pretty
> restricted.  It can only take a single path currently.

Well, the "algorithm" I used was probably one that is very popular:

* use `git log some-file' with something that got renamed recently
* be horrified that all history is gone
* remember something vague about git detecting renames => go look at
  the man page
* Find -M, add it, try it, still doesn't work
* Go back to scanning the man page, repeat
* At the end I end up with:
    -C -M --find-copies-harder --follow

So if there was some single

  --do-whatever-you-can-as-much-as-you-can-to-find-all-renames

option that would just turn everything on for all commands (eg, -M and
-C and --find-copies harder for diffs, --follow for log for a single
file, and in some future when it's implemented, --follow for multiple
files and directories), I'd happily jump on that.  For any command
that can accept it.  If there was a way to make it the default, I'd
love to turn it on.

Even with the chain of more flags with descriptions that sound like
they're trying to scare me away by promising that my machine will work
for a REALLY LONG TIME, I'd still want to turn it on -- if it got
something slower I sure didn't notice it so far, and that's on a real
repository which is not that small (but with git's reputation I won't
be surprised if "slower" means that I had to way a whole extra 20ms
for an answer...).  If something would really take too long, as in me
sitting any waiting for an answer, *then* I can try to remove that and
see if I ran into some of the horrible edge cases...

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-08  5:12         ` Eli Barzilay
@ 2010-05-08  5:30           ` Jeff King
  2010-05-08  5:39           ` Junio C Hamano
  1 sibling, 0 replies; 26+ messages in thread
From: Jeff King @ 2010-05-08  5:30 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Junio C Hamano, Eugene Sajine, Jacob Helwig, git

On Sat, May 08, 2010 at 01:12:58AM -0400, Eli Barzilay wrote:

> > > BTW, I've had at least 4 people now who got confused by this.  Is
> > > there any use for -M/-C without --follow?  In any case, it will be
> > > very helpful if the -M/-C descriptions said "see also --follow".
> > 
> > Yes, it detects renames when doing diffs.
> 
> OK, so just to clear this up: -C and -M (and --find-copies-harder) are
> for `diff', and --follow is for `log' with a single file (and each
> will pass it on to the other)?

Yes (well, diff can never pass --follow to log, since diff never invokes
log, but yes, the per-commit diff shown by log uses the -C and -M given
to log).

> > Documentation patch is below.
> 
> Thanks!  (It would also be nice to mention it in -C, but not critical
> since it's right after -M.)

Yeah, I considered that, but didn't because of the proximity. But maybe
it would make sense to do so, or to point -C at -M (e.g., say "like -M,
but detect copies as well as renames").

> Well, the "algorithm" I used was probably one that is very popular:
> 
> * use `git log some-file' with something that got renamed recently
> * be horrified that all history is gone
> * remember something vague about git detecting renames => go look at
>   the man page
> * Find -M, add it, try it, still doesn't work
> * Go back to scanning the man page, repeat
> * At the end I end up with:
>     -C -M --find-copies-harder --follow
> 
> So if there was some single
> 
>   --do-whatever-you-can-as-much-as-you-can-to-find-all-renames

But I think all you really wanted was "--follow". I'd have to check the
code, but I'm not even sure whether "-C" will impact --follow at all.

> Even with the chain of more flags with descriptions that sound like
> they're trying to scare me away by promising that my machine will work
> for a REALLY LONG TIME, I'd still want to turn it on -- if it got
> something slower I sure didn't notice it so far, and that's on a real
> repository which is not that small (but with git's reputation I won't
> be surprised if "slower" means that I had to way a whole extra 20ms
> for an answer...).  If something would really take too long, as in me
> sitting any waiting for an answer, *then* I can try to remove that and
> see if I ran into some of the horrible edge cases...

No, copy detection can be _really_ slow. There is a reason it isn't on
by default. Try "git log -1000 -p" versus "git log -1000 -p -C -M
--find-copies-harder" in some repository. In a simple git.git test, it
is almost 5x slower (about 1 second versus 5 seconds on my machine). For
large repositories, it can be much worse, because now each diff is
O(size of repository) instead of O(size of changes).

Still, I see your point that you might want it on all the time, if you
have a sufficiently small repo. There is "diff.renames" to turn on
rename detection all the time. But I think a log.follow option doesn't
make sense at this point. For example:

  $ git config log.follow true
  $ git log foo.c ;# ok, follow foo.c
  $ git log foo.c bar.c ;# uh oh, now what?

Does the last one just barf, and make you say "git log --no-follow foo.c
bar.c"? Does it quietly turn off --follow, making the user guess why
"git log foo.c" finds some history that "git log foo.c bar.c" doesn't?

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-08  5:12         ` Eli Barzilay
  2010-05-08  5:30           ` Jeff King
@ 2010-05-08  5:39           ` Junio C Hamano
  2010-05-08  7:08             ` Eli Barzilay
  1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2010-05-08  5:39 UTC (permalink / raw)
  To: Eli Barzilay
  Cc: Jeff King, Junio C Hamano, Eugene Sajine, Bo Yang, Jacob Helwig, git

Eli Barzilay <eli@barzilay.org> writes:

> So if there was some single
>
>   --do-whatever-you-can-as-much-as-you-can-to-find-all-renames

If I am not mistaken, that is exactly the point of Bo's patch:

    Message-Id: <1273207949-18500-4-git-send-email-struggleyb.nku@gmail.com>

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-08  5:39           ` Junio C Hamano
@ 2010-05-08  7:08             ` Eli Barzilay
  2010-05-12 11:38               ` Jeff King
  0 siblings, 1 reply; 26+ messages in thread
From: Eli Barzilay @ 2010-05-08  7:08 UTC (permalink / raw)
  To: Jeff King, Junio C Hamano; +Cc: Eugene Sajine, Bo Yang, Jacob Helwig, git

On May  7, Junio C Hamano wrote:
> Eli Barzilay <eli@barzilay.org> writes:
> 
> > So if there was some single
> >
> >   --do-whatever-you-can-as-much-as-you-can-to-find-all-renames
> 
> If I am not mistaken, that is exactly the point of Bo's patch:
> 
>     Message-Id: <1273207949-18500-4-git-send-email-struggleyb.nku@gmail.com>

(Actually, I was talking about some meta option that always turns on
whatever can be done to track copies; but given the clarification from
Jeff below, it's not as important as I first thought.)


On May  8, Jeff King wrote:
> On Sat, May 08, 2010 at 01:12:58AM -0400, Eli Barzilay wrote:
> 
> > > > BTW, I've had at least 4 people now who got confused by this.  Is
> > > > there any use for -M/-C without --follow?  In any case, it will be
> > > > very helpful if the -M/-C descriptions said "see also --follow".
> > > 
> > > Yes, it detects renames when doing diffs.
> > 
> > OK, so just to clear this up: -C and -M (and --find-copies-harder)
> > are for `diff', and --follow is for `log' with a single file (and
> > each will pass it on to the other)?
> 
> Yes (well, diff can never pass --follow to log, since diff never
> invokes log, but yes, the per-commit diff shown by log uses the -C
> and -M given to log).

Aha -- I now see what makes most of this confusion, and an easy way to
improve it considerably: looking at the `git-log' man page, I have to
wade through 11 pages of general diff options (well, 11 is so high
because I like big fonts in my emacs...) before I get to the log
options, and even worse, there is no clear indication that those
options are only relevant if I want to get patches, which is (at least
in my interactive use cases) far from being something I do often.
(Googling around, I see more than a few examples of confused uses of
`git log -M', even the git faq says "'git log -M' will give the commit
history with rename information" -- no mention of it not making any
sense with a file that went through a rename.)

So I think that it would really help if (1) the diff options in the
git-log man page move to after its own options, and (2) they appeared
after a title saying that these are the diff options, (3) `--follow'
moves up before the few preceding options that seem to me less
important.  To clarify, I added a simple patch to the end of this
message.  (`git-format-patch' has the same thing, but there it looks
more sensible to leave it as is.)


> > So if there was some single
> > 
> >   --do-whatever-you-can-as-much-as-you-can-to-find-all-renames
> 
> But I think all you really wanted was "--follow". I'd have to check the
> code, but I'm not even sure whether "-C" will impact --follow at all.

If -C/-M are really only affecting the patches, then yes, I'd probably
be happy with only `--follow' (or a way to have --follow on by default
when possible).


> No, copy detection can be _really_ slow. There is a reason it isn't
> on by default. Try "git log -1000 -p" versus "git log -1000 -p -C -M
> --find-copies-harder" in some repository. In a simple git.git test,
> it is almost 5x slower (about 1 second versus 5 seconds on my
> machine). For large repositories, it can be much worse, because now
> each diff is O(size of repository) instead of O(size of changes).

Yes, I see that.  I was talking about interactive uses, where I'd
never want to see a 1000 diffs, or even 2.  But given that -C/-M are
only for patches, then I agree with that.


> Still, I see your point that you might want it on all the time, if you
> have a sufficiently small repo. There is "diff.renames" to turn on
> rename detection all the time.

(Ah, I missed that...)


> But I think a log.follow option doesn't make sense at this
> point. For example:
> 
>   $ git config log.follow true
>   $ git log foo.c ;# ok, follow foo.c
>   $ git log foo.c bar.c ;# uh oh, now what?
> 
> Does the last one just barf, and make you say "git log --no-follow
> foo.c bar.c"? Does it quietly turn off --follow, making the user
> guess why "git log foo.c" finds some history that "git log foo.c
> bar.c" doesn't?

How about these options:

  git config log.follow if-single-file
    makes it use --follow only when there's a single file path given,
    ignoring it otherwise (with no confusion about it now)

  git config log.follow if-possible
    makes it do the same, but might also do it for more cases if/when
    they become available (so this is the "do the best you can"
    option)

  git config log.follow true
    invalid until it is always possible to use --follow

?



[this is the patch that I mentioned above]

-------------------------------------------------------------------------------
diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index fb184ba..6bc7064 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -24,7 +24,6 @@ OPTIONS
 -------
 
 :git-log: 1
-include::diff-options.txt[]
 
 -<n>::
 	Limits the number of commits to show.
@@ -37,6 +36,10 @@ include::diff-options.txt[]
 	and <until>, see "SPECIFYING REVISIONS" section in
 	linkgit:git-rev-parse[1].
 
+--follow::
+	Continue listing the history of a file beyond renames
+	(works only for a single file).
+
 --decorate[=short|full]::
 	Print out the ref names of any commits that are shown. If 'short' is
 	specified, the ref name prefixes 'refs/heads/', 'refs/tags/' and
@@ -55,9 +58,6 @@ include::diff-options.txt[]
 	the specified paths; this means that "<path>..." limits only
 	commits, and doesn't limit diff for those commits.
 
---follow::
-	Continue listing the history of a file beyond renames.
-
 --log-size::
 	Before the log message print out its size in bytes. Intended
 	mainly for porcelain tools consumption. If git is unable to
@@ -71,6 +71,10 @@ include::diff-options.txt[]
 	to be prefixed with "\-- " to separate them from options or
 	refnames.
 
+Common diff options
+~~~~~~~~~~~~~~~~~~~
+
+include::diff-options.txt[]
 
 include::rev-list-options.txt[]
 
-------------------------------------------------------------------------------


-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-08  7:08             ` Eli Barzilay
@ 2010-05-12 11:38               ` Jeff King
  2010-05-12 12:01                 ` Eli Barzilay
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2010-05-12 11:38 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Junio C Hamano, Eugene Sajine, Bo Yang, Jacob Helwig, git

On Sat, May 08, 2010 at 03:08:53AM -0400, Eli Barzilay wrote:

> So I think that it would really help if (1) the diff options in the
> git-log man page move to after its own options, and (2) they appeared
> after a title saying that these are the diff options, (3) `--follow'
> moves up before the few preceding options that seem to me less
> important.  To clarify, I added a simple patch to the end of this
> message.  (`git-format-patch' has the same thing, but there it looks
> more sensible to leave it as is.)

I have no opinion on moving --follow around, but I definitely agree that
more clearly marking the diff-options (and including them after
revision traversal options) is much better.

> How about these options:
> 
>   git config log.follow if-single-file
>     makes it use --follow only when there's a single file path given,
>     ignoring it otherwise (with no confusion about it now)
> 
>   git config log.follow if-possible
>     makes it do the same, but might also do it for more cases if/when
>     they become available (so this is the "do the best you can"
>     option)
> 
>   git config log.follow true
>     invalid until it is always possible to use --follow

I'm not thrilled with it. It still leaves the confusing behavior, but
because we forced the user to pick the confusing behavior, we get to
claim it is their fault. So while we may be guilt-free, I'm not happy
with an option that produces such confusing behavior.

I really wonder if it would be that hard to just fix the code to follow
several files. It shouldn't be conceptually different. When any file
matching the pathspec we're following has a creation event going from
the parent to the commit in question, then we would do rename detection
for that file (and any others created in the same commit). Any detected
sources get added to the pathspec.

> diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
> index fb184ba..6bc7064 100644
> --- a/Documentation/git-log.txt
> +++ b/Documentation/git-log.txt
> @@ -24,7 +24,6 @@ OPTIONS
>  -------
>  
>  :git-log: 1
> -include::diff-options.txt[]

This ":git-log: 1" magic should probably follow the include. It sets up
an attribute for diff-options.txt to conditionally include some
log-specific bits.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 11:38               ` Jeff King
@ 2010-05-12 12:01                 ` Eli Barzilay
  2010-05-12 12:49                   ` Jeff King
  2010-05-12 13:06                   ` Bo Yang
  0 siblings, 2 replies; 26+ messages in thread
From: Eli Barzilay @ 2010-05-12 12:01 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Eugene Sajine, Bo Yang, Jacob Helwig, git

On May 12, Jeff King wrote:
> On Sat, May 08, 2010 at 03:08:53AM -0400, Eli Barzilay wrote:
> 
> > So I think that it would really help if (1) the diff options in the
> > git-log man page move to after its own options, and (2) they appeared
> > after a title saying that these are the diff options, (3) `--follow'
> > moves up before the few preceding options that seem to me less
> > important.  To clarify, I added a simple patch to the end of this
> > message.  (`git-format-patch' has the same thing, but there it looks
> > more sensible to leave it as is.)
> 
> I have no opinion on moving --follow around, but I definitely agree that
> more clearly marking the diff-options (and including them after
> revision traversal options) is much better.

Should I send that reorganization as a proper patch then?


> > diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
> > index fb184ba..6bc7064 100644
> > --- a/Documentation/git-log.txt
> > +++ b/Documentation/git-log.txt
> > @@ -24,7 +24,6 @@ OPTIONS
> >  -------
> >  
> >  :git-log: 1
> > -include::diff-options.txt[]
> 
> This ":git-log: 1" magic should probably follow the include. It sets up
> an attribute for diff-options.txt to conditionally include some
> log-specific bits.

It seemed like it's a definition that could be used elsewhere too (eg,
in other includes that could be added in the future).


> > How about these options:
> > 
> >   git config log.follow if-single-file
> >     makes it use --follow only when there's a single file path given,
> >     ignoring it otherwise (with no confusion about it now)
> > 
> >   git config log.follow if-possible
> >     makes it do the same, but might also do it for more cases if/when
> >     they become available (so this is the "do the best you can"
> >     option)
> > 
> >   git config log.follow true
> >     invalid until it is always possible to use --follow
> 
> I'm not thrilled with it. It still leaves the confusing behavior,
> but because we forced the user to pick the confusing behavior, we
> get to claim it is their fault. So while we may be guilt-free, I'm
> not happy with an option that produces such confusing behavior.

Well, yes -- it leaves that potential source of confusion, which could
result if someone recommends to just use the `if-single-file' thing,
and I do that while being unaware of what it actually does.  But OTOH,
it helps with the apparently popular confusion that makes people panic
and send "WHERE DID MY HISTORY GO???" emails.  (It's been about 2-3
weeks since we migrated from svn to git, and I had at least three
people do exactly that.)


> I really wonder if it would be that hard to just fix the code to follow
> several files. [...]

That would obviously be a better solution...

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 12:01                 ` Eli Barzilay
@ 2010-05-12 12:49                   ` Jeff King
  2010-05-12 13:03                     ` Jeff King
  2010-05-12 14:35                     ` Eli Barzilay
  2010-05-12 13:06                   ` Bo Yang
  1 sibling, 2 replies; 26+ messages in thread
From: Jeff King @ 2010-05-12 12:49 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Junio C Hamano, Eugene Sajine, Bo Yang, Jacob Helwig, git

On Wed, May 12, 2010 at 08:01:59AM -0400, Eli Barzilay wrote:

> > I have no opinion on moving --follow around, but I definitely agree that
> > more clearly marking the diff-options (and including them after
> > revision traversal options) is much better.
> 
> Should I send that reorganization as a proper patch then?

Yes, please.

> > This ":git-log: 1" magic should probably follow the include. It sets up
> > an attribute for diff-options.txt to conditionally include some
> > log-specific bits.
> 
> It seemed like it's a definition that could be used elsewhere too (eg,
> in other includes that could be added in the future).

Yeah, I considered that, too. So maybe it is best at the top of the
options list (but in that case, perhaps it should go at the very top of
the file).

> > I really wonder if it would be that hard to just fix the code to follow
> > several files. [...]
> 
> That would obviously be a better solution...

It really wasn't too bad, actually. Here is a rough cut. Still to be
considered are:

 - the XXX comment below about proper pathspec limiting. Should be easy
   to fix, but I need to look into it.

 - I haven't tested it with --full-diff to see if there are funny
   interactions with the pathspec limiting.

 - The original --follow swapped the old single path to follow for the
   new one. This patch instead _appends_ to the list of pathspecs. We
   have to do this because the pathspecs may still be of interest. For
   example, consider "git log --follow subdir". When we see that
   "subdir/foo" came from "foo", we start following "foo", too. But we
   can't stop following "subdir", because there may be other files of
   interest in it.

   But this means if you have a history where the path "foo" is of
   interest for a while, then goes away, then you move something else
   into its place, you will see commits touching the old foo, even
   though its content is not connected with the new foo.

   We could special-case when we exactly match a file with a pathspec,
   since we know it is no longer of interest. But I am tempted to call
   the new behavior _more_ useful than the old. It means that "git log
   --follow <pathspec>" means we are interested in what has _ever_ been
   in <pathspec> (which is what it means without follow), in addition to
   the history of anything that got renamed into pathspec (ever).

Anyway, here is the patch. My testing so far has been very simple, so
please try it on a few repos and let me know if it does what you expect
in all cases. Note that is based on "next", as it has Bo's
find_copies_harder patch.

diff --git a/builtin/log.c b/builtin/log.c
index 976e16f..5a7611a 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -88,9 +88,10 @@ static void cmd_log_init(int argc, const char **argv, const char *prefix,
 	if (rev->diffopt.pickaxe || rev->diffopt.filter)
 		rev->always_show_header = 0;
 	if (DIFF_OPT_TST(&rev->diffopt, FOLLOW_RENAMES)) {
-		rev->always_show_header = 0;
-		if (rev->diffopt.nr_paths != 1)
-			usage("git logs can only follow renames on one pathname at a time");
+		if (!rev->diffopt.nr_paths)
+			DIFF_OPT_CLR(&rev->diffopt, FOLLOW_RENAMES);
+		else
+			rev->always_show_header = 0;
 	}
 	for (i = 1; i < argc; i++) {
 		const char *arg = argv[i];
diff --git a/tree-diff.c b/tree-diff.c
index 1fb3e94..9467f27 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -5,6 +5,7 @@
 #include "diff.h"
 #include "diffcore.h"
 #include "tree.h"
+#include "dir.h"
 
 static char *malloc_base(const char *base, int baselen, const char *path, int pathlen)
 {
@@ -322,78 +323,75 @@ int diff_tree(struct tree_desc *t1, struct tree_desc *t2, const char *base, stru
 }
 
 /*
- * Does it look like the resulting diff might be due to a rename?
- *  - single entry
- *  - not a valid previous file
+ * Does it look like the resulting diff might be due to a rename? If we have
+ * any files that appeared, then maybe.
  */
 static inline int diff_might_be_rename(void)
 {
-	return diff_queued_diff.nr == 1 &&
-		!DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);
+	int i;
+	for (i = 0; i < diff_queued_diff.nr; i++)
+		if (!DIFF_FILE_VALID(diff_queued_diff.queue[i]->one))
+			return 1;
+	return 0;
 }
 
 static void try_to_follow_renames(struct tree_desc *t1, struct tree_desc *t2, const char *base, struct diff_options *opt)
 {
 	struct diff_options diff_opts;
+	struct diff_queue_struct old_queue;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	struct diff_filepair *choice;
-	const char *paths[1];
+	const char **new_paths;
+	int new_paths_num, new_paths_alloc;
 	int i;
 
-	/* Remove the file creation entry from the diff queue, and remember it */
-	choice = q->queue[0];
-	q->nr = 0;
+	/* The diff machinery operates on a global queue, so we need to save a
+	 * copy before doing another diff. */
+	old_queue = *q;
+	DIFF_QUEUE_CLEAR(q);
 
 	diff_setup(&diff_opts);
 	DIFF_OPT_SET(&diff_opts, RECURSIVE);
 	DIFF_OPT_SET(&diff_opts, FIND_COPIES_HARDER);
 	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_opts.single_follow = opt->paths[0];
 	diff_opts.break_opt = opt->break_opt;
-	paths[0] = NULL;
-	diff_tree_setup_paths(paths, &diff_opts);
 	if (diff_setup_done(&diff_opts) < 0)
 		die("unable to set up diff options to follow renames");
 	diff_tree(t1, t2, base, &diff_opts);
 	diffcore_std(&diff_opts);
-	diff_tree_release_paths(&diff_opts);
 
-	/* Go through the new set of filepairing, and see if we find a more interesting one */
+	new_paths_num = new_paths_alloc = opt->nr_paths;
+	new_paths = xmalloc(opt->nr_paths * sizeof(*new_paths));
+	for (i = 0; i < opt->nr_paths; i++)
+		new_paths[i] = opt->paths[i];
+
+	/* Go through the new set of filepairs, looking for renames. */
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];
 
-		/*
-		 * Found a source? Not only do we use that for the new
-		 * diff_queued_diff, we will also use that as the path in
-		 * the future!
-		 */
-		if ((p->status == 'R' || p->status == 'C') && !strcmp(p->two->path, opt->paths[0])) {
-			/* Switch the file-pairs around */
-			q->queue[i] = choice;
-			choice = p;
-
-			/* Update the path we use from now on.. */
-			diff_tree_release_paths(opt);
-			opt->paths[0] = xstrdup(p->one->path);
-			diff_tree_setup_paths(opt->paths, opt);
-			break;
-		}
-	}
+		if (p->status != 'R' && p->status != 'C')
+			continue;
 
-	/*
-	 * Then, discard all the non-relevant file pairs...
-	 */
-	for (i = 0; i < q->nr; i++) {
-		struct diff_filepair *p = q->queue[i];
-		diff_free_filepair(p);
+		/* XXX actually this is the slightly different wildcarding
+		 * pathspec. We really want to check just prefixes. But
+		 * I wonder if we can convince the diff machinery to just
+		 * be interested in these paths as destinations, but use
+		 * the whole tree as sources */
+		if (!match_pathspec(opt->paths, p->two->path,
+				   strlen(p->two->path), 0, NULL))
+			continue;
+
+		ALLOC_GROW(new_paths, new_paths_num + 1, new_paths_alloc);
+		new_paths[new_paths_num++] = xstrdup(p->one->path);
 	}
 
-	/*
-	 * .. and re-instate the one we want (which might be either the
-	 * original one, or the rename/copy we found)
-	 */
-	q->queue[0] = choice;
-	q->nr = 1;
+	/* now finalize the new paths */
+	ALLOC_GROW(new_paths, new_paths_num + 1, new_paths_alloc);
+	new_paths[new_paths_num] = NULL;
+	diff_tree_setup_paths(new_paths, opt);
+
+	/* and restore our old queue */
+	free(q->queue);
+	*q = old_queue;
 }
 
 int diff_tree_sha1(const unsigned char *old, const unsigned char *new, const char *base, struct diff_options *opt)
-- 
1.7.1.258.ge9097.dirty

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 12:49                   ` Jeff King
@ 2010-05-12 13:03                     ` Jeff King
  2010-05-12 14:35                     ` Eli Barzilay
  1 sibling, 0 replies; 26+ messages in thread
From: Jeff King @ 2010-05-12 13:03 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Junio C Hamano, Eugene Sajine, Bo Yang, Jacob Helwig, git

On Wed, May 12, 2010 at 08:49:48AM -0400, Jeff King wrote:

> Anyway, here is the patch. My testing so far has been very simple, so
> please try it on a few repos and let me know if it does what you expect
> in all cases. Note that is based on "next", as it has Bo's
> find_copies_harder patch.

By the way, here are some basic speed tests in git.git:

  $ time git log builtin >/dev/null
  real    0m0.486s
  user    0m0.468s
  sys     0m0.016s

  $ time git log --follow builtin >/dev/null
  real    0m23.518s
  user    0m23.469s
  sys     0m0.040s

So it's _way_ slower, but still snappy enough with the first few commits
not to be awful. However, I haven't done any profiling, so it may be
that we can improve that by limiting the rename-detection diff only to
interesting paths.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 12:01                 ` Eli Barzilay
  2010-05-12 12:49                   ` Jeff King
@ 2010-05-12 13:06                   ` Bo Yang
  2010-05-12 13:09                     ` Jeff King
  2010-05-12 14:42                     ` Eli Barzilay
  1 sibling, 2 replies; 26+ messages in thread
From: Bo Yang @ 2010-05-12 13:06 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Jeff King, Junio C Hamano, Eugene Sajine, Jacob Helwig, git

Hi,

On Wed, May 12, 2010 at 8:01 PM, Eli Barzilay <eli@barzilay.org> wrote:
>> I really wonder if it would be that hard to just fix the code to follow
>> several files. [...]
>
> That would obviously be a better solution...

The new line level history browser utility will follow multiple line
ranges of multiple files and it of course will do this. :) Please wait
for some days, I will make it possible in this summer.

Regards!
Bo
-- 
My blog: http://blog.morebits.org

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 13:06                   ` Bo Yang
@ 2010-05-12 13:09                     ` Jeff King
  2010-05-12 14:42                     ` Eli Barzilay
  1 sibling, 0 replies; 26+ messages in thread
From: Jeff King @ 2010-05-12 13:09 UTC (permalink / raw)
  To: Bo Yang; +Cc: Eli Barzilay, Junio C Hamano, Eugene Sajine, Jacob Helwig, git

On Wed, May 12, 2010 at 09:06:11PM +0800, Bo Yang wrote:

> On Wed, May 12, 2010 at 8:01 PM, Eli Barzilay <eli@barzilay.org> wrote:
> >> I really wonder if it would be that hard to just fix the code to follow
> >> several files. [...]
> >
> > That would obviously be a better solution...
> 
> The new line level history browser utility will follow multiple line
> ranges of multiple files and it of course will do this. :) Please wait
> for some days, I will make it possible in this summer.

Yeah, I know there is some overlap with your work. I wonder, though, if
it may be desirable for speed reasons to have three levels of limiting:

  1. limit by pathspec (what we have now, very fast)

  2. limit by path with rename following (my patch, slower)

  3. limit to history of arbitrary content (your work, which will
     presumably be even slower still)

Obviously if your (3) can be as fast as my (2), then there is no need
for (2).

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 12:49                   ` Jeff King
  2010-05-12 13:03                     ` Jeff King
@ 2010-05-12 14:35                     ` Eli Barzilay
  2010-05-12 16:07                       ` Bo Yang
  2010-05-14  4:55                       ` Jeff King
  1 sibling, 2 replies; 26+ messages in thread
From: Eli Barzilay @ 2010-05-12 14:35 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Eugene Sajine, Bo Yang, Jacob Helwig, git

On May 12, Jeff King wrote:
> On Wed, May 12, 2010 at 08:01:59AM -0400, Eli Barzilay wrote:
> 
> > > I have no opinion on moving --follow around, but I definitely agree that
> > > more clearly marking the diff-options (and including them after
> > > revision traversal options) is much better.
> > 
> > Should I send that reorganization as a proper patch then?
> 
> Yes, please.

Done.


> > > This ":git-log: 1" magic should probably follow the include. It sets up
> > > an attribute for diff-options.txt to conditionally include some
> > > log-specific bits.
> > 
> > It seemed like it's a definition that could be used elsewhere too (eg,
> > in other includes that could be added in the future).
> 
> Yeah, I considered that, too. So maybe it is best at the top of the
> options list (but in that case, perhaps it should go at the very top
> of the file).

Since I don't know, I just dragged it along with the include.  (Moving
it to the top would be better better done with the other files, so
it's a different thing...)


> Anyway, here is the patch. My testing so far has been very simple,
> so please try it on a few repos and let me know if it does what you
> expect in all cases. Note that is based on "next", as it has Bo's
> find_copies_harder patch.

I tried it with our repository -- it does what I expected it to do wrt
showing the log messages, and at a very small penalty (~0.9s vs ~0.7s
for scanning a history of about 20k commits).

But with `-p' it was doing something confusing: I used two files that
were recently renamed, and the result was the correct log history, but
the first patch that was shown (the rename) showed the two files as
added.  (That's even when I added `-C' and `-M'.)  This happens even
with a single path.  OTOH, using `--follow' with `-p' and a single
path without your patch produces the expected result where the first
patch is a rename (even without `-C'/`-M').

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 13:06                   ` Bo Yang
  2010-05-12 13:09                     ` Jeff King
@ 2010-05-12 14:42                     ` Eli Barzilay
  2010-05-12 14:49                       ` Bo Yang
  1 sibling, 1 reply; 26+ messages in thread
From: Eli Barzilay @ 2010-05-12 14:42 UTC (permalink / raw)
  To: Bo Yang; +Cc: Jeff King, Junio C Hamano, Eugene Sajine, Jacob Helwig, git

On May 12, Bo Yang wrote:
> On Wed, May 12, 2010 at 8:01 PM, Eli Barzilay <eli@barzilay.org> wrote:
> > That would obviously be a better solution...
> 
> The new line level history browser utility will follow multiple line
> ranges of multiple files and it of course will do this. :) Please
> wait for some days, I will make it possible in this summer.

(Looking at the description (in the SoC2010Ideas page) it wasn't clear
to me whether this will be simple to use as log -- that is, just run
some `git foo dir/{x,y,z}' or `git foo dir'.)

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 14:42                     ` Eli Barzilay
@ 2010-05-12 14:49                       ` Bo Yang
  0 siblings, 0 replies; 26+ messages in thread
From: Bo Yang @ 2010-05-12 14:49 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Jeff King, Junio C Hamano, Eugene Sajine, Jacob Helwig, git

Hi,
On Wed, May 12, 2010 at 10:42 PM, Eli Barzilay <eli@barzilay.org> wrote:
> (Looking at the description (in the SoC2010Ideas page) it wasn't clear
> to me whether this will be simple to use as log -- that is, just run
> some `git foo dir/{x,y,z}' or `git foo dir'.)

The idea here is that line level history will trace rename/copies of
multiple files and trace rename/copies of multiple files is obviously
a subset of tracing of it. So, what I mean is along implement such a
line level browser, I can do the --follow files case very well. :)

And as Jeff said in his mail, his implementation is necessary in speed
and please still use 'git log --follow -- pathspec' for whole file
history traversing.

Regards!
Bo

-- 
My blog: http://blog.morebits.org

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 14:35                     ` Eli Barzilay
@ 2010-05-12 16:07                       ` Bo Yang
  2010-05-12 16:45                         ` Jeff King
  2010-05-14  4:55                       ` Jeff King
  1 sibling, 1 reply; 26+ messages in thread
From: Bo Yang @ 2010-05-12 16:07 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Jeff King, Junio C Hamano, Eugene Sajine, Jacob Helwig, git

Hi,

> But with `-p' it was doing something confusing: I used two files that
> were recently renamed, and the result was the correct log history, but
> the first patch that was shown (the rename) showed the two files as
> added.  (That's even when I added `-C' and `-M'.)  This happens even
> with a single path.  OTOH, using `--follow' with `-p' and a single
> path without your patch produces the expected result where the first
> patch is a rename (even without `-C'/`-M').

Try this one, it only change a little on Jeff's patch, my simple tests
show it works. Apply it on 'next' branch.

diff --git a/builtin/log.c b/builtin/log.c
index 6208703..0142036 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -70,9 +70,10 @@ static void cmd_log_init(int argc, const char
**argv, const char *prefix,
        if (rev->diffopt.pickaxe || rev->diffopt.filter)
                rev->always_show_header = 0;
        if (DIFF_OPT_TST(&rev->diffopt, FOLLOW_RENAMES)) {
-               rev->always_show_header = 0;
-               if (rev->diffopt.nr_paths != 1)
-                       usage("git logs can only follow renames on one
pathname at a time");
+               if (!rev->diffopt.nr_paths)
+                       DIFF_OPT_CLR(&rev->diffopt, FOLLOW_RENAMES);
+               else
+                       rev->always_show_header = 0;
        }
        for (i = 1; i < argc; i++) {
                const char *arg = argv[i];
diff --git a/diffcore.h b/diffcore.h
index 491bea0..9039a06 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -99,6 +99,10 @@ struct diff_queue_struct {
                (q)->nr = (q)->alloc = 0; \
                (q)->run = 0; \
        } while(0);
+#define DIFF_QUEUE_MARK_RUN(q, r) \
+       do { \
+               (q)->run = (r); \
+       } while(0);

 extern struct diff_queue_struct diff_queued_diff;
 extern struct diff_filepair *diff_queue(struct diff_queue_struct *,
diff --git a/tree-diff.c b/tree-diff.c
index 1fb3e94..dd95f74 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -5,6 +5,7 @@
 #include "diff.h"
 #include "diffcore.h"
 #include "tree.h"
+#include "dir.h"

 static char *malloc_base(const char *base, int baselen, const char
*path, int pathlen)
 {
@@ -322,78 +323,74 @@ int diff_tree(struct tree_desc *t1, struct
tree_desc *t2, const char *base, stru
 }

 /*
- * Does it look like the resulting diff might be due to a rename?
- *  - single entry
- *  - not a valid previous file
+ * Does it look like the resulting diff might be due to a rename? If we have
+ * any files that appeared, then maybe.
  */
 static inline int diff_might_be_rename(void)
 {
-       return diff_queued_diff.nr == 1 &&
-               !DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);
+       int i;
+       for (i = 0; i < diff_queued_diff.nr; i++)
+               if (!DIFF_FILE_VALID(diff_queued_diff.queue[i]->one))
+                       return 1;
+       return 0;
 }

 static void try_to_follow_renames(struct tree_desc *t1, struct
tree_desc *t2, const char *base, struct diff_options *opt)
 {
        struct diff_options diff_opts;
+       struct diff_queue_struct outq;
        struct diff_queue_struct *q = &diff_queued_diff;
-       struct diff_filepair *choice;
-       const char *paths[1];
+       const char **new_paths;
+       int new_paths_num, new_paths_alloc;
        int i;

-       /* Remove the file creation entry from the diff queue, and
remember it */
-       choice = q->queue[0];
+       DIFF_QUEUE_CLEAR(&outq);
        q->nr = 0;

        diff_setup(&diff_opts);
        DIFF_OPT_SET(&diff_opts, RECURSIVE);
        DIFF_OPT_SET(&diff_opts, FIND_COPIES_HARDER);
        diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-       diff_opts.single_follow = opt->paths[0];
        diff_opts.break_opt = opt->break_opt;
-       paths[0] = NULL;
-       diff_tree_setup_paths(paths, &diff_opts);
        if (diff_setup_done(&diff_opts) < 0)
                die("unable to set up diff options to follow renames");
        diff_tree(t1, t2, base, &diff_opts);
        diffcore_std(&diff_opts);
-       diff_tree_release_paths(&diff_opts);

-       /* Go through the new set of filepairing, and see if we find a
more interesting one */
+       new_paths_num = new_paths_alloc = opt->nr_paths;
+       new_paths = xmalloc(opt->nr_paths * sizeof(*new_paths));
+       for (i = 0; i < opt->nr_paths; i++)
+               new_paths[i] = opt->paths[i];
+
+       /* Go through the new set of filepairs, looking for renames. */
        for (i = 0; i < q->nr; i++) {
                struct diff_filepair *p = q->queue[i];

-               /*
-                * Found a source? Not only do we use that for the new
-                * diff_queued_diff, we will also use that as the path in
-                * the future!
-                */
-               if ((p->status == 'R' || p->status == 'C') &&
!strcmp(p->two->path, opt->paths[0])) {
-                       /* Switch the file-pairs around */
-                       q->queue[i] = choice;
-                       choice = p;
-
-                       /* Update the path we use from now on.. */
-                       diff_tree_release_paths(opt);
-                       opt->paths[0] = xstrdup(p->one->path);
-                       diff_tree_setup_paths(opt->paths, opt);
-                       break;
-               }
-       }
+               /* XXX actually this is the slightly different wildcarding
+                * pathspec. We really want to check just prefixes. But
+                * I wonder if we can convince the diff machinery to just
+                * be interested in these paths as destinations, but use
+                * the whole tree as sources */
+               if (!match_pathspec(opt->paths, p->two->path,
+                                  strlen(p->two->path), 0, NULL))
+                       continue;

-       /*
-        * Then, discard all the non-relevant file pairs...
-        */
-       for (i = 0; i < q->nr; i++) {
-               struct diff_filepair *p = q->queue[i];
-               diff_free_filepair(p);
+               diff_q(&outq, p);
+
+               ALLOC_GROW(new_paths, new_paths_num + 1, new_paths_alloc);
+               new_paths[new_paths_num++] = xstrdup(p->one->path);
        }

-       /*
-        * .. and re-instate the one we want (which might be either the
-        * original one, or the rename/copy we found)
-        */
-       q->queue[0] = choice;
-       q->nr = 1;
+       /* now finalize the new paths */
+       ALLOC_GROW(new_paths, new_paths_num + 1, new_paths_alloc);
+       new_paths[new_paths_num] = NULL;
+       diff_tree_release_paths(opt);
+       diff_tree_setup_paths(new_paths, opt);
+
+       /* and restore our old queue */
+       free(q->queue);
+       *q = outq;
+       DIFF_QUEUE_MARK_RUN(q, 1);
 }

 int diff_tree_sha1(const unsigned char *old, const unsigned char
*new, const char *base, struct diff_options *opt)


Regards!
Bo

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 16:07                       ` Bo Yang
@ 2010-05-12 16:45                         ` Jeff King
  2010-05-13  2:12                           ` Bo Yang
  0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2010-05-12 16:45 UTC (permalink / raw)
  To: Bo Yang; +Cc: Eli Barzilay, Junio C Hamano, Eugene Sajine, Jacob Helwig, git

On Thu, May 13, 2010 at 12:07:24AM +0800, Bo Yang wrote:

> > But with `-p' it was doing something confusing: I used two files that
> > were recently renamed, and the result was the correct log history, but
> > the first patch that was shown (the rename) showed the two files as
> > added.  (That's even when I added `-C' and `-M'.)  This happens even
> > with a single path.  OTOH, using `--follow' with `-p' and a single
> > path without your patch produces the expected result where the first
> > patch is a rename (even without `-C'/`-M').
> 
> Try this one, it only change a little on Jeff's patch, my simple tests
> show it works. Apply it on 'next' branch.

This is badly whitespace damaged. I fixed up some wrapping, but it seems
that there was some tab/space interchanges, too. Can you repost?

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 16:45                         ` Jeff King
@ 2010-05-13  2:12                           ` Bo Yang
  2010-05-13  4:39                             ` Eli Barzilay
  0 siblings, 1 reply; 26+ messages in thread
From: Bo Yang @ 2010-05-13  2:12 UTC (permalink / raw)
  To: Jeff King; +Cc: Eli Barzilay, Junio C Hamano, Eugene Sajine, Jacob Helwig, git

Hi Jeff,
On Thu, May 13, 2010 at 12:45 AM, Jeff King <peff@peff.net> wrote:
> On Thu, May 13, 2010 at 12:07:24AM +0800, Bo Yang wrote:
>>
>> Try this one, it only change a little on Jeff's patch, my simple tests
>> show it works. Apply it on 'next' branch.
>
> This is badly whitespace damaged. I fixed up some wrapping, but it seems
> that there was some tab/space interchanges, too. Can you repost?

Ah, sorry. I just copy the patch from my putty window, so it convert
all tabs into spaces...
And the following change this. Also, I have run a 'make test', seems
there is no problem for it. Happy try it!


diff --git a/builtin/log.c b/builtin/log.c
index 6208703..0142036 100644
--- a/builtin/log.c
+++ b/builtin/log.c
@@ -70,9 +70,10 @@ static void cmd_log_init(int argc, const char
**argv, const char *prefix,
 	if (rev->diffopt.pickaxe || rev->diffopt.filter)
 		rev->always_show_header = 0;
 	if (DIFF_OPT_TST(&rev->diffopt, FOLLOW_RENAMES)) {
-		rev->always_show_header = 0;
-		if (rev->diffopt.nr_paths != 1)
-			usage("git logs can only follow renames on one pathname at a time");
+		if (!rev->diffopt.nr_paths)
+			DIFF_OPT_CLR(&rev->diffopt, FOLLOW_RENAMES);
+		else
+			rev->always_show_header = 0;
 	}
 	for (i = 1; i < argc; i++) {
 		const char *arg = argv[i];
diff --git a/diffcore.h b/diffcore.h
index 491bea0..9039a06 100644
--- a/diffcore.h
+++ b/diffcore.h
@@ -99,6 +99,10 @@ struct diff_queue_struct {
 		(q)->nr = (q)->alloc = 0; \
 		(q)->run = 0; \
 	} while(0);
+#define DIFF_QUEUE_MARK_RUN(q, r) \
+	do { \
+		(q)->run = (r); \
+	} while(0);

 extern struct diff_queue_struct diff_queued_diff;
 extern struct diff_filepair *diff_queue(struct diff_queue_struct *,
diff --git a/tree-diff.c b/tree-diff.c
index 1fb3e94..dd95f74 100644
--- a/tree-diff.c
+++ b/tree-diff.c
@@ -5,6 +5,7 @@
 #include "diff.h"
 #include "diffcore.h"
 #include "tree.h"
+#include "dir.h"

 static char *malloc_base(const char *base, int baselen, const char
*path, int pathlen)
 {
@@ -322,78 +323,74 @@ int diff_tree(struct tree_desc *t1, struct
tree_desc *t2, const char *base, stru
 }

 /*
- * Does it look like the resulting diff might be due to a rename?
- *  - single entry
- *  - not a valid previous file
+ * Does it look like the resulting diff might be due to a rename? If we have
+ * any files that appeared, then maybe.
  */
 static inline int diff_might_be_rename(void)
 {
-	return diff_queued_diff.nr == 1 &&
-		!DIFF_FILE_VALID(diff_queued_diff.queue[0]->one);
+	int i;
+	for (i = 0; i < diff_queued_diff.nr; i++)
+		if (!DIFF_FILE_VALID(diff_queued_diff.queue[i]->one))
+			return 1;
+	return 0;
 }

 static void try_to_follow_renames(struct tree_desc *t1, struct
tree_desc *t2, const char *base, struct diff_options *opt)
 {
 	struct diff_options diff_opts;
+	struct diff_queue_struct outq;
 	struct diff_queue_struct *q = &diff_queued_diff;
-	struct diff_filepair *choice;
-	const char *paths[1];
+	const char **new_paths;
+	int new_paths_num, new_paths_alloc;
 	int i;

-	/* Remove the file creation entry from the diff queue, and remember it */
-	choice = q->queue[0];
+	DIFF_QUEUE_CLEAR(&outq);
 	q->nr = 0;

 	diff_setup(&diff_opts);
 	DIFF_OPT_SET(&diff_opts, RECURSIVE);
 	DIFF_OPT_SET(&diff_opts, FIND_COPIES_HARDER);
 	diff_opts.output_format = DIFF_FORMAT_NO_OUTPUT;
-	diff_opts.single_follow = opt->paths[0];
 	diff_opts.break_opt = opt->break_opt;
-	paths[0] = NULL;
-	diff_tree_setup_paths(paths, &diff_opts);
 	if (diff_setup_done(&diff_opts) < 0)
 		die("unable to set up diff options to follow renames");
 	diff_tree(t1, t2, base, &diff_opts);
 	diffcore_std(&diff_opts);
-	diff_tree_release_paths(&diff_opts);

-	/* Go through the new set of filepairing, and see if we find a more
interesting one */
+	new_paths_num = new_paths_alloc = opt->nr_paths;
+	new_paths = xmalloc(opt->nr_paths * sizeof(*new_paths));
+	for (i = 0; i < opt->nr_paths; i++)
+		new_paths[i] = opt->paths[i];
+
+	/* Go through the new set of filepairs, looking for renames. */
 	for (i = 0; i < q->nr; i++) {
 		struct diff_filepair *p = q->queue[i];

-		/*
-		 * Found a source? Not only do we use that for the new
-		 * diff_queued_diff, we will also use that as the path in
-		 * the future!
-		 */
-		if ((p->status == 'R' || p->status == 'C') && !strcmp(p->two->path,
opt->paths[0])) {
-			/* Switch the file-pairs around */
-			q->queue[i] = choice;
-			choice = p;
-
-			/* Update the path we use from now on.. */
-			diff_tree_release_paths(opt);
-			opt->paths[0] = xstrdup(p->one->path);
-			diff_tree_setup_paths(opt->paths, opt);
-			break;
-		}
-	}
+		/* XXX actually this is the slightly different wildcarding
+		 * pathspec. We really want to check just prefixes. But
+		 * I wonder if we can convince the diff machinery to just
+		 * be interested in these paths as destinations, but use
+		 * the whole tree as sources */
+		if (!match_pathspec(opt->paths, p->two->path,
+				   strlen(p->two->path), 0, NULL))
+			continue;

-	/*
-	 * Then, discard all the non-relevant file pairs...
-	 */
-	for (i = 0; i < q->nr; i++) {
-		struct diff_filepair *p = q->queue[i];
-		diff_free_filepair(p);
+		diff_q(&outq, p);
+
+		ALLOC_GROW(new_paths, new_paths_num + 1, new_paths_alloc);
+		new_paths[new_paths_num++] = xstrdup(p->one->path);
 	}

-	/*
-	 * .. and re-instate the one we want (which might be either the
-	 * original one, or the rename/copy we found)
-	 */
-	q->queue[0] = choice;
-	q->nr = 1;
+	/* now finalize the new paths */
+	ALLOC_GROW(new_paths, new_paths_num + 1, new_paths_alloc);
+	new_paths[new_paths_num] = NULL;
+	diff_tree_release_paths(opt);
+	diff_tree_setup_paths(new_paths, opt);
+
+	/* and restore our old queue */
+	free(q->queue);
+	*q = outq;
+	DIFF_QUEUE_MARK_RUN(q, 1);
 }

 int diff_tree_sha1(const unsigned char *old, const unsigned char
*new, const char *base, struct diff_options *opt)


Regards!
Bo
-- 
My blog: http://blog.morebits.org

^ permalink raw reply related	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-13  2:12                           ` Bo Yang
@ 2010-05-13  4:39                             ` Eli Barzilay
  0 siblings, 0 replies; 26+ messages in thread
From: Eli Barzilay @ 2010-05-13  4:39 UTC (permalink / raw)
  To: Bo Yang; +Cc: Jeff King, Junio C Hamano, Eugene Sajine, Jacob Helwig, git

On May 13, Bo Yang wrote:
> 
> Ah, sorry. I just copy the patch from my putty window, so it convert
> all tabs into spaces...  And the following change this. Also, I have
> run a 'make test', seems there is no problem for it. Happy try it!

There are still a few lines that got wrapped -- after I fixed them I
could apply it, and it looks like it works as (I) expected with and
without `-p'.  (I didn't try to measure the speed though -- it was
perfectly fine for any interactive use that I'd want to use it for.)

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-12 14:35                     ` Eli Barzilay
  2010-05-12 16:07                       ` Bo Yang
@ 2010-05-14  4:55                       ` Jeff King
  2010-05-14  5:05                         ` Eli Barzilay
  1 sibling, 1 reply; 26+ messages in thread
From: Jeff King @ 2010-05-14  4:55 UTC (permalink / raw)
  To: Eli Barzilay; +Cc: Junio C Hamano, Eugene Sajine, Bo Yang, Jacob Helwig, git

On Wed, May 12, 2010 at 10:35:32AM -0400, Eli Barzilay wrote:

> But with `-p' it was doing something confusing: I used two files that
> were recently renamed, and the result was the correct log history, but
> the first patch that was shown (the rename) showed the two files as
> added.  (That's even when I added `-C' and `-M'.)  This happens even
> with a single path.  OTOH, using `--follow' with `-p' and a single
> path without your patch produces the expected result where the first
> patch is a rename (even without `-C'/`-M').

Ah, yeah, I see. The problem is that my code is doing something like:

  1. Do a sha1-only diff with our current path list.

  2. If there were any created files, they might be renames.
     Put aside the old diff results. Do a new diff, looking for renames.

  3. If there are renames, add them to our path list.

  4. Restore the old diff results.

  5. Proceed with other desired diff options (rename detection, showing
     patches, etc).

But the during step (5), remember that we are still working with the old
diff results, which will not include the expanded path. Thus we won't
consider the new path as a rename source, and will fail to find the
rename.

The naive right way would be to re-do step (1) with the expanded path.
But there is an optimization, since we can use the diff results from (2)
directly, including avoiding re-doing the rename detection.

The only "downside" is that it means --follow actually impacts the diff
generation by implying --find-copies-harder. And I put downside in
quotes because it is probably not a big deal. We have already spent the
CPU time to find the answer, so it is silly not to show it. I can't
imagine why somebody would want --follow, but would _not_ want rename
detection in the resulting diff.

Bo's version of the patch does that optimization. When I clean up the
patch (probably sometime next week), I'll take those changes.

-Peff

^ permalink raw reply	[flat|nested] 26+ messages in thread

* Re: Re: git log -M -- filename is not working?
  2010-05-14  4:55                       ` Jeff King
@ 2010-05-14  5:05                         ` Eli Barzilay
  0 siblings, 0 replies; 26+ messages in thread
From: Eli Barzilay @ 2010-05-14  5:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Eugene Sajine, Bo Yang, Jacob Helwig, git

On May 14, Jeff King wrote:
> [...]

(I didn't get most of that...)

> The only "downside" is that it means --follow actually impacts the diff
> generation by implying --find-copies-harder. And I put downside in
> quotes because it is probably not a big deal. We have already spent the
> CPU time to find the answer, so it is silly not to show it. I can't
> imagine why somebody would want --follow, but would _not_ want rename
> detection in the resulting diff.

...but this seems very reasonable.  (In fact, if --follow implies
finding copies and renames, and if it's doing that for diff too, then
it's exactly what I originally wanted.)


> Bo's version of the patch does that optimization. When I clean up the
> patch (probably sometime next week), I'll take those changes.

Thanks!

-- 
          ((lambda (x) (x x)) (lambda (x) (x x)))          Eli Barzilay:
                    http://barzilay.org/                   Maze is Life!

^ permalink raw reply	[flat|nested] 26+ messages in thread

end of thread, other threads:[~2010-05-14  5:05 UTC | newest]

Thread overview: 26+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-05-07 18:07 git log -M -- filename is not working? Eugene Sajine
2010-05-07 18:10 ` Jacob Helwig
2010-05-07 18:31   ` Eugene Sajine
2010-05-07 18:37     ` Eli Barzilay
2010-05-07 20:25       ` Matthieu Moy
2010-05-07 20:28       ` Jakub Narebski
2010-05-08  4:44       ` Jeff King
2010-05-08  5:12         ` Eli Barzilay
2010-05-08  5:30           ` Jeff King
2010-05-08  5:39           ` Junio C Hamano
2010-05-08  7:08             ` Eli Barzilay
2010-05-12 11:38               ` Jeff King
2010-05-12 12:01                 ` Eli Barzilay
2010-05-12 12:49                   ` Jeff King
2010-05-12 13:03                     ` Jeff King
2010-05-12 14:35                     ` Eli Barzilay
2010-05-12 16:07                       ` Bo Yang
2010-05-12 16:45                         ` Jeff King
2010-05-13  2:12                           ` Bo Yang
2010-05-13  4:39                             ` Eli Barzilay
2010-05-14  4:55                       ` Jeff King
2010-05-14  5:05                         ` Eli Barzilay
2010-05-12 13:06                   ` Bo Yang
2010-05-12 13:09                     ` Jeff King
2010-05-12 14:42                     ` Eli Barzilay
2010-05-12 14:49                       ` Bo Yang

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.