All of lore.kernel.org
 help / color / mirror / Atom feed
* gitk "find commit adding/removing string"/possible pickaxe bug?
@ 2011-01-18 16:16 Sebastian Hahn
  2011-01-18 16:44 ` Thomas Rast
  0 siblings, 1 reply; 10+ messages in thread
From: Sebastian Hahn @ 2011-01-18 16:16 UTC (permalink / raw)
  To: git

Hi,

I quite like gitk and am a fan of the ability to easily locate commits
where a specific string was added/removed. If the string in question
was added in a merge commit as part of a conflicted/otherwise
changed merge, gitk doesn't display it.

(If you want to reproduce, the repository is git://git.torproject.org/ 
tor,
the string I'm looking for is "< DIGEST" and the first commit I hope to
find is ed87738ede789fb).

I presented the issue to #git, and it was suggested that it is probably
a pickaxe bug in that it doesn't display changes in merge commits if
they add strings that neither of their parents has.

Do you agree that this is a bug, or am I missing anything here?

Thanks for you consideration!

Sebastian

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 16:16 gitk "find commit adding/removing string"/possible pickaxe bug? Sebastian Hahn
@ 2011-01-18 16:44 ` Thomas Rast
  2011-01-18 18:39   ` Junio C Hamano
  2011-01-18 18:50   ` Jeff King
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Rast @ 2011-01-18 16:44 UTC (permalink / raw)
  To: git; +Cc: Sebastian Hahn

Sebastian Hahn wrote:
> 
> I quite like gitk and am a fan of the ability to easily locate commits
> where a specific string was added/removed. If the string in question
> was added in a merge commit as part of a conflicted/otherwise
> changed merge, gitk doesn't display it.
[...]
> I presented the issue to #git, and it was suggested that it is probably
> a pickaxe bug

In particular, in a history where

  $ git show HEAD:foo
  quux
  $ git show HEAD^:foo
  bar
  $ git show HEAD^2:foo
  baz

the behaviour is:

  git log -Squux                  # empty
  git log -Squux -p               # empty
  git log -Squux --pickaxe-all    # empty

  git log -Squux -c      	  # shows merge, but no diff
  git log -Squux --cc    	  # shows merge, but no diff
  git log -Squux -c -p   	  # shows merge, but no diff
  git log -Squux -c --pickaxe-all # shows merge, but no diff

  git log -Squux --pickaxe-all -c -p  # shows merge & combined diff

So it only shows the diff with --pickaxe-all, even though the (only)
hunk clearly introduced the string.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 16:44 ` Thomas Rast
@ 2011-01-18 18:39   ` Junio C Hamano
  2011-01-18 18:50   ` Jeff King
  1 sibling, 0 replies; 10+ messages in thread
From: Junio C Hamano @ 2011-01-18 18:39 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Sebastian Hahn

Thomas Rast <trast@student.ethz.ch> writes:

>   git log -Squux --pickaxe-all -c -p  # shows merge & combined diff
>
> So it only shows the diff with --pickaxe-all, even though the (only)
> hunk clearly introduced the string.

Isn't it just because unless you ask for -m to get all the pairwise diff
between each parent and the merged child the diff machinery won't kick in
for merge commits?  I didn't check the option defaulting logic, but there
may be something that changes the default setting of "-m" among your
examples.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 16:44 ` Thomas Rast
  2011-01-18 18:39   ` Junio C Hamano
@ 2011-01-18 18:50   ` Jeff King
  2011-01-18 20:39     ` Thomas Rast
  1 sibling, 1 reply; 10+ messages in thread
From: Jeff King @ 2011-01-18 18:50 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Sebastian Hahn

On Tue, Jan 18, 2011 at 05:44:17PM +0100, Thomas Rast wrote:

> In particular, in a history where
> 
>   $ git show HEAD:foo
>   quux
>   $ git show HEAD^:foo
>   bar
>   $ git show HEAD^2:foo
>   baz

I created a similar repo with:

commit() {
  echo $1 >file && git add file && git commit -m $1
}
mkdir repo && cd repo && git init
commit base
commit master
git checkout -b other HEAD^
commit other
git merge master
commit resolved

which should be identical. But I get different results (see near the
end):

> the behaviour is:
> 
>   git log -Squux                  # empty
>   git log -Squux -p               # empty

All of which make sense to me. Pickaxe operates on diff filepairs, and
git by default doesn't seem to do merge diffing at all (but see below).
So those filepairs don't exist to consider.

>   git log -Squux --pickaxe-all    # empty

This doesn't help. It just loosens the actual diff shown from "just the
things that matched -S" to "everything in that commit". It doesn't
add to the filepairs that make it to pickaxe.

>   git log -Squux -c      	  # shows merge, but no diff

This "-c" does what you want, because we start looking at merge
filepairs. Although one thing leaves me confused. If I do:

  git log -p

I get no diff for the merge commit. But in git-diff(1), it says:

  COMBINED DIFF FORMAT
    "git-diff-tree", "git-diff-files" and "git-diff" can take -c or --cc
    option to produce combined diff. For showing a merge commit with
    "git log -p", this is the default format; you can force showing full
    diff with the -m option.

which implies to me that "-c" should be on by default if we selected
"-p" (or presumably -S).

I didn't bisect, but I wonder if the doc is wrong, or if we accidentally
lost this default at some point.

>   git log -Squux --cc    	  # shows merge, but no diff

Makes sense again, since you didn't ask for a patch, no patch.

>   git log -Squux -c -p   	  # shows merge, but no diff

Weird.  Here I get a nice combined diff, which is what I expect.

>   git log -Squux -c --pickaxe-all # shows merge, but no diff

Yep, no "-p" again.

>   git log -Squux --pickaxe-all -c -p  # shows merge & combined diff

The pickaxe-all shouldn't impact anything, at least not in my test repo,
as there is only one file. But of course I do get the diff, as I did
above. Can you show the steps to create your repo? I'm wondering what is
different.

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 18:50   ` Jeff King
@ 2011-01-18 20:39     ` Thomas Rast
  2011-01-18 20:50       ` Jeff King
  2011-01-18 21:16       ` Thomas Rast
  0 siblings, 2 replies; 10+ messages in thread
From: Thomas Rast @ 2011-01-18 20:39 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Sebastian Hahn

Jeff King wrote:
> commit() {
>   echo $1 >file && git add file && git commit -m $1
> }
> mkdir repo && cd repo && git init
> commit base
> commit master
> git checkout -b other HEAD^
> commit other
> git merge master
> commit resolved

Indeed, my history looks just like that.

> >   git log -Squux -c -p          # shows merge, but no diff
> 
> Weird.  Here I get a nice combined diff, which is what I expect.

True, I managed to confuse myself between looking for the resolution and
looking for one of the (deleted) merge sides.

So indeed

  git log -Squux -c -p

gives a combined diff.  But OTOH

  git log -Sbar -c -p

doesn't; it only gives a diff for the commit that introduced 'bar'.  I
guess this makes sense: -S notices that the number of 'bar's is
actually the same as in *one* merge parent, hence the merge cannot be
all that interesting.  OTOH it still shows the merge commit in the
history, which is a bit strange.  --pickaxe-all does not make a
difference either;

   git log -Sbar --cc -p --pickaxe-all

still shows the merge commit but no diff.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 20:39     ` Thomas Rast
@ 2011-01-18 20:50       ` Jeff King
  2011-01-18 21:26         ` Junio C Hamano
  2011-01-31 20:00         ` Jeff King
  2011-01-18 21:16       ` Thomas Rast
  1 sibling, 2 replies; 10+ messages in thread
From: Jeff King @ 2011-01-18 20:50 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Sebastian Hahn

On Tue, Jan 18, 2011 at 09:39:28PM +0100, Thomas Rast wrote:

> So indeed
> 
>   git log -Squux -c -p
> 
> gives a combined diff.  But OTOH
> 
>   git log -Sbar -c -p
> 
> doesn't; it only gives a diff for the commit that introduced 'bar'.  I
> guess this makes sense: -S notices that the number of 'bar's is
> actually the same as in *one* merge parent, hence the merge cannot be
> all that interesting.  OTOH it still shows the merge commit in the
> history, which is a bit strange.  --pickaxe-all does not make a
> difference either;

Hrm. What I expected[1] to happen would be for the diff machinery to
look at each filepair individually, one of them to trigger -S, which
shows the commit, and then to fail to produce a combined diff because we
threw away the other uninteresting filepair. But in that case,
--pickaxe-all _should_ show something, as its point is to keep all of
the filepairs.  And that's clearly not happening.

So now I don't know what's going on. I'll try to trace through the diff
machinery and see if that gives a clue.

-Peff

[1] That's what I expect, but not necessarily what I want. I think what
I would want is for it to do a token count of the merge commit, and if
it fails to match _every_ parent, then it it interesting. Otherwise, the
content presumably came from that parent.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 20:39     ` Thomas Rast
  2011-01-18 20:50       ` Jeff King
@ 2011-01-18 21:16       ` Thomas Rast
  1 sibling, 0 replies; 10+ messages in thread
From: Thomas Rast @ 2011-01-18 21:16 UTC (permalink / raw)
  To: Jeff King; +Cc: git, Sebastian Hahn, Junio C Hamano

Thomas Rast wrote:
> So indeed
> 
>   git log -Squux -c -p
> 
> gives a combined diff.  But OTOH
> 
>   git log -Sbar -c -p
> 
> doesn't; it only gives a diff for the commit that introduced 'bar'.

It's actually even stranger.  -S does not seem to filter merges at
all.  For example, in git.git

  $ git log | grep -c ^Merge
  4677
  $ git log -Sthis_string_never_existed_anywhere -c | grep -c ^Merge
  4677

I think it should just filter all history, shouldn't it?  After all

  $ git log -- this_file_never_existed

also comes up empty.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 20:50       ` Jeff King
@ 2011-01-18 21:26         ` Junio C Hamano
  2011-01-18 21:33           ` Jeff King
  2011-01-31 20:00         ` Jeff King
  1 sibling, 1 reply; 10+ messages in thread
From: Junio C Hamano @ 2011-01-18 21:26 UTC (permalink / raw)
  To: Jeff King; +Cc: Thomas Rast, git, Sebastian Hahn

Jeff King <peff@peff.net> writes:

> [1] That's what I expect, but not necessarily what I want. I think what
> I would want is for it to do a token count of the merge commit, and if
> it fails to match _every_ parent, then it it interesting. Otherwise, the
> content presumably came from that parent.

Honestly, my guess is that the interaction of -S with a merge commit is
"whatever the code happens to do", as I didn't think nor design how they
should interact with each other when I wrote -c/--cc nor when I wrote -S.
If I recall correctly -S codepath predates -c/--cc by a wide margin, and I
wouldn't be surprised at all if pickaxe doesn't work as expected (by
anybody's definition of "expectation"), unless you are looking at "-p -m"
output, not a combined one.

Having said that, I tend to agree with your latter expectation ("what I
want").

By the way, you guys should really not be looking at the disused
plumbing-helper -S but instead be advocating its newer and more human
friendly cousin -G.  1.7.4 is coming ;-).

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 21:26         ` Junio C Hamano
@ 2011-01-18 21:33           ` Jeff King
  0 siblings, 0 replies; 10+ messages in thread
From: Jeff King @ 2011-01-18 21:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Thomas Rast, git, Sebastian Hahn

On Tue, Jan 18, 2011 at 01:26:15PM -0800, Junio C Hamano wrote:

> Honestly, my guess is that the interaction of -S with a merge commit is
> "whatever the code happens to do", as I didn't think nor design how they
> should interact with each other when I wrote -c/--cc nor when I wrote -S.

That's kind of what I figured.

> Having said that, I tend to agree with your latter expectation ("what I
> want").

I'll take a look and see how painful it will be to do that. I'm not even
sure how the merge filepairs are represented by the diff code (since
they are not even pairs, but rather one child with many parents).

> By the way, you guys should really not be looking at the disused
> plumbing-helper -S but instead be advocating its newer and more human
> friendly cousin -G.  1.7.4 is coming ;-).

It's _way_ slower for simple things:

  $ time git log -Sfoo >/dev/null
  real    0m11.550s
  user    0m11.409s
  sys     0m0.116s

  $ time git log -Gfoo >/dev/null
  real    0m25.722s
  user    0m25.442s
  sys     0m0.220s

I for one really like the -S behavior, anyway, but I expect I am in the
minority.

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: gitk "find commit adding/removing string"/possible pickaxe bug?
  2011-01-18 20:50       ` Jeff King
  2011-01-18 21:26         ` Junio C Hamano
@ 2011-01-31 20:00         ` Jeff King
  1 sibling, 0 replies; 10+ messages in thread
From: Jeff King @ 2011-01-31 20:00 UTC (permalink / raw)
  To: Thomas Rast; +Cc: git, Sebastian Hahn

On Tue, Jan 18, 2011 at 03:50:40PM -0500, Jeff King wrote:

> > doesn't; it only gives a diff for the commit that introduced 'bar'.  I
> > guess this makes sense: -S notices that the number of 'bar's is
> > actually the same as in *one* merge parent, hence the merge cannot be
> > all that interesting.  OTOH it still shows the merge commit in the
> > history, which is a bit strange.  --pickaxe-all does not make a
> > difference either;
> 
> Hrm. What I expected[1] to happen would be for the diff machinery to
> look at each filepair individually, one of them to trigger -S, which
> shows the commit, and then to fail to produce a combined diff because we
> threw away the other uninteresting filepair. But in that case,
> --pickaxe-all _should_ show something, as its point is to keep all of
> the filepairs.  And that's clearly not happening.
> 
> So now I don't know what's going on. I'll try to trace through the diff
> machinery and see if that gives a clue.
> 
> -Peff
> 
> [1] That's what I expect, but not necessarily what I want. I think what
> I would want is for it to do a token count of the merge commit, and if
> it fails to match _every_ parent, then it it interesting. Otherwise, the
> content presumably came from that parent.

I looked into this, and sadly the "wanted" behavior I described above is
not easy to do. It turns out that we never actually see the whole 3-way
diff as a single unit in diffcore-pickaxe. Instead, log-tree calls into
diff_tree_combined, which diffs each parent _individually_, including
running diffcore magic on it. And then if one of those appears
interesting, we show the merge.

So diffcore-pickaxe never even knows that we are doing a combined diff.
It just sees the difference between M and M^, and then separately the
difference between M and M^2. This works OK in my example:

  commit() {
    echo $1 >file && git add file && git commit -m $1
  }
  commit base
  commit master
  git checkout -b other HEAD^
  commit other
  git merge master
  commit resolved

as doing "git log -Sother -c" will show both the commit "other" _and_
the merge commit (since it removed "other" in favor of "resolved"). But
you could also construct a case where it isn't true. For example,
consider a case where two sides add the same token, and the resolution
is to keep both. E.g.:

  echo base >file && git add file && git commit -m base
  echo foo bar >file && git commit -a -m master
  git checkout -b other HEAD^
  echo foo baz >file && git commit -a -m other
  git merge master
  (echo foo bar; echo foo baz) >file && git commit -a -m resolved

That shows the merge commit, even though it didn't actually introduce or
delete that token at all. OTOH, it is part of a conflict region, so it
is really difficult to say whether it is interesting or not. I dunno
what the right semantics are (and note that the definition I gave in the
above email would also trigger on this case).

I have the nagging feeling there is another less ambiguous corner case
that is wrong, but I'm having trouble constructing one.


Anyway, the real point is that we can't do anything special to pickaxe
merge commits at the diffcore level without some pretty major diff
surgery. So where does that leave us? You can still get pretty
reasonable results from turning on "-c". I was curious what the CPU cost
was of turning "-c" on by default, and was very surprised by the
results (in git.git):

  $ time git log -Sfoo >/dev/null
  real    0m11.532s
  user    0m11.273s
  sys     0m0.116s

  $ time git log -c -Sfoo >/dev/null
  real    3m7.530s
  user    3m3.991s
  sys     0m2.948s

A 1700% slowdown? Wow. There are ~20000 non-merge commits in git.git and
~4500 merge commits. Each merge commit has two parents (since we don't
tend to octopus merge), each of which is diffed individually. So I'd
expect it to add about 9000 diffs, or roughly 50% on top of the
11-second case.

My guess is that the subtree merges from gitk and git-gui are very
expensive to look at, since from one parent's perspective we will have
created the entire git project from scratch. On every merge. Yikes.

-Peff

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2011-01-31 20:00 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-01-18 16:16 gitk "find commit adding/removing string"/possible pickaxe bug? Sebastian Hahn
2011-01-18 16:44 ` Thomas Rast
2011-01-18 18:39   ` Junio C Hamano
2011-01-18 18:50   ` Jeff King
2011-01-18 20:39     ` Thomas Rast
2011-01-18 20:50       ` Jeff King
2011-01-18 21:26         ` Junio C Hamano
2011-01-18 21:33           ` Jeff King
2011-01-31 20:00         ` Jeff King
2011-01-18 21:16       ` Thomas Rast

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.