All of lore.kernel.org
 help / color / mirror / Atom feed
* blame --reverse selecting wrong commit
@ 2011-05-30  2:21 Shawn Pearce
  2011-05-30  2:33 ` Shawn Pearce
  0 siblings, 1 reply; 6+ messages in thread
From: Shawn Pearce @ 2011-05-30  2:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Although blame shows Stefan Lay removed the block in commit 05fa1713,
this isn't what happened. It was actually removed in commit 2302a6d3
by Christian Halstrick. It looks like blame gets confused around this
section of the JGit history.

Repository URL:  git://egit.eclipse.jgit/jgit.git

$ git blame -L 1080, --reverse 40fa75feb..master --
org.eclipse.jgit.test/tst/org/eclipse/jgit/storage/file/RefDirectoryTest.java
283a60d1 (Shawn O. Pearce 2011-05-26 17:25:59 -0700 1080)       }
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1081)
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1082)       /**
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1083)        *
Kick the timestamp of a local file.
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1084)        * <p>
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1085)        * We
shouldn't have to make these method calls. The cache is using file
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1086)        *
system timestamps, and on many systems unit tests run faster than the
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1087)        *
modification clock. Dumping the cache after we make an edit behind
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1088)        *
RefDirectory's back allows the tests to pass.
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1089)        *
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1090)        * @param name
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1091)        *
       the file in the repository to force a time change on.
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1092)        */
05fa1713 (Stefan Lay      2011-05-24 01:38:59 -0700 1093)
private void BUG_WorkAroundRacyGitIssues(String name) {


-- 
Shawn.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: blame --reverse selecting wrong commit
  2011-05-30  2:21 blame --reverse selecting wrong commit Shawn Pearce
@ 2011-05-30  2:33 ` Shawn Pearce
  2011-05-30  3:11   ` Shawn Pearce
  0 siblings, 1 reply; 6+ messages in thread
From: Shawn Pearce @ 2011-05-30  2:33 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sun, May 29, 2011 at 19:21, Shawn Pearce <spearce@spearce.org> wrote:
> Although blame shows Stefan Lay removed the block in commit 05fa1713,
> this isn't what happened. It was actually removed in commit 2302a6d3
> by Christian Halstrick. It looks like blame gets confused around this
> section of the JGit history.
>
> Repository URL:  git://egit.eclipse.jgit/jgit.git
>
> $ git blame -L 1080, --reverse 40fa75feb..master --
> org.eclipse.jgit.test/tst/org/eclipse/jgit/storage/file/RefDirectoryTest.java

Actually, you can use a narrower history range of 16e810b2..aa05559.

I found this glitch while working on the new JGit implementation of
reverse blame. It just so happens you and I both took the same rule on
a merge commit, follow the parent that has exactly the blob of the
merge itself... passing all blame onto it.

Except that in reverse mode this isn't true. One of the "parents" is
the descendent that deleted this method, so of course its not
identical content. The other parent is a different side-branch that
didn't touch the file at all. Later blame finds these branches merge
together, and lays the blame on the wrong side.

It seems that we may want to avoid "parents" that have identical file
content when in reverse mode if there is another parent that has
different content for the current target path.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: blame --reverse selecting wrong commit
  2011-05-30  2:33 ` Shawn Pearce
@ 2011-05-30  3:11   ` Shawn Pearce
  2011-05-30  6:47     ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Shawn Pearce @ 2011-05-30  3:11 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sun, May 29, 2011 at 19:33, Shawn Pearce <spearce@spearce.org> wrote:
> On Sun, May 29, 2011 at 19:21, Shawn Pearce <spearce@spearce.org> wrote:
>> Although blame shows Stefan Lay removed the block in commit 05fa1713,
>> this isn't what happened. It was actually removed in commit 2302a6d3
>> by Christian Halstrick. It looks like blame gets confused around this
>> section of the JGit history.

Oh. Re-reading the man page for git blame helps. It says blame shows
the last revision that a line exists in, rather than the revision that
removed the line. IMHO, I expected reverse to show me the revision
that deleted (or replaced) that line, so I can inspect the commit
message and the contents of the patch. Showing me one of the potential
parents of that revision seems to be nearly useless.

Rereading commit 85af7929ee ("git-blame --reverse"), it seems you left
this an "exercise for the reader"... and in the past 3 years, no
reader has stepped forward to implement the exercise as a patch to
blame. *sigh*

-- 
Shawn.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: blame --reverse selecting wrong commit
  2011-05-30  3:11   ` Shawn Pearce
@ 2011-05-30  6:47     ` Junio C Hamano
  2011-05-30 17:57       ` Shawn Pearce
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2011-05-30  6:47 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:

> Rereading commit 85af7929ee ("git-blame --reverse"), it seems you left
> this an "exercise for the reader"... and in the past 3 years, no
> reader has stepped forward to implement the exercise as a patch to
> blame. *sigh*

Yeah.

It has always been my opinion that asking for "one commit past the blamed
one" is a undefined request (after all, blame for the line fell on that
commit exactly why the next commit does _not_ have any corresponding
line), so that is why I punted there.

Now we seem to have found one interested reader, eh? ;-)

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: blame --reverse selecting wrong commit
  2011-05-30  6:47     ` Junio C Hamano
@ 2011-05-30 17:57       ` Shawn Pearce
  2011-05-30 20:01         ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Shawn Pearce @ 2011-05-30 17:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Sun, May 29, 2011 at 23:47, Junio C Hamano <gitster@pobox.com> wrote:
> Shawn Pearce <spearce@spearce.org> writes:
>
>> Rereading commit 85af7929ee ("git-blame --reverse"), it seems you left
>> this an "exercise for the reader"... and in the past 3 years, no
>> reader has stepped forward to implement the exercise as a patch to
>> blame. *sigh*
>
> Yeah.
>
> It has always been my opinion that asking for "one commit past the blamed
> one" is a undefined request (after all, blame for the line fell on that
> commit exactly why the next commit does _not_ have any corresponding
> line), so that is why I punted there.

I don't think its undefined. Normally with blame/annotate we want to
discover who put this line here, that is who did the insertion or
replacement that made this line show up in the result file. Under
these circumstances its clear to everyone that this is the commit with
the "+" in its unified diff on that line. :-)

A reverse blame/annotate would want to say who removed this line in
history. That's very well defined by everyone as the commit with the
"-" in its unified diff on that line.

The notion of who deleted the line may seem less well defined than who
added it, because when you flip the history graph around you have
these forks exiting a single revision (an anti-merge as it were)...
and any of those forks could have caused the deletion in question. Or
all of them may have caused it. If more than one is responsible for
the deletion, who do you blame?

The same problem of multiple parties at fault exists in the normal
forward case too. When a normal merge commit is reached, both sides
may have added the exact same line independently... and the merge
result will now contain identical content. Currently blame chooses to
traverse only the first parent history in this case, ignoring the
other parents, and never showing that side branch. Its not any
different than the fork problem during deletion.

> Now we seem to have found one interested reader, eh? ;-)

Well, I implemented --reverse "correctly" in JGit last night. With my
patch series applied, `jgit blame --reverse` will produce annotations
for who deleted the line, rather than the last surviving revision.

The catch is, its slightly more expensive than forward because we pass
blame down *all* paths of a fork, rather than only the one that was
identical. This prevents pruning of side branches like we do in the
normal forward case, making for a larger amount of history to examine
during the reverse traversal. The incremental output also may produce
more than one "source" for a given line, if different side branches
deleted that line independently of one another. We smooth that out in
our final result object for the command line case by showing only the
earliest deletion, and discarding the other candidates.

I still need to write a bunch of unit tests around the Java code, but
the algorithm worked as expected on the section of graph I started
this thread with. It may be worth back-porting to C Git, but now that
--reverse is already shipping with its current "last surviving
revision" results, I'm not sure we can change the results without
causing some major confusion to users.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: blame --reverse selecting wrong commit
  2011-05-30 17:57       ` Shawn Pearce
@ 2011-05-30 20:01         ` Junio C Hamano
  0 siblings, 0 replies; 6+ messages in thread
From: Junio C Hamano @ 2011-05-30 20:01 UTC (permalink / raw)
  To: Shawn Pearce; +Cc: git

Shawn Pearce <spearce@spearce.org> writes:

>> Yeah.
>>
>> It has always been my opinion that asking for "one commit past the blamed
>> one" is a undefined request (after all, blame for the line fell on that
>> commit exactly why the next commit does _not_ have any corresponding
>> line), so that is why I punted there.
>
> I don't think its undefined. Normally with blame/annotate we want to
> discover who put this line here, that is who did the insertion or
> replacement that made this line show up in the result file.

Very true. The question that does not have a defined answer is "what
content was before this commit on that blamed line", and that is very
different from "what commit was there before this commit the line is
blamed for". The topic of this thread is the latter, and I was referring
the former which is offtopic because I was confused and tired.

> The catch is, its slightly more expensive than forward because we pass
> blame down *all* paths of a fork, rather than only the one that was
> identical.

Hmm, I don't know. Isn't it just the matter of running the current
"reverse blame" to find the newest revision that had the line (without any
"dig wider"), and then dig a single level wider by comparing that revision
and all its children to see which child has the line in question removed?

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-05-30 20:01 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-30  2:21 blame --reverse selecting wrong commit Shawn Pearce
2011-05-30  2:33 ` Shawn Pearce
2011-05-30  3:11   ` Shawn Pearce
2011-05-30  6:47     ` Junio C Hamano
2011-05-30 17:57       ` Shawn Pearce
2011-05-30 20:01         ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.