From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Bo Yang <struggleyb.nku@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: GSoC draft proposal: Line-level history browser
Date: Sat, 20 Mar 2010 14:36:25 +0100 (CET)
Message-ID: <alpine.DEB.1.00.1003201432410.7596@pacific.mpi-cbg.de>
In-Reply-To: <41f08ee11003200610n2c7c9684l6ca200cffdfdb434@mail.gmail.com>

Hi,

[please do not cull the Cc: list]

On Sat, 20 Mar 2010, Bo Yang wrote:

> I (Johannes) wrote:
>
> > I think that that might be good for starters, but one could imagine 
> > that an integration into "git log" might be even better, so that gitk 
> > can use this without any further changes.
> 
> So, I think adding some new options to 'git log' is preferred.

Yes, I think that this should be the target for the user interface. 
However, the logic should be different enough to merit a completely new 
file for the code (think "git add --interactive").

> > It would be good if the code looked harder after failing with the 
> > simple strategy, such as looking for code removed in other files, 
> > fuzzy matching (optional), and looking for code duplication (i.e. 
> > literal copying, or slightly modified copying).
> >
> > The fuzzy matching might be necessary to catch things like a Java 
> > class moving from one file into another (and changing its name): the 
> > first line changes, but not completely.
> 
> That's really a good idea.
> So, when the program reaches the end of the history of some tracked
> line range, it should not stop immediately. It should then search
> harder and try to find out whether the newly added lines were moved
> there or just copied there from some other place. And this kind of
> search should use fuzzy matching instead of exact string matching.
> 
> But notice that detecting code movement within one commit is much
> more efficient than detecting code copies. So, I think we should add
> an option to control whether we detect such code copies. By default,
> we detect code moves but not code copies. What do you think about this?

Yes, it is much more difficult, and it is more expensive. So: there are 
several steps in the project (you could also call them "milestones"), and 
fuzzy matching end lines would come later than simple code movement. And 
still later than code movement between files.
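
As an illustration of the fuzzy matching discussed above (the Java class
that moves to another file and changes its name), here is a minimal sketch
in Python. It is not git code: the function name find_moved_source, the
0.8 threshold and the sample data are all assumptions made for this
example, and a real implementation would live in git's C sources and
would need a cheaper similarity measure.

```python
from difflib import SequenceMatcher


def find_moved_source(added_block, removed_blocks, threshold=0.8):
    """Return (path, score) for the removed block most similar to
    added_block, or None if nothing reaches the threshold.

    added_block    -- list of lines added by a commit
    removed_blocks -- dict mapping a path to the lines removed from it
    threshold      -- hypothetical cut-off; 1.0 means exact matches only
    """
    added_text = "\n".join(added_block)
    best = None
    for path, lines in removed_blocks.items():
        score = SequenceMatcher(None, "\n".join(lines), added_text).ratio()
        if score >= threshold and (best is None or score > best[1]):
            best = (path, score)
    return best


# Made-up data: class Foo disappears from Foo.java and shows up elsewhere
# as class Bar, so only the first line differs.
removed = {
    "Foo.java": [
        "public class Foo {",
        "    int count;",
        "    void inc() { count++; }",
        "}",
    ],
    "Util.java": [
        "static String join(List<String> xs) {",
        "    return String.join(\",\", xs);",
        "}",
    ],
}
added = [
    "public class Bar {",
    "    int count;",
    "    void inc() { count++; }",
    "}",
]

match = find_moved_source(added, removed)
```

With the default threshold the lookup picks Foo.java as the origin even
though the first line changed; with threshold=1.0 (exact matching) the
same lookup finds nothing, which is the case fuzzy matching is meant to
catch.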

> > Just have a look at the word-level diff (--color-words):
> >
> > http://repo.or.cz/w/git/dscho.git/blob/bc1ed6aafd9ee4937559535c66c8bddf1864bec6:/diff.c#l382
> >
> > You will see that there is a function fn_out_diff_words_aux(), which 
> > is passed to xdi_diff_outf(). That latter function calls xdiff such 
> > that the former function receives a complete line at a time. And this 
> > is what I would suggest doing in the line-level log, too.
> 
> I have looked over the function fn_out_diff_words_aux; this function
> parses each line of a diff held in memory. We can use it to detect the
> diff hunk headers and find the line changes. If you think the
> performance is acceptable, I think using this callback mechanism is
> all right.

Yes, I think that the performance is alright there, it works well enough 
for --color-words.
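
To make the callback idea concrete, here is a rough sketch in Python
rather than git's C. It does not use xdi_diff_outf() or
fn_out_diff_words_aux(); it only mimics the pattern of a consumer that is
handed one diff line at a time, reads the hunk headers, and notes which
changes fall inside a tracked line range. The names HUNK_RE, RangeTracker
and track, and the sample diff, are assumptions made for this example.

```python
import re

# Unified diff hunk header, e.g. "@@ -5 +5,2 @@" (the counts are optional).
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


class RangeTracker:
    """Consume a unified diff line by line and record the changes that
    touch lines start..end (1-based, inclusive) of the new file."""

    def __init__(self, start, end):
        self.start, self.end = start, end
        self.in_hunk = False
        self.new_lineno = 0
        self.hits = []  # (kind, line number in the new file)

    def consume(self, line):
        m = HUNK_RE.match(line)
        if m:
            self.in_hunk = True
            self.new_lineno = int(m.group(3))
            return
        if not self.in_hunk:
            return  # "---" / "+++" file headers before the first hunk
        if line.startswith("+"):
            if self.start <= self.new_lineno <= self.end:
                self.hits.append(("+", self.new_lineno))
            self.new_lineno += 1
        elif line.startswith("-"):
            # A removed line does not advance the new-file counter.
            if self.start <= self.new_lineno <= self.end:
                self.hits.append(("-", self.new_lineno))
        elif line.startswith(" "):
            self.new_lineno += 1
        # Anything else ("\ No newline at end of file") is ignored.


def track(diff_text, start, end):
    tracker = RangeTracker(start, end)
    for line in diff_text.splitlines():
        tracker.consume(line)  # one complete line per call
    return tracker.hits


DIFF = """\
--- a/f
+++ b/f
@@ -1,3 +1,3 @@
 a
-b
+B
 c
@@ -5 +5,2 @@
 e
+f
"""
```

For example, track(DIFF, 2, 2) reports the replacement of line 2, while
track(DIFF, 3, 4) reports nothing, so a line-level log following lines 3-4
could skip this commit.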

Thanks,
Dscho
