All of lore.kernel.org
 help / color / mirror / Atom feed
* rename tracking and file-name swapping
@ 2009-09-13 11:17 Yuri D'Elia
  2009-09-13 18:14 ` Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Yuri D'Elia @ 2009-09-13 11:17 UTC (permalink / raw)
  To: git

Hi everyone. Does rename tracking recognize two file names being swapped?

% ls -l
total 24
-rw-rw-r--   1 wavexx  wavexx  5952 Sep 13 13:09 file1.txt
-rw-rw-r--   1 wavexx  wavexx  3330 Sep 13 13:09 file2.txt
% mv file1.txt file3.txt 
% mv file2.txt file1.txt
% mv file3.txt file2.txt
% git add file1.txt file2.txt 
% git diff -M --stat --cached
 file1.txt |  150 +++++++++++++++++++++++-------------------------------------
 file2.txt |  150 +++++++++++++++++++++++++++++++++++++-----------------------
 2 files changed, 150 insertions(+), 150 deletions(-)

I would expect at least to see

  file1.txt => file2.txt
  file2.txt => file1.txt

if the contents are totally unchanged.

Am I doing something wrong?

Thanks

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: rename tracking and file-name swapping
  2009-09-13 11:17 rename tracking and file-name swapping Yuri D'Elia
@ 2009-09-13 18:14 ` Junio C Hamano
  2009-09-13 21:13   ` Yuri D'Elia
  2009-09-15  3:34   ` [Bug?] "diff -B --color" output doesn't show space errors Nanako Shiraishi
  0 siblings, 2 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-13 18:14 UTC (permalink / raw)
  To: Yuri D'Elia; +Cc: git

Yuri D'Elia <wavexx@users.sf.net> writes:

> Hi everyone. Does rename tracking recognize two file names being swapped?
>
> % ls -l
> total 24
> -rw-rw-r--   1 wavexx  wavexx  5952 Sep 13 13:09 file1.txt
> -rw-rw-r--   1 wavexx  wavexx  3330 Sep 13 13:09 file2.txt
> % mv file1.txt file3.txt 
> % mv file2.txt file1.txt
> % mv file3.txt file2.txt
> % git add file1.txt file2.txt 
> % git diff -M --stat --cached
>  file1.txt |  150 +++++++++++++++++++++++-------------------------------------
>  file2.txt |  150 +++++++++++++++++++++++++++++++++++++-----------------------
>  2 files changed, 150 insertions(+), 150 deletions(-)

By default, if the pathname that was present in the old version still
appears in the new version, that path is not considered as a candiate
for rename detection.  Only "X used to be there but is gone" and "Y did
not exist but appeared" are paired up and checked if they are similar.

Give the command -B option, too, to break the filepair that does not
disappear.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: rename tracking and file-name swapping
  2009-09-13 18:14 ` Junio C Hamano
@ 2009-09-13 21:13   ` Yuri D'Elia
  2009-09-14  6:44     ` Johannes Sixt
  2009-09-15  3:34   ` [Bug?] "diff -B --color" output doesn't show space errors Nanako Shiraishi
  1 sibling, 1 reply; 7+ messages in thread
From: Yuri D'Elia @ 2009-09-13 21:13 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Bernardo Innocenti

On 13 Sep 2009, at 20:14, Junio C Hamano wrote:

>> % mv file1.txt file3.txt
>> % mv file2.txt file1.txt
>> % mv file3.txt file2.txt
>> % git add file1.txt file2.txt
>> % git diff -M --stat --cached
>>  file1.txt |  150 ++++++++++++++++++++++ 
>> +-------------------------------------
>>  file2.txt |  150 ++++++++++++++++++++++++++++++++++++ 
>> +-----------------------
>>  2 files changed, 150 insertions(+), 150 deletions(-)
>
> By default, if the pathname that was present in the old version still
> appears in the new version, that path is not considered as a candiate
> for rename detection.  Only "X used to be there but is gone" and "Y  
> did
> not exist but appeared" are paired up and checked if they are similar.
>
> Give the command -B option, too, to break the filepair that does not
> disappear.

That does the trick. I'm curious, is there any other use for -B  
besides rename handling?
Any reason of why it isn't a default when --find-copies-harder is in  
effect?

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: rename tracking and file-name swapping
  2009-09-13 21:13   ` Yuri D'Elia
@ 2009-09-14  6:44     ` Johannes Sixt
  0 siblings, 0 replies; 7+ messages in thread
From: Johannes Sixt @ 2009-09-14  6:44 UTC (permalink / raw)
  To: Yuri D'Elia; +Cc: Junio C Hamano, git, Bernardo Innocenti

Yuri D'Elia schrieb:
> On 13 Sep 2009, at 20:14, Junio C Hamano wrote:
>> By default, if the pathname that was present in the old version still
>> appears in the new version, that path is not considered as a candiate
>> for rename detection.  Only "X used to be there but is gone" and "Y did
>> not exist but appeared" are paired up and checked if they are similar.
>>
>> Give the command -B option, too, to break the filepair that does not
>> disappear.
> 
> That does the trick. I'm curious, is there any other use for -B besides
> rename handling?

Yes: It can make patches easier to read (just like -M and -C do) if a file
was completely rewritten. For example, look at b9dfe51c with and without
-B, and also note the "dissimilarity index" in the diff header.

-- Hannes

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [Bug?] "diff -B --color" output doesn't show space errors
  2009-09-13 18:14 ` Junio C Hamano
  2009-09-13 21:13   ` Yuri D'Elia
@ 2009-09-15  3:34   ` Nanako Shiraishi
  2009-09-15  4:44     ` Junio C Hamano
  1 sibling, 1 reply; 7+ messages in thread
From: Nanako Shiraishi @ 2009-09-15  3:34 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Yuri D'Elia, git

Quoting Junio C Hamano <gitster@pobox.com>

> By default, if the pathname that was present in the old version still
> appears in the new version, that path is not considered as a candiate
> for rename detection.  Only "X used to be there but is gone" and "Y did
> not exist but appeared" are paired up and checked if they are similar.
>
> Give the command -B option, too, to break the filepair that does not
> disappear.

I wanted to try this -B option, and wrote a little test program.

While it shows correctly that the file was rewritten, it doesn't
point out various whitespace mistakes in the file anymore.

Is this a bug, or should I give some other options as well?

-- >8 -- cut here -- 8< --
git init

cat >file <<"EOF"
This is an article that will be
completely rewritten in a
later commit.
EOF

git add file

sed -e "s/T/\t/g" -e "s/_/ /g" >file <<"EOF"
An article was written but it was_
later rewritten to be_
a completely different text.
_____
An article was written but it was_
later rewritten to be_
a completely different text.

An article was written but it was_
later rewritten to be_
a completely different text.

Worse yet, the replacement text_
introduces a lot of
_Twhite space errors_
such as SP before HT and trailing
whitespaces, when the file was modified by the 
later commit.

Also there are trailing empty lines at the end of the file.



EOF

git diff --color
git diff --color -B
# end of test program

-- 
Nanako Shiraishi
http://ivory.ap.teacup.com/nanako3/

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [Bug?] "diff -B --color" output doesn't show space errors
  2009-09-15  3:34   ` [Bug?] "diff -B --color" output doesn't show space errors Nanako Shiraishi
@ 2009-09-15  4:44     ` Junio C Hamano
  2009-09-15  5:05       ` [PATCH] diff --whitespace: fix blank lines at end Junio C Hamano
  0 siblings, 1 reply; 7+ messages in thread
From: Junio C Hamano @ 2009-09-15  4:44 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Yuri D'Elia, git

Nanako Shiraishi <nanako3@lavabit.com> writes:

> I wanted to try this -B option, and wrote a little test program.
>
> While it shows correctly that the file was rewritten, it doesn't
> point out various whitespace mistakes in the file anymore.

As you included the trailing blank lines, I assume you are running
'next'.

The code to emit complete rewrite patch hasn't changed much since it was
written, and I do not think it is aware of any whitespace error checking,
let alone the "trailing blank lines", which is pretty new for even the
regular diff codepath.

But a more interesting thing about your test program is that it exposes to
a bug in the new code in next.  Let me cook up a patch to fix that issue
first, and then build probably a few more patches on it to add whitespace
error highlighting to the complete-rewrite codepath.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH] diff --whitespace: fix blank lines at end
  2009-09-15  4:44     ` Junio C Hamano
@ 2009-09-15  5:05       ` Junio C Hamano
  0 siblings, 0 replies; 7+ messages in thread
From: Junio C Hamano @ 2009-09-15  5:05 UTC (permalink / raw)
  To: Nanako Shiraishi; +Cc: Yuri D'Elia, git

The earlier logic tried to colour any and all blank lines that were added
beyond the last blank line in the original, but this was very wrong.  If
you added 96 blank lines, a non-blank line, and then 3 blank lines at the
end, only the last 3 lines should trigger the error, not the earlier 96
blank lines.

We need to also make sure that the lines are after the last non-blank line
in the postimage as well before deciding to paint them.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * This is _not_ about colouring whitespace errors in -B output;
   that is a bit more involved change that needs a few more changes.

 diff.c                  |   74 +++++++++++++++++++++++++++++++++-------------
 t/t4019-diff-wserror.sh |    2 +-
 2 files changed, 54 insertions(+), 22 deletions(-)

diff --git a/diff.c b/diff.c
index 2b285b8..63a3bfc 100644
--- a/diff.c
+++ b/diff.c
@@ -491,8 +491,10 @@ struct emit_callback {
 	struct xdiff_emit_state xm;
 	int color_diff;
 	unsigned ws_rule;
-	int blank_at_eof;
+	int blank_at_eof_in_preimage;
+	int blank_at_eof_in_postimage;
 	int lno_in_preimage;
+	int lno_in_postimage;
 	sane_truncate_fn truncate;
 	const char **label_path;
 	struct diff_words_data *diff_words;
@@ -542,6 +544,17 @@ static void emit_line(FILE *file, const char *set, const char *reset, const char
 		fputc('\n', file);
 }
 
+static int new_blank_line_at_eof(struct emit_callback *ecbdata, const char *line, int len)
+{
+	if (!((ecbdata->ws_rule & WS_BLANK_AT_EOF) &&
+	      ecbdata->blank_at_eof_in_preimage &&
+	      ecbdata->blank_at_eof_in_postimage &&
+	      ecbdata->blank_at_eof_in_preimage <= ecbdata->lno_in_preimage &&
+	      ecbdata->blank_at_eof_in_postimage <= ecbdata->lno_in_postimage))
+		return 0;
+	return ws_blank_line(line + 1, len - 1, ecbdata->ws_rule);
+}
+
 static void emit_add_line(const char *reset, struct emit_callback *ecbdata, const char *line, int len)
 {
 	const char *ws = diff_get_color(ecbdata->color_diff, DIFF_WHITESPACE);
@@ -549,11 +562,8 @@ static void emit_add_line(const char *reset, struct emit_callback *ecbdata, cons
 
 	if (!*ws)
 		emit_line(ecbdata->file, set, reset, line, len);
-	else if ((ecbdata->ws_rule & WS_BLANK_AT_EOF) &&
-		 ecbdata->blank_at_eof &&
-		 (ecbdata->blank_at_eof <= ecbdata->lno_in_preimage) &&
-		 ws_blank_line(line + 1, len - 1, ecbdata->ws_rule))
-		/* Blank line at EOF */
+	else if (new_blank_line_at_eof(ecbdata, line, len))
+		/* Blank line at EOF - paint '+' as well */
 		emit_line(ecbdata->file, ws, reset, line, len);
 	else {
 		/* Emit just the prefix, then the rest. */
@@ -581,12 +591,19 @@ static unsigned long sane_truncate_line(struct emit_callback *ecb, char *line, u
 	return allot - l;
 }
 
-static int find_preimage_lno(const char *line)
+static void find_lno(const char *line, struct emit_callback *ecbdata)
 {
-	char *p = strchr(line, '-');
+	const char *p;
+	ecbdata->lno_in_preimage = 0;
+	ecbdata->lno_in_postimage = 0;
+	p = strchr(line, '-');
 	if (!p)
-		return 0; /* should not happen */
-	return strtol(p+1, NULL, 10);
+		return; /* cannot happen */
+	ecbdata->lno_in_preimage = strtol(p + 1, NULL, 10);
+	p = strchr(p, '+');
+	if (!p)
+		return; /* cannot happen */
+	ecbdata->lno_in_postimage = strtol(p + 1, NULL, 10);
 }
 
 static void fn_out_consume(void *priv, char *line, unsigned long len)
@@ -613,7 +630,7 @@ static void fn_out_consume(void *priv, char *line, unsigned long len)
 
 	if (line[0] == '@') {
 		len = sane_truncate_line(ecbdata, line, len);
-		ecbdata->lno_in_preimage = find_preimage_lno(line);
+		find_lno(line, ecbdata);
 		emit_line(ecbdata->file,
 			  diff_get_color(ecbdata->color_diff, DIFF_FRAGINFO),
 			  reset, line, len);
@@ -651,10 +668,13 @@ static void fn_out_consume(void *priv, char *line, unsigned long len)
 			diff_get_color(ecbdata->color_diff,
 				       line[0] == '-' ? DIFF_FILE_OLD : DIFF_PLAIN);
 		ecbdata->lno_in_preimage++;
+		if (line[0] == ' ')
+			ecbdata->lno_in_postimage++;
 		emit_line(ecbdata->file, color, reset, line, len);
-		return;
+	} else {
+		ecbdata->lno_in_postimage++;
+		emit_add_line(reset, ecbdata, line, len);
 	}
-	emit_add_line(reset, ecbdata, line, len);
 }
 
 static char *pprint_rename(const char *a, const char *b)
@@ -1470,16 +1490,23 @@ static int count_trailing_blank(mmfile_t *mf, unsigned ws_rule)
 	return cnt;
 }
 
-static int adds_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2, unsigned ws_rule)
+static void check_blank_at_eof(mmfile_t *mf1, mmfile_t *mf2,
+			       struct emit_callback *ecbdata)
 {
 	int l1, l2, at;
+	unsigned ws_rule = ecbdata->ws_rule;
 	l1 = count_trailing_blank(mf1, ws_rule);
 	l2 = count_trailing_blank(mf2, ws_rule);
-	if (l2 <= l1)
-		return 0;
-	/* starting where? */
+	if (l2 <= l1) {
+		ecbdata->blank_at_eof_in_preimage = 0;
+		ecbdata->blank_at_eof_in_postimage = 0;
+		return;
+	}
 	at = count_lines(mf1->ptr, mf1->size);
-	return (at - l1) + 1; /* the line number counts from 1 */
+	ecbdata->blank_at_eof_in_preimage = (at - l1) + 1;
+
+	at = count_lines(mf2->ptr, mf2->size);
+	ecbdata->blank_at_eof_in_postimage = (at - l2) + 1;
 }
 
 static void builtin_diff(const char *name_a,
@@ -1572,8 +1599,7 @@ static void builtin_diff(const char *name_a,
 		ecbdata.found_changesp = &o->found_changes;
 		ecbdata.ws_rule = whitespace_rule(name_b ? name_b : name_a);
 		if (ecbdata.ws_rule & WS_BLANK_AT_EOF)
-			ecbdata.blank_at_eof =
-				adds_blank_at_eof(&mf1, &mf2, ecbdata.ws_rule);
+			check_blank_at_eof(&mf1, &mf2, &ecbdata);
 		ecbdata.file = o->file;
 		xpp.flags = XDF_NEED_MINIMAL | o->xdl_opts;
 		xecfg.ctxlen = o->context;
@@ -1699,7 +1725,13 @@ static void builtin_checkdiff(const char *name_a, const char *name_b,
 		xdi_diff(&mf1, &mf2, &xpp, &xecfg, &ecb);
 
 		if (data.ws_rule & WS_BLANK_AT_EOF) {
-			int blank_at_eof = adds_blank_at_eof(&mf1, &mf2, data.ws_rule);
+			struct emit_callback ecbdata;
+			int blank_at_eof;
+
+			ecbdata.ws_rule = data.ws_rule;
+			check_blank_at_eof(&mf1, &mf2, &ecbdata);
+			blank_at_eof = ecbdata.blank_at_eof_in_preimage;
+
 			if (blank_at_eof) {
 				static char *err;
 				if (!err)
diff --git a/t/t4019-diff-wserror.sh b/t/t4019-diff-wserror.sh
index 1e75f1a..3a3663f 100755
--- a/t/t4019-diff-wserror.sh
+++ b/t/t4019-diff-wserror.sh
@@ -193,7 +193,7 @@ test_expect_success 'do not color trailing cr in context' '
 test_expect_success 'color new trailing blank lines' '
 	{ echo a; echo b; echo; echo; } >x &&
 	git add x &&
-	{ echo a; echo; echo; echo; echo; } >x &&
+	{ echo a; echo; echo; echo; echo c; echo; echo; echo; echo; } >x &&
 	git diff --color x >output &&
 	cnt=$(grep "${blue_grep}" output | wc -l) &&
 	test $cnt = 2
-- 
1.6.5.rc1.54.g4aad

^ permalink raw reply related	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-09-15  5:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-09-13 11:17 rename tracking and file-name swapping Yuri D'Elia
2009-09-13 18:14 ` Junio C Hamano
2009-09-13 21:13   ` Yuri D'Elia
2009-09-14  6:44     ` Johannes Sixt
2009-09-15  3:34   ` [Bug?] "diff -B --color" output doesn't show space errors Nanako Shiraishi
2009-09-15  4:44     ` Junio C Hamano
2009-09-15  5:05       ` [PATCH] diff --whitespace: fix blank lines at end Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.