All of lore.kernel.org
 help / color / mirror / Atom feed
* trim_common_tail bug?
@ 2007-12-15 11:16 Jeff King
  2007-12-15 15:51 ` Jeff King
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2007-12-15 11:16 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

Something seems to be not quite right with the trim_common_tail code in
xdiff-interface.c. The diff I just sent for contrib/completion looks
fine (as I sent it) when I comment out trim_common_tail. But using
the current master, it looks like this:

diff --git a/contrib/completion/git-completion.bash b/contrib/completion/git-completion.bash
index 58e0e53..2fd32db 100755
--- a/contrib/completion/git-completion.bash
+++ b/contrib/completion/git-completion.bash
@@ -291,7 +291,7 @@ __git_commands ()
 	for i in $(git help -a|egrep '^ ')
 	do
 		case $i in
-		add--interactive) : plumbing;;
+		*--*)             : plumbing pattern;;
 		applymbox)        : ask gittus;;
 		applypatch)       : ask gittus;;
 		archimport)       : import;;
@@ -307,5 +307,4 @@ __git_commands ()
 		diff-index)       : plumbing;;
 		diff-tree)        : plumbing;;
 		fast-import)      : import;;
-		fsck-objects)     : plumbing;;
-		fetch--tool)      : plumbing;;
\ No newline at end of file
+		fsck-objects)     : plumbing;;
\ No newline at end of file

-Peff

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: trim_common_tail bug?
  2007-12-15 11:16 trim_common_tail bug? Jeff King
@ 2007-12-15 15:51 ` Jeff King
  2007-12-15 17:49   ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2007-12-15 15:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Sat, Dec 15, 2007 at 06:16:21AM -0500, Jeff King wrote:

> Something seems to be not quite right with the trim_common_tail code in
> xdiff-interface.c. The diff I just sent for contrib/completion looks
> fine (as I sent it) when I comment out trim_common_tail. But using
> the current master, it looks like this:

Hrm. It feels like there is an off-by-one when recovering the context
lines; we end the inner loop at the newline, but then we never skip past
it. So we end up counting only one line total, and even that ends up
with a missing newline. Something like this fixes it for me, but
somebody please double check.

diff --git a/xdiff-interface.c b/xdiff-interface.c
index 700def2..eb60e88 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -124,6 +124,8 @@ static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
 	for (i = 0, recovered = 0; recovered < trimmed && i <= ctx; i++) {
 		while (recovered < trimmed && ap[recovered] != '\n')
 			recovered++;
+		if (recovered < trimmed && ap[recovered] == '\n')
+			recovered++;
 	}
 	a->size -= (trimmed - recovered);
 	b->size -= (trimmed - recovered);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: trim_common_tail bug?
  2007-12-15 15:51 ` Jeff King
@ 2007-12-15 17:49   ` Junio C Hamano
  2007-12-15 20:02     ` Jeff King
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2007-12-15 17:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, git

Jeff King <peff@peff.net> writes:

> diff --git a/xdiff-interface.c b/xdiff-interface.c
> index 700def2..eb60e88 100644
> --- a/xdiff-interface.c
> +++ b/xdiff-interface.c
> @@ -124,6 +124,8 @@ static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
>  	for (i = 0, recovered = 0; recovered < trimmed && i <= ctx; i++) {
>  		while (recovered < trimmed && ap[recovered] != '\n')
>  			recovered++;
> +		if (recovered < trimmed && ap[recovered] == '\n')
> +			recovered++;
>  	}
>  	a->size -= (trimmed - recovered);
>  	b->size -= (trimmed - recovered);

Shoot.  Thanks for spotting.

Wouldn't it be enough to do:

  	for (i = 0, recovered = 0; recovered < trimmed && i <= ctx; i++) {
		while (recovered < trimmed && ap[recovered++] != '\n')
	        	; /* nothing */
	}

then (warning: I haven't had my coffee yet)?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: trim_common_tail bug?
  2007-12-15 17:49   ` Junio C Hamano
@ 2007-12-15 20:02     ` Jeff King
  2007-12-16  7:06       ` Jeff King
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2007-12-15 20:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Sat, Dec 15, 2007 at 09:49:06AM -0800, Junio C Hamano wrote:

> Shoot.  Thanks for spotting.
> 
> Wouldn't it be enough to do:
> 
>   	for (i = 0, recovered = 0; recovered < trimmed && i <= ctx; i++) {
> 		while (recovered < trimmed && ap[recovered++] != '\n')
> 	        	; /* nothing */
> 	}
> 
> then (warning: I haven't had my coffee yet)?

Yes, I think that is equivalent. My sleep-deprived brain keeps thinking
there must be a more clear way of writing this whole loop, but it
escapes me at the moment.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: trim_common_tail bug?
  2007-12-15 20:02     ` Jeff King
@ 2007-12-16  7:06       ` Jeff King
  2007-12-16 19:43         ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2007-12-16  7:06 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Sat, Dec 15, 2007 at 03:02:02PM -0500, Jeff King wrote:

> >   	for (i = 0, recovered = 0; recovered < trimmed && i <= ctx; i++) {
> > 		while (recovered < trimmed && ap[recovered++] != '\n')
> > 	        	; /* nothing */
> > 	}
> > 
> > then (warning: I haven't had my coffee yet)?
> 
> Yes, I think that is equivalent. My sleep-deprived brain keeps thinking
> there must be a more clear way of writing this whole loop, but it
> escapes me at the moment.

And this came to me in a dream. :) It fixes the bug, and I think it is a
bit simpler to see the termination conditions in a single loop. But
please double-check correctness, and that you agree that it is more
readable.

---
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 700def2..98b02ed 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -110,7 +110,7 @@ int xdiff_outf(void *priv_, mmbuffer_t *mb, int nbuf)
 static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
 {
 	const int blk = 1024;
-	long trimmed = 0, recovered = 0, i;
+	long trimmed = 0, recovered = 0;
 	char *ap = a->ptr + a->size;
 	char *bp = b->ptr + b->size;
 	long smaller = (a->size < b->size) ? a->size : b->size;
@@ -121,10 +121,9 @@ static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
 		bp -= blk;
 	}
 
-	for (i = 0, recovered = 0; recovered < trimmed && i <= ctx; i++) {
-		while (recovered < trimmed && ap[recovered] != '\n')
-			recovered++;
-	}
+	while (recovered < trimmed && ctx)
+		if (ap[recovered++] == '\n')
+			ctx--;
 	a->size -= (trimmed - recovered);
 	b->size -= (trimmed - recovered);
 }

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: trim_common_tail bug?
  2007-12-16  7:06       ` Jeff King
@ 2007-12-16 19:43         ` Junio C Hamano
  2007-12-16 21:16           ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2007-12-16 19:43 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, git

Jeff King <peff@peff.net> writes:

> On Sat, Dec 15, 2007 at 03:02:02PM -0500, Jeff King wrote:
>
>> >   	for (i = 0, recovered = 0; recovered < trimmed && i <= ctx; i++) {
>> > 		while (recovered < trimmed && ap[recovered++] != '\n')
>> > 	        	; /* nothing */
>> > 	}
>> > 
>> > then (warning: I haven't had my coffee yet)?
>> 
>> Yes, I think that is equivalent. My sleep-deprived brain keeps thinking
>> there must be a more clear way of writing this whole loop, but it
>> escapes me at the moment.
>
> And this came to me in a dream. :) It fixes the bug, and I think it is a
> bit simpler to see the termination conditions in a single loop. But
> please double-check correctness, and that you agree that it is more
> readable.

I wanted to recover to the end of the line that includes the cut-off
point, even when (ctx == 0), to be extra safer, but I do not think it
was necessary.

> ...
>  		bp -= blk;
>  	}
>  
> +	while (recovered < trimmed && ctx)
> +		if (ap[recovered++] == '\n')
> +			ctx--;
>  	a->size -= (trimmed - recovered);
>  	b->size -= (trimmed - recovered);
>  }

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: trim_common_tail bug?
  2007-12-16 19:43         ` Junio C Hamano
@ 2007-12-16 21:16           ` Junio C Hamano
  2007-12-16 21:21             ` Jeff King
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2007-12-16 21:16 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, git

Junio C Hamano <gitster@pobox.com> writes:

> Jeff King <peff@peff.net> writes:
>
>> On Sat, Dec 15, 2007 at 03:02:02PM -0500, Jeff King wrote:
>>
>>> >   	for (i = 0, recovered = 0; recovered < trimmed && i <= ctx; i++) {
>>> > 		while (recovered < trimmed && ap[recovered++] != '\n')
>>> > 	        	; /* nothing */
>>> > 	}
>>> > 
>>> > then (warning: I haven't had my coffee yet)?
>>> 
>>> Yes, I think that is equivalent. My sleep-deprived brain keeps thinking
>>> there must be a more clear way of writing this whole loop, but it
>>> escapes me at the moment.
>>
>> And this came to me in a dream. :) It fixes the bug, and I think it is a
>> bit simpler to see the termination conditions in a single loop. But
>> please double-check correctness, and that you agree that it is more
>> readable.
>
> I wanted to recover to the end of the line that includes the cut-off
> point, even when (ctx == 0), to be extra safer, but I do not think it
> was necessary.

I have to wonder what happens if the last line which is incomplete has
more than block bytes that are identical at the end?  We need to recover
the whole thing to show the correct diff, don't we?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: trim_common_tail bug?
  2007-12-16 21:16           ` Junio C Hamano
@ 2007-12-16 21:21             ` Jeff King
  2007-12-16 21:49               ` [PATCH] Re-re-re-fix common tail optimization Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2007-12-16 21:21 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Sun, Dec 16, 2007 at 01:16:36PM -0800, Junio C Hamano wrote:

> > I wanted to recover to the end of the line that includes the cut-off
> > point, even when (ctx == 0), to be extra safer, but I do not think it
> > was necessary.
> 
> I have to wonder what happens if the last line which is incomplete has
> more than block bytes that are identical at the end?  We need to recover
> the whole thing to show the correct diff, don't we?

No, I think it's right as-is. We forget about the blocks during the
recovery section of the code. IOW, we just keep reading forward until we
find all of the context lines, or we run out of trimmed content. In the
first case, we are fine (we restored the right number of context lines).
In the latter case, we are also fine, because we end up trimming nothing
(IOW, there _weren't_ enough context lines in the first place).

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* [PATCH] Re-re-re-fix common tail optimization
  2007-12-16 21:21             ` Jeff King
@ 2007-12-16 21:49               ` Junio C Hamano
  2007-12-16 22:15                 ` Jeff King
  2007-12-19 14:18                 ` Charles Bailey
  0 siblings, 2 replies; 30+ messages in thread
From: Junio C Hamano @ 2007-12-16 21:49 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, git

Jeff King <peff@peff.net> writes:

> No, I think it's right as-is. We forget about the blocks during the
> recovery section of the code. IOW, we just keep reading forward until we
> find all of the context lines, or we run out of trimmed content. In the
> first case, we are fine (we restored the right number of context lines).
> In the latter case, we are also fine, because we end up trimming nothing
> (IOW, there _weren't_ enough context lines in the first place).

Kind'a embarrassing that both of us cannot get this right without so
many rounds, isn't it?

-- >8 --
Subject: [PATCH] Re-re-re-fix common tail optimization

We need to be extra careful recovering the removed common section, so
that we do not break context nor the changed incomplete line (i.e. the
last line that does not end with LF).

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 t/t4024-diff-optimize-common.sh |   69 +++++++++++++++++++++++++++++++++++++++
 xdiff-interface.c               |    2 +-
 2 files changed, 70 insertions(+), 1 deletions(-)

diff --git a/t/t4024-diff-optimize-common.sh b/t/t4024-diff-optimize-common.sh
new file mode 100755
index 0000000..10405f1
--- /dev/null
+++ b/t/t4024-diff-optimize-common.sh
@@ -0,0 +1,69 @@
+#!/bin/sh
+
+test_description='common tail optimization'
+
+. ./test-lib.sh
+
+z=zzzzzzzz ;# 8
+z="$z$z$z$z$z$z$z$z" ;# 64
+z="$z$z$z$z$z$z$z$z" ;# 512
+z="$z$z$z$z" ;# 2048
+z2047=$(expr "$z" : '.\(.*\)') ; #2047
+
+test_expect_success setup '
+
+	echo "a$z2047" >file-a &&
+	echo "b" >file-b &&
+	echo "$z2047" >>file-b &&
+	echo "c$z2047" | tr -d "\012" >file-c &&
+	echo "d" >file-d &&
+	echo "$z2047" | tr -d "\012" >>file-d &&
+
+	git add file-a file-b file-c file-d &&
+
+	echo "A$z2047" >file-a &&
+	echo "B" >file-b &&
+	echo "$z2047" >>file-b &&
+	echo "C$z2047" | tr -d "\012" >file-c &&
+	echo "D" >file-d &&
+	echo "$z2047" | tr -d "\012" >>file-d
+
+'
+
+echo >expect <<\EOF
+diff --git a/file-a b/file-a
+--- a/file-a
++++ b/file-a
+@@ -1 +1 @@
+-aZ
++AZ
+diff --git a/file-b b/file-b
+--- a/file-b
++++ b/file-b
+@@ -1 +1 @@
+-b
++B
+diff --git a/file-c b/file-c
+--- a/file-c
++++ b/file-c
+@@ -1 +1 @@
+-cZ
+\ No newline at end of file
++CZ
+\ No newline at end of file
+diff --git a/file-d b/file-d
+--- a/file-d
++++ b/file-d
+@@ -1 +1 @@
+-d
++D
+EOF
+
+test_expect_success 'diff -U0' '
+
+	git diff -U0 | sed -e "/^index/d" -e "s/$z2047/Z/g" >actual &&
+	diff -u expect actual
+
+'
+
+test_done
diff --git a/xdiff-interface.c b/xdiff-interface.c
index 98b02ed..9ee877c 100644
--- a/xdiff-interface.c
+++ b/xdiff-interface.c
@@ -121,7 +121,7 @@ static void trim_common_tail(mmfile_t *a, mmfile_t *b, long ctx)
 		bp -= blk;
 	}
 
-	while (recovered < trimmed && ctx)
+	while (recovered < trimmed && 0 <= ctx)
 		if (ap[recovered++] == '\n')
 			ctx--;
 	a->size -= (trimmed - recovered);

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-16 21:49               ` [PATCH] Re-re-re-fix common tail optimization Junio C Hamano
@ 2007-12-16 22:15                 ` Jeff King
  2007-12-16 22:23                   ` Junio C Hamano
  2007-12-19 14:18                 ` Charles Bailey
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff King @ 2007-12-16 22:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Sun, Dec 16, 2007 at 01:49:17PM -0800, Junio C Hamano wrote:

> Kind'a embarrassing that both of us cannot get this right without so
> many rounds, isn't it?

Good thing we can just delete the emails and nobody will be the wiser...

> +echo >expect <<\EOF

This would probably work better as 'cat'.

> +test_expect_success 'diff -U0' '
> +
> +	git diff -U0 | sed -e "/^index/d" -e "s/$z2047/Z/g" >actual &&
> +	diff -u expect actual

Aren't we using "git diff" for the second diff there nowadays?

> -	while (recovered < trimmed && ctx)
> +	while (recovered < trimmed && 0 <= ctx)
>  		if (ap[recovered++] == '\n')
>  			ctx--;
>  	a->size -= (trimmed - recovered);

Oops (I think maybe I misunderstood what you were asking in the last
email). This fix is correct, though the code is now kind of subtle. I
think it would be more obvious as:

  /* finish off any changed line we are in */
  while (recovered < trimmed && ap[recovered++] != '\n')
    /* nothing */;
  /* recover context lines */
  while (recovered < trimmed && ctx)
    if (ap[recovered++] == '\n')
      ctx--;

Your loop does both actions in the same loop, which is correct, but took
me 10 minutes of thinking and staring to realize what was going on.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-16 22:15                 ` Jeff King
@ 2007-12-16 22:23                   ` Junio C Hamano
  2007-12-16 22:28                     ` Junio C Hamano
  2007-12-16 22:29                     ` Jeff King
  0 siblings, 2 replies; 30+ messages in thread
From: Junio C Hamano @ 2007-12-16 22:23 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, git

Jeff King <peff@peff.net> writes:

> This would probably work better as 'cat'.

Yeah, I amended it without adding another "re-" to the title ;-)  The
result has been already pushed out.

>> +test_expect_success 'diff -U0' '
>> +
>> +	git diff -U0 | sed -e "/^index/d" -e "s/$z2047/Z/g" >actual &&
>> +	diff -u expect actual
>
> Aren't we using "git diff" for the second diff there nowadays?

Some people seem to think that is a good idea, but I generally do not
like using "git diff" between expect and actual (both untracked) inside
tests.  The last "diff" is about validating what git does and using "git
diff" there would make the test meaningless when "git diff" itself is
broken.

This is especially so because comparison between untracked files is a
bolted-on afterthought and I am least confident about among the
codepaths in the whole "git diff" (it is not even my nor Linus's code).

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-16 22:23                   ` Junio C Hamano
@ 2007-12-16 22:28                     ` Junio C Hamano
  2007-12-16 22:29                     ` Jeff King
  1 sibling, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2007-12-16 22:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Linus Torvalds, git

Junio C Hamano <gitster@pobox.com> writes:

>>> +test_expect_success 'diff -U0' '
>>> +
>>> +	git diff -U0 | sed -e "/^index/d" -e "s/$z2047/Z/g" >actual &&
>>> +	diff -u expect actual
>>
>> Aren't we using "git diff" for the second diff there nowadays?
>
> Some people seem to think that is a good idea, but I generally do not
> like using "git diff" between expect and actual (both untracked) inside
> tests.  The last "diff" is about validating what git does and using "git
> diff" there would make the test meaningless when "git diff" itself is
> broken.
>
> This is especially so because comparison between untracked files is a
> bolted-on afterthought and I am least confident about among the
> codepaths in the whole "git diff" (it is not even my nor Linus's code).

Side note.  The "confidence" I am talking about the above is not about
the correct-working of the current code.  It seems to work fine.

It is about the fact it was bolted on rather than designed in from the
beginning---it is much likely to subtly break than other parts that are
much more integrated when you change seemingly unrelated thing like
git-dir discovery and rename detection.

IOW, the confidence is about the fixability/maintainability when
somebody breaks it.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-16 22:23                   ` Junio C Hamano
  2007-12-16 22:28                     ` Junio C Hamano
@ 2007-12-16 22:29                     ` Jeff King
  2007-12-17  8:42                       ` Wincent Colaiuta
  1 sibling, 1 reply; 30+ messages in thread
From: Jeff King @ 2007-12-16 22:29 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, git

On Sun, Dec 16, 2007 at 02:23:27PM -0800, Junio C Hamano wrote:

> Yeah, I amended it without adding another "re-" to the title ;-)  The
> result has been already pushed out.

OK. Too late, but it has my ack. ;)

> > Aren't we using "git diff" for the second diff there nowadays?
> 
> Some people seem to think that is a good idea, but I generally do not
> like using "git diff" between expect and actual (both untracked) inside
> tests.  The last "diff" is about validating what git does and using "git
> diff" there would make the test meaningless when "git diff" itself is
> broken.

I think that is a valid concern. But ISTR that were some issues with
using GNU diff. Commit 5bd74506 mentions getting rid of the dependency
in all existing tests, but gives no reason.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-16 22:29                     ` Jeff King
@ 2007-12-17  8:42                       ` Wincent Colaiuta
  2007-12-17 10:39                         ` Johannes Schindelin
  0 siblings, 1 reply; 30+ messages in thread
From: Wincent Colaiuta @ 2007-12-17  8:42 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Linus Torvalds, git

El 16/12/2007, a las 23:29, Jeff King escribió:

> On Sun, Dec 16, 2007 at 02:23:27PM -0800, Junio C Hamano wrote:
>
>> Yeah, I amended it without adding another "re-" to the title ;-)  The
>> result has been already pushed out.
>
> OK. Too late, but it has my ack. ;)
>
>>> Aren't we using "git diff" for the second diff there nowadays?
>>
>> Some people seem to think that is a good idea, but I generally do not
>> like using "git diff" between expect and actual (both untracked)  
>> inside
>> tests.  The last "diff" is about validating what git does and using  
>> "git
>> diff" there would make the test meaningless when "git diff" itself is
>> broken.
>
> I think that is a valid concern. But ISTR that were some issues with
> using GNU diff. Commit 5bd74506 mentions getting rid of the dependency
> in all existing tests, but gives no reason.

I'd say it's safe and sensible to use "git diff" in all tests *except*  
for tests of "git diff" itself.

Wincent

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17  8:42                       ` Wincent Colaiuta
@ 2007-12-17 10:39                         ` Johannes Schindelin
  2007-12-17 10:59                           ` Wincent Colaiuta
  0 siblings, 1 reply; 30+ messages in thread
From: Johannes Schindelin @ 2007-12-17 10:39 UTC (permalink / raw)
  To: Wincent Colaiuta; +Cc: Jeff King, Junio C Hamano, Linus Torvalds, git

Hi,

On Mon, 17 Dec 2007, Wincent Colaiuta wrote:

> El 16/12/2007, a las 23:29, Jeff King escribi?:
> 
> > On Sun, Dec 16, 2007 at 02:23:27PM -0800, Junio C Hamano wrote:
> > 
> > > > Aren't we using "git diff" for the second diff there nowadays?
> > > 
> > > Some people seem to think that is a good idea, but I generally do 
> > > not like using "git diff" between expect and actual (both untracked) 
> > > inside tests.  The last "diff" is about validating what git does and 
> > > using "git diff" there would make the test meaningless when "git 
> > > diff" itself is broken.
> > 
> > I think that is a valid concern. But ISTR that were some issues with 
> > using GNU diff. Commit 5bd74506 mentions getting rid of the dependency 
> > in all existing tests, but gives no reason.
> 
> I'd say it's safe and sensible to use "git diff" in all tests *except* 
> for tests of "git diff" itself.

To the contrary.  It has to test "git diff", so it must use "git diff".  
As for the reference output: we include the expected diffs as texts, and 
therefore do not really have to rely on having GNU diff installed.

Besides, we cannot even test the goodies like "rename from" by comparing 
to GNU diff's output.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17 10:39                         ` Johannes Schindelin
@ 2007-12-17 10:59                           ` Wincent Colaiuta
  2007-12-17 11:57                             ` Johannes Schindelin
  0 siblings, 1 reply; 30+ messages in thread
From: Wincent Colaiuta @ 2007-12-17 10:59 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, Junio C Hamano, Linus Torvalds, git

El 17/12/2007, a las 11:39, Johannes Schindelin escribió:

> Hi,
>
> On Mon, 17 Dec 2007, Wincent Colaiuta wrote:
>
>> El 16/12/2007, a las 23:29, Jeff King escribi?:
>>
>>> On Sun, Dec 16, 2007 at 02:23:27PM -0800, Junio C Hamano wrote:
>>>
>>>>> Aren't we using "git diff" for the second diff there nowadays?
>>>>
>>>> Some people seem to think that is a good idea, but I generally do
>>>> not like using "git diff" between expect and actual (both  
>>>> untracked)
>>>> inside tests.  The last "diff" is about validating what git does  
>>>> and
>>>> using "git diff" there would make the test meaningless when "git
>>>> diff" itself is broken.
>>>
>>> I think that is a valid concern. But ISTR that were some issues with
>>> using GNU diff. Commit 5bd74506 mentions getting rid of the  
>>> dependency
>>> in all existing tests, but gives no reason.
>>
>> I'd say it's safe and sensible to use "git diff" in all tests  
>> *except*
>> for tests of "git diff" itself.
>
> To the contrary.  It has to test "git diff", so it must use "git  
> diff".

Obviously, you can only test "git diff" by actually running it.

> As for the reference output: we include the expected diffs as texts,  
> and
> therefore do not really have to rely on having GNU diff installed.
>
> Besides, we cannot even test the goodies like "rename from" by  
> comparing
> to GNU diff's output.

Sorry, I didn't make myself clear. That's not what I was proposing at  
all. I was talking about this kind of example:

> +	git diff -U0 | sed -e "/^index/d" -e "s/$z2047/Z/g" >actual &&
> +	diff -u expect actual


First line uses "git diff", if the second line uses "git diff" as well  
and "git diff" happens to be broken then you're using a broken tool to  
test a broken tool, as Junio already pointed out. I presumed that if  
you had read the whole thread then that would be obvious (look at the  
quoted section from Junio above).

In the example you're not interested in the details of the output  
format, only in the exit status, so it is appropriate to use diff  
instead of "git diff".

Wincent

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17 10:59                           ` Wincent Colaiuta
@ 2007-12-17 11:57                             ` Johannes Schindelin
  2007-12-17 12:08                               ` Wincent Colaiuta
  0 siblings, 1 reply; 30+ messages in thread
From: Johannes Schindelin @ 2007-12-17 11:57 UTC (permalink / raw)
  To: Wincent Colaiuta; +Cc: Jeff King, Junio C Hamano, Linus Torvalds, git

Hi,

On Mon, 17 Dec 2007, Wincent Colaiuta wrote:

> El 17/12/2007, a las 11:39, Johannes Schindelin escribi?:
> 
> > On Mon, 17 Dec 2007, Wincent Colaiuta wrote:
> > 
> > > El 16/12/2007, a las 23:29, Jeff King escribi?:
> > > 
> > > > On Sun, Dec 16, 2007 at 02:23:27PM -0800, Junio C Hamano wrote:
> > > > 
> > > > > > Aren't we using "git diff" for the second diff there nowadays?
> > > > > 
> > > > > Some people seem to think that is a good idea, but I generally 
> > > > > do not like using "git diff" between expect and actual (both 
> > > > > untracked) inside tests.  The last "diff" is about validating 
> > > > > what git does and using "git diff" there would make the test 
> > > > > meaningless when "git diff" itself is broken.
> > > > 
> > > > I think that is a valid concern. But ISTR that were some issues 
> > > > with using GNU diff. Commit 5bd74506 mentions getting rid of the 
> > > > dependency in all existing tests, but gives no reason.
> > > 
> > > I'd say it's safe and sensible to use "git diff" in all tests 
> > > *except* for tests of "git diff" itself.
> > 
> > To the contrary.  It has to test "git diff", so it must use "git 
> > diff".
> 
> Obviously, you can only test "git diff" by actually running it.

Sorry, I should have made clear that I meant this as funny:

	;-)

> > As for the reference output: we include the expected diffs as texts, 
> > and therefore do not really have to rely on having GNU diff installed.
> > 
> > Besides, we cannot even test the goodies like "rename from" by 
> > comparing to GNU diff's output.
> 
> Sorry, I didn't make myself clear. That's not what I was proposing at 
> all. I was talking about this kind of example:
> 
> > + git diff -U0 | sed -e "/^index/d" -e "s/$z2047/Z/g" >actual &&
> > + diff -u expect actual
> 
> First line uses "git diff", if the second line uses "git diff" as well 
> and "git diff" happens to be broken then you're using a broken tool to 
> test a broken tool, as Junio already pointed out.

Hmm.  There is some chicken-and-egg problem here (I read the thread, but 
did not really see a problem, as I assumed that _other_ tests would assure 
that "git diff --no-index" works as expected).

But as at least one released version of GNU diff has a pretty serious bug, 
I would rather not rely too much on diff.  (BTW this was the reason I 
wanted --no-index so badly.)

So yeah, the second "diff" cannot be "git diff".  Maybe "cmp", but not 
"git diff".

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17 11:57                             ` Johannes Schindelin
@ 2007-12-17 12:08                               ` Wincent Colaiuta
  2007-12-17 12:12                                 ` Jeff King
  2007-12-17 12:20                                 ` Johannes Sixt
  0 siblings, 2 replies; 30+ messages in thread
From: Wincent Colaiuta @ 2007-12-17 12:08 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Jeff King, Junio C Hamano, Linus Torvalds, git

El 17/12/2007, a las 12:57, Johannes Schindelin escribió:

> Hmm.  There is some chicken-and-egg problem here (I read the thread,  
> but
> did not really see a problem, as I assumed that _other_ tests would  
> assure
> that "git diff --no-index" works as expected).
>
> But as at least one released version of GNU diff has a pretty  
> serious bug,
> I would rather not rely too much on diff.  (BTW this was the reason I
> wanted --no-index so badly.)
>
> So yeah, the second "diff" cannot be "git diff".  Maybe "cmp", but not
> "git diff".

Well cmp would be fine as well, seeing all we want is a boolean "is  
this the same or not" answer. (I'm not familiar with the GNU diff bug  
you speak of, but was it so bad that it couldn't even get *that*  
answer right?)

Cheers,
Wincent

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17 12:08                               ` Wincent Colaiuta
@ 2007-12-17 12:12                                 ` Jeff King
  2007-12-17 12:20                                 ` Johannes Sixt
  1 sibling, 0 replies; 30+ messages in thread
From: Jeff King @ 2007-12-17 12:12 UTC (permalink / raw)
  To: Wincent Colaiuta; +Cc: Johannes Schindelin, Junio C Hamano, Linus Torvalds, git

On Mon, Dec 17, 2007 at 01:08:45PM +0100, Wincent Colaiuta wrote:

>> So yeah, the second "diff" cannot be "git diff".  Maybe "cmp", but not
>> "git diff".
>
> Well cmp would be fine as well, seeing all we want is a boolean "is this the 
> same or not" answer. (I'm not familiar with the GNU diff bug you speak of, 
> but was it so bad that it couldn't even get *that* answer right?)

Personally I find it useful to generate the diff output when viewing
with "-v". But if there is a real reason to use cmp over diff in the
script, one can always manually go into the trash directory and run
diff.

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17 12:08                               ` Wincent Colaiuta
  2007-12-17 12:12                                 ` Jeff King
@ 2007-12-17 12:20                                 ` Johannes Sixt
  2007-12-17 12:51                                   ` Johannes Schindelin
  1 sibling, 1 reply; 30+ messages in thread
From: Johannes Sixt @ 2007-12-17 12:20 UTC (permalink / raw)
  To: Wincent Colaiuta
  Cc: Johannes Schindelin, Jeff King, Junio C Hamano, Linus Torvalds, git

Wincent Colaiuta schrieb:
> El 17/12/2007, a las 12:57, Johannes Schindelin escribió:
> 
>> Hmm.  There is some chicken-and-egg problem here (I read the thread, but
>> did not really see a problem, as I assumed that _other_ tests would
>> assure
>> that "git diff --no-index" works as expected).
>>
>> But as at least one released version of GNU diff has a pretty serious
>> bug,
>> I would rather not rely too much on diff.  (BTW this was the reason I
>> wanted --no-index so badly.)
>>
>> So yeah, the second "diff" cannot be "git diff".  Maybe "cmp", but not
>> "git diff".
> 
> Well cmp would be fine as well, seeing all we want is a boolean "is this
> the same or not" answer. (I'm not familiar with the GNU diff bug you
> speak of, but was it so bad that it couldn't even get *that* answer right?)

Heh, there's at least one distribution out there (Suse 10.1) that comes with
a *cmp* that doesn't get that answer right if its output is connected to
/dev/null, which is the case when you simply 'make test'.

-- Hannes

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17 12:20                                 ` Johannes Sixt
@ 2007-12-17 12:51                                   ` Johannes Schindelin
  2007-12-17 17:58                                     ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Johannes Schindelin @ 2007-12-17 12:51 UTC (permalink / raw)
  To: Johannes Sixt
  Cc: Wincent Colaiuta, Jeff King, Junio C Hamano, Linus Torvalds, git

Hi,

On Mon, 17 Dec 2007, Johannes Sixt wrote:

> Wincent Colaiuta schrieb:
> > El 17/12/2007, a las 12:57, Johannes Schindelin escribi?:
> > 
> >> Hmm.  There is some chicken-and-egg problem here (I read the thread, but
> >> did not really see a problem, as I assumed that _other_ tests would
> >> assure
> >> that "git diff --no-index" works as expected).
> >>
> >> But as at least one released version of GNU diff has a pretty serious
> >> bug,
> >> I would rather not rely too much on diff.  (BTW this was the reason I
> >> wanted --no-index so badly.)
> >>
> >> So yeah, the second "diff" cannot be "git diff".  Maybe "cmp", but not
> >> "git diff".
> > 
> > Well cmp would be fine as well, seeing all we want is a boolean "is 
> > this the same or not" answer. (I'm not familiar with the GNU diff bug 
> > you speak of, but was it so bad that it couldn't even get *that* 
> > answer right?)
> 
> Heh, there's at least one distribution out there (Suse 10.1) that comes 
> with a *cmp* that doesn't get that answer right if its output is 
> connected to /dev/null, which is the case when you simply 'make test'.

Yeah.  That's what it was.  I even posted a patch to GNU diff, only to 
find out that it was already fixed in CVS.  Sigh.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17 12:51                                   ` Johannes Schindelin
@ 2007-12-17 17:58                                     ` Junio C Hamano
  2007-12-17 18:05                                       ` Johannes Schindelin
  0 siblings, 1 reply; 30+ messages in thread
From: Junio C Hamano @ 2007-12-17 17:58 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Johannes Sixt, Wincent Colaiuta, Jeff King, Linus Torvalds, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

>> >> But as at least one released version of GNU diff has a pretty serious
>> >> bug,
>> >> I would rather not rely too much on diff.  (BTW this was the reason I
>> >> wanted --no-index so badly.)
>> >>
>> >> So yeah, the second "diff" cannot be "git diff".  Maybe "cmp", but not
>> >> "git diff".
>> > 
>> > Well cmp would be fine as well, seeing all we want is a boolean "is 
>> > this the same or not" answer. (I'm not familiar with the GNU diff bug 
>> > you speak of, but was it so bad that it couldn't even get *that* 
>> > answer right?)
>> 
>> Heh, there's at least one distribution out there (Suse 10.1) that comes 
>> with a *cmp* that doesn't get that answer right if its output is 
>> connected to /dev/null, which is the case when you simply 'make test'.
>
> Yeah.  That's what it was.  I even posted a patch to GNU diff, only to 
> find out that it was already fixed in CVS.  Sigh.

Wait.  Are you still talking about diff or cmp, or are you saying that
your earlier statement about avoiding GNU diff due to its bugs is
unfounded?

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-17 17:58                                     ` Junio C Hamano
@ 2007-12-17 18:05                                       ` Johannes Schindelin
  0 siblings, 0 replies; 30+ messages in thread
From: Johannes Schindelin @ 2007-12-17 18:05 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Johannes Sixt, Wincent Colaiuta, Jeff King, Linus Torvalds, git

Hi,

On Mon, 17 Dec 2007, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> 
> >> >> But as at least one released version of GNU diff has a pretty 
> >> >> serious bug, I would rather not rely too much on diff.  (BTW this 
> >> >> was the reason I wanted --no-index so badly.)
> >> >>
> >> >> So yeah, the second "diff" cannot be "git diff".  Maybe "cmp", but 
> >> >> not "git diff".
> >> > 
> >> > Well cmp would be fine as well, seeing all we want is a boolean "is 
> >> > this the same or not" answer. (I'm not familiar with the GNU diff 
> >> > bug you speak of, but was it so bad that it couldn't even get 
> >> > *that* answer right?)
> >> 
> >> Heh, there's at least one distribution out there (Suse 10.1) that 
> >> comes with a *cmp* that doesn't get that answer right if its output 
> >> is connected to /dev/null, which is the case when you simply 'make 
> >> test'.
> >
> > Yeah.  That's what it was.  I even posted a patch to GNU diff, only to 
> > find out that it was already fixed in CVS.  Sigh.
> 
> Wait.  Are you still talking about diff or cmp, or are you saying that 
> your earlier statement about avoiding GNU diff due to its bugs is 
> unfounded?

I do not remember offhand.  I only remembered that it was the GNU diff 
package.  But maybe it was only the "cmp" tool.  Symptoms were that tests 
were failing without "-i", but succeeding with "-i".

Okay, I found the mail:

http://article.gmane.org/gmane.comp.version-control.git/25107/match=cmp

Seems it was only "cmp".

Sorry for the noise,
Dscho

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-16 21:49               ` [PATCH] Re-re-re-fix common tail optimization Junio C Hamano
  2007-12-16 22:15                 ` Jeff King
@ 2007-12-19 14:18                 ` Charles Bailey
  2007-12-19 14:27                   ` Jeff King
  1 sibling, 1 reply; 30+ messages in thread
From: Charles Bailey @ 2007-12-19 14:18 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Linus Torvalds, git

On Sun, Dec 16, 2007 at 01:49:17PM -0800, Junio C Hamano wrote:
> 
> Kind'a embarrassing that both of us cannot get this right without so
> many rounds, isn't it?
> 
> -- >8 --
> Subject: [PATCH] Re-re-re-fix common tail optimization
> 
> We need to be extra careful recovering the removed common section, so
> that we do not break context nor the changed incomplete line (i.e. the
> last line that does not end with LF).
> 
> Signed-off-by: Junio C Hamano <gitster@pobox.com>

Just to add to the woe on this one, this test breaks on MacOS X due to
the pattern length limitations of the default sed on that platform.

Interested in a patch?

Charles.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-19 14:18                 ` Charles Bailey
@ 2007-12-19 14:27                   ` Jeff King
  2007-12-19 14:37                     ` Charles Bailey
  0 siblings, 1 reply; 30+ messages in thread
From: Jeff King @ 2007-12-19 14:27 UTC (permalink / raw)
  To: Charles Bailey; +Cc: Junio C Hamano, Linus Torvalds, git

On Wed, Dec 19, 2007 at 02:18:45PM +0000, Charles Bailey wrote:

> Just to add to the woe on this one, this test breaks on MacOS X due to
> the pattern length limitations of the default sed on that platform.
> 
> Interested in a patch?

Somebody beat you to it. :) Can you confirm that the fix in

  <1198007158-27576-1-git-send-email-win@wincent.com>

works for you?

-Peff

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-19 14:27                   ` Jeff King
@ 2007-12-19 14:37                     ` Charles Bailey
  2007-12-20  0:21                       ` Junio C Hamano
  0 siblings, 1 reply; 30+ messages in thread
From: Charles Bailey @ 2007-12-19 14:37 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Linus Torvalds, git

On Wed, Dec 19, 2007 at 09:27:15AM -0500, Jeff King wrote:
> On Wed, Dec 19, 2007 at 02:18:45PM +0000, Charles Bailey wrote:
> 
> > Just to add to the woe on this one, this test breaks on MacOS X due to
> > the pattern length limitations of the default sed on that platform.
> > 
> > Interested in a patch?
> 
> Somebody beat you to it. :) Can you confirm that the fix in
> 
>   <1198007158-27576-1-git-send-email-win@wincent.com>
> 
> works for you?
> 
> -Peff


Ooh, the excitement, I've never had the opportunity to "git am"
before.

Yes, I can confirm.  It works for me.

For reference I had the following, which is fewer lines but not
inherently better in any other way.

Charles.



diff --git a/t/t4024-diff-optimize-common.sh b/t/t4024-diff-optimize-common.sh
index 20fe87b..ffb2c8f 100755
--- a/t/t4024-diff-optimize-common.sh
+++ b/t/t4024-diff-optimize-common.sh
@@ -7,8 +7,9 @@ test_description='common tail optimization'
 z=zzzzzzzz ;# 8
 z="$z$z$z$z$z$z$z$z" ;# 64
 z="$z$z$z$z$z$z$z$z" ;# 512
-z="$z$z$z$z" ;# 2048
-z2047=$(expr "$z" : '.\(.*\)') ; #2047
+z="$z$z" ;# 1024
+z1023=$(expr "$z" : '.\(.*\)') ; #1023
+z2047=$z$z1023
 
 test_expect_success setup '
 
@@ -35,8 +36,8 @@ diff --git a/file-a b/file-a
 --- a/file-a
 +++ b/file-a
 @@ -1 +1 @@
--aZ
-+AZ
+-aZZz
++AZZz
 diff --git a/file-b b/file-b
 --- a/file-b
 +++ b/file-b
@@ -47,9 +48,9 @@ diff --git a/file-c b/file-c
 --- a/file-c
 +++ b/file-c
 @@ -1 +1 @@
--cZ
+-cZZz
 \ No newline at end of file
-+CZ
++CZZz
 \ No newline at end of file
 diff --git a/file-d b/file-d
 --- a/file-d
@@ -61,7 +62,7 @@ EOF
 
 test_expect_success 'diff -U0' '
 
-	git diff -U0 | sed -e "/^index/d" -e "s/$z2047/Z/g" >actual &&
+	git diff -U0 | sed -e "/^index/d" -e "s/$z1023/Z/g" >actual &&
 	diff -u expect actual
 
 '
-- 
1.5.3.7.11.ga3d7

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-19 14:37                     ` Charles Bailey
@ 2007-12-20  0:21                       ` Junio C Hamano
  2007-12-20  1:38                         ` Wincent Colaiuta
  2007-12-20  9:23                         ` Charles Bailey
  0 siblings, 2 replies; 30+ messages in thread
From: Junio C Hamano @ 2007-12-20  0:21 UTC (permalink / raw)
  To: Charles Bailey; +Cc: Wincent Colaiuta, Jeff King, Linus Torvalds, git

Charles Bailey <charles@hashpling.org> writes:

> On Wed, Dec 19, 2007 at 09:27:15AM -0500, Jeff King wrote:
> ...
>> Somebody beat you to it. :) Can you confirm that the fix in
>> 
>>   <1198007158-27576-1-git-send-email-win@wincent.com>
>> 
>> works for you?
>
> Ooh, the excitement, I've never had the opportunity to "git am"
> before.

Well, if Wincent had sent the patch as plain text without MIME
quoted-printable, you did not have to even have the excitement of
running "git am", but a simple "git apply" would have sufficed.

    ...
    How about the following? This swaps in perl in place of sed, which we c=
    an
    hopefully rely upon to work across platforms.

    Cheers,
    Wincent

    -------- 8< --------
    =46ix tests for broken sed on Leopard

    The newly-added common-tail-optimization test fails on Leopard because
    the broken sed implementation bails with a spurious "unterminated
    substitute pattern" error because of the length of one of the
    arguments.

Just for future reference, I do not mind avoiding "top-post" style patch
using "-- >8 --" cut markers, but do not have three or more dashes in
the cut marker.  "mailinfo" will split the message at that line and the
portion intended as the proposed commit log message will be taken as
part of garbage in front of the patch.  Using "-- >8 --" (two dashes,
space, scissors, ...) keeps both the front matter and the log text
together so that "am -i" or "commit --amend" can work on both parts.


By the way, how does this rewrite look?

-- >8 --
t4024: fix test script to use simpler sed pattern

The earlier test stripped away expected number of 'z' but the output
would have been very hard to read once somebody broke the common tail
optimization.  Instead, count the number of 'z' and show it, to help
diagnosing the problem better in the future.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 t/t4024-diff-optimize-common.sh |  158 ++++++++++++++++++++++++++++++---------
 1 files changed, 123 insertions(+), 35 deletions(-)

diff --git a/t/t4024-diff-optimize-common.sh b/t/t4024-diff-optimize-common.sh
index 20fe87b..3c66102 100755
--- a/t/t4024-diff-optimize-common.sh
+++ b/t/t4024-diff-optimize-common.sh
@@ -10,58 +10,146 @@ z="$z$z$z$z$z$z$z$z" ;# 512
 z="$z$z$z$z" ;# 2048
 z2047=$(expr "$z" : '.\(.*\)') ; #2047
 
-test_expect_success setup '
-
-	echo "a$z2047" >file-a &&
-	echo "b" >file-b &&
-	echo "$z2047" >>file-b &&
-	echo "c$z2047" | tr -d "\012" >file-c &&
-	echo "d" >file-d &&
-	echo "$z2047" | tr -d "\012" >>file-d &&
+x=zzzzzzzzzz			;# 10
+y="$x$x$x$x$x$x$x$x$x$x"	;# 100
+z="$y$y$y$y$y$y$y$y$y$y"	;# 1000
+z1000=$z
+z100=$y
+z10=$x
 
-	git add file-a file-b file-c file-d &&
+zs() {
+	count="$1"
+	while test "$count" -ge 1000
+	do
+		count=$(($count - 1000))
+		printf "%s" $z1000
+	done
+	while test "$count" -ge 100
+	do
+		count=$(($count - 100))
+		printf "%s" $z100
+	done
+	while test "$count" -ge 10
+	do
+		count=$(($count - 10))
+		printf "%s" $z10
+	done
+	while test "$count" -ge 1
+	do
+		count=$(($count - 1))
+		printf "z"
+	done
+}
 
-	echo "A$z2047" >file-a &&
-	echo "B" >file-b &&
-	echo "$z2047" >>file-b &&
-	echo "C$z2047" | tr -d "\012" >file-c &&
-	echo "D" >file-d &&
-	echo "$z2047" | tr -d "\012" >>file-d
-
-'
+zc () {
+	sed -e "/^index/d" \
+		-e "s/$z1000/Q/g" \
+		-e "s/QQQQQQQQQ/Z9000/g" \
+		-e "s/QQQQQQQQ/Z8000/g" \
+		-e "s/QQQQQQQ/Z7000/g" \
+		-e "s/QQQQQQ/Z6000/g" \
+		-e "s/QQQQQ/Z5000/g" \
+		-e "s/QQQQ/Z4000/g" \
+		-e "s/QQQ/Z3000/g" \
+		-e "s/QQ/Z2000/g" \
+		-e "s/Q/Z1000/g" \
+		-e "s/$z100/Q/g" \
+		-e "s/QQQQQQQQQ/Z900/g" \
+		-e "s/QQQQQQQQ/Z800/g" \
+		-e "s/QQQQQQQ/Z700/g" \
+		-e "s/QQQQQQ/Z600/g" \
+		-e "s/QQQQQ/Z500/g" \
+		-e "s/QQQQ/Z400/g" \
+		-e "s/QQQ/Z300/g" \
+		-e "s/QQ/Z200/g" \
+		-e "s/Q/Z100/g" \
+		-e "s/000Z//g" \
+		-e "s/$z10/Q/g" \
+		-e "s/QQQQQQQQQ/Z90/g" \
+		-e "s/QQQQQQQQ/Z80/g" \
+		-e "s/QQQQQQQ/Z70/g" \
+		-e "s/QQQQQQ/Z60/g" \
+		-e "s/QQQQQ/Z50/g" \
+		-e "s/QQQQ/Z40/g" \
+		-e "s/QQQ/Z30/g" \
+		-e "s/QQ/Z20/g" \
+		-e "s/Q/Z10/g" \
+		-e "s/00Z//g" \
+		-e "s/z/Q/g" \
+		-e "s/QQQQQQQQQ/Z9/g" \
+		-e "s/QQQQQQQQ/Z8/g" \
+		-e "s/QQQQQQQ/Z7/g" \
+		-e "s/QQQQQQ/Z6/g" \
+		-e "s/QQQQQ/Z5/g" \
+		-e "s/QQQQ/Z4/g" \
+		-e "s/QQQ/Z3/g" \
+		-e "s/QQ/Z2/g" \
+		-e "s/Q/Z1/g" \
+		-e "s/0Z//g" \
+	;
+}
 
-cat >expect <<\EOF
-diff --git a/file-a b/file-a
---- a/file-a
-+++ b/file-a
+expect_pattern () {
+	cnt="$1"
+	cat <<EOF
+diff --git a/file-a$cnt b/file-a$cnt
+--- a/file-a$cnt
++++ b/file-a$cnt
 @@ -1 +1 @@
--aZ
-+AZ
-diff --git a/file-b b/file-b
---- a/file-b
-+++ b/file-b
+-Z${cnt}a
++Z${cnt}A
+diff --git a/file-b$cnt b/file-b$cnt
+--- a/file-b$cnt
++++ b/file-b$cnt
 @@ -1 +1 @@
 -b
 +B
-diff --git a/file-c b/file-c
---- a/file-c
-+++ b/file-c
+diff --git a/file-c$cnt b/file-c$cnt
+--- a/file-c$cnt
++++ b/file-c$cnt
 @@ -1 +1 @@
--cZ
+-cZ$cnt
 \ No newline at end of file
-+CZ
++CZ$cnt
 \ No newline at end of file
-diff --git a/file-d b/file-d
---- a/file-d
-+++ b/file-d
+diff --git a/file-d$cnt b/file-d$cnt
+--- a/file-d$cnt
++++ b/file-d$cnt
 @@ -1 +1 @@
 -d
 +D
 EOF
+}
+
+sample='1023 1024 1025 2047 4095'
+
+test_expect_success setup '
+
+	for n in $sample
+	do
+		( zs $n ; echo a ) >file-a$n &&
+		( echo b; zs $n; echo ) >file-b$n &&
+		( printf c; zs $n ) >file-c$n &&
+		( echo d; zs $n ) >file-d$n &&
+
+		git add file-a$n file-b$n file-c$n file-d$n &&
+
+		( zs $n ; echo A ) >file-a$n &&
+		( echo B; zs $n; echo ) >file-b$n &&
+		( printf C; zs $n ) >file-c$n &&
+		( echo D; zs $n ) >file-d$n &&
+
+		expect_pattern $n || break
+
+	done >expect
+'
 
 test_expect_success 'diff -U0' '
 
-	git diff -U0 | sed -e "/^index/d" -e "s/$z2047/Z/g" >actual &&
+	for n in $sample
+	do
+		git diff -U0 file-?$n
+	done | zc >actual &&
 	diff -u expect actual
 
 '

^ permalink raw reply related	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-20  0:21                       ` Junio C Hamano
@ 2007-12-20  1:38                         ` Wincent Colaiuta
  2007-12-20  9:23                         ` Charles Bailey
  1 sibling, 0 replies; 30+ messages in thread
From: Wincent Colaiuta @ 2007-12-20  1:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Charles Bailey, Jeff King, Linus Torvalds, git

El 20/12/2007, a las 1:21, Junio C Hamano escribió:

> Charles Bailey <charles@hashpling.org> writes:
>
>> On Wed, Dec 19, 2007 at 09:27:15AM -0500, Jeff King wrote:
>> ...
>>> Somebody beat you to it. :) Can you confirm that the fix in
>>>
>>>  <1198007158-27576-1-git-send-email-win@wincent.com>
>>>
>>> works for you?
>>
>> Ooh, the excitement, I've never had the opportunity to "git am"
>> before.
>
> Well, if Wincent had sent the patch as plain text without MIME
> quoted-printable, you did not have to even have the excitement of
> running "git am", but a simple "git apply" would have sufficed.

Well, I sent it using "git send-email", so I guess it didn't have MIME  
quoted-printable.

Cheers,
Wincent

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-20  0:21                       ` Junio C Hamano
  2007-12-20  1:38                         ` Wincent Colaiuta
@ 2007-12-20  9:23                         ` Charles Bailey
  2007-12-20  9:40                           ` Junio C Hamano
  1 sibling, 1 reply; 30+ messages in thread
From: Charles Bailey @ 2007-12-20  9:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Wincent Colaiuta, Jeff King, Linus Torvalds, git

On Wed, Dec 19, 2007 at 04:21:30PM -0800, Junio C Hamano wrote:
> 
> By the way, how does this rewrite look?
> 

It looks good and the test works on my fedora and Mac OS X boxes.
(Just for reference I'm on 10.4.3, not 'Leopard'.)

I've snipped everything except zc that I wanted to comment on.

The only two really minor nits I have is that zc does bizarre things
without warning for more than 9999 "z" ( e.g. 17000 = 9000 + 8000 =
98000 (!) ) and that, from a function responsibility point of view,
the /^index/d looks rather out of place.

Perhaps a comment saying tha zc is designed for <= 9999 z's?  Given
this, a lot of the /g are redundant.

But hey, it's a test script and it works and I don't have any better
suggestions. :)

Charles.

> +zc () {
> +	sed -e "/^index/d" \
> +		-e "s/$z1000/Q/g" \
> +		-e "s/QQQQQQQQQ/Z9000/g" \
> +		-e "s/QQQQQQQQ/Z8000/g" \
> +		-e "s/QQQQQQQ/Z7000/g" \
> +		-e "s/QQQQQQ/Z6000/g" \
> +		-e "s/QQQQQ/Z5000/g" \
> +		-e "s/QQQQ/Z4000/g" \
> +		-e "s/QQQ/Z3000/g" \
> +		-e "s/QQ/Z2000/g" \
> +		-e "s/Q/Z1000/g" \
> +		-e "s/$z100/Q/g" \
> +		-e "s/QQQQQQQQQ/Z900/g" \
> +		-e "s/QQQQQQQQ/Z800/g" \
> +		-e "s/QQQQQQQ/Z700/g" \
> +		-e "s/QQQQQQ/Z600/g" \
> +		-e "s/QQQQQ/Z500/g" \
> +		-e "s/QQQQ/Z400/g" \
> +		-e "s/QQQ/Z300/g" \
> +		-e "s/QQ/Z200/g" \
> +		-e "s/Q/Z100/g" \
> +		-e "s/000Z//g" \
> +		-e "s/$z10/Q/g" \
> +		-e "s/QQQQQQQQQ/Z90/g" \
> +		-e "s/QQQQQQQQ/Z80/g" \
> +		-e "s/QQQQQQQ/Z70/g" \
> +		-e "s/QQQQQQ/Z60/g" \
> +		-e "s/QQQQQ/Z50/g" \
> +		-e "s/QQQQ/Z40/g" \
> +		-e "s/QQQ/Z30/g" \
> +		-e "s/QQ/Z20/g" \
> +		-e "s/Q/Z10/g" \
> +		-e "s/00Z//g" \
> +		-e "s/z/Q/g" \
> +		-e "s/QQQQQQQQQ/Z9/g" \
> +		-e "s/QQQQQQQQ/Z8/g" \
> +		-e "s/QQQQQQQ/Z7/g" \
> +		-e "s/QQQQQQ/Z6/g" \
> +		-e "s/QQQQQ/Z5/g" \
> +		-e "s/QQQQ/Z4/g" \
> +		-e "s/QQQ/Z3/g" \
> +		-e "s/QQ/Z2/g" \
> +		-e "s/Q/Z1/g" \
> +		-e "s/0Z//g" \
> +	;
> +}

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH] Re-re-re-fix common tail optimization
  2007-12-20  9:23                         ` Charles Bailey
@ 2007-12-20  9:40                           ` Junio C Hamano
  0 siblings, 0 replies; 30+ messages in thread
From: Junio C Hamano @ 2007-12-20  9:40 UTC (permalink / raw)
  To: Charles Bailey; +Cc: Wincent Colaiuta, Jeff King, Linus Torvalds, git

Charles Bailey <charles@hashpling.org> writes:

> Perhaps a comment saying tha zc is designed for <= 9999 z's?  Given
> this, a lot of the /g are redundant.

Yeah, /g redundancy is a leftover from the time when it did not have
these sawtooth patterns.

^ permalink raw reply	[flat|nested] 30+ messages in thread

end of thread, other threads:[~2007-12-20  9:41 UTC | newest]

Thread overview: 30+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2007-12-15 11:16 trim_common_tail bug? Jeff King
2007-12-15 15:51 ` Jeff King
2007-12-15 17:49   ` Junio C Hamano
2007-12-15 20:02     ` Jeff King
2007-12-16  7:06       ` Jeff King
2007-12-16 19:43         ` Junio C Hamano
2007-12-16 21:16           ` Junio C Hamano
2007-12-16 21:21             ` Jeff King
2007-12-16 21:49               ` [PATCH] Re-re-re-fix common tail optimization Junio C Hamano
2007-12-16 22:15                 ` Jeff King
2007-12-16 22:23                   ` Junio C Hamano
2007-12-16 22:28                     ` Junio C Hamano
2007-12-16 22:29                     ` Jeff King
2007-12-17  8:42                       ` Wincent Colaiuta
2007-12-17 10:39                         ` Johannes Schindelin
2007-12-17 10:59                           ` Wincent Colaiuta
2007-12-17 11:57                             ` Johannes Schindelin
2007-12-17 12:08                               ` Wincent Colaiuta
2007-12-17 12:12                                 ` Jeff King
2007-12-17 12:20                                 ` Johannes Sixt
2007-12-17 12:51                                   ` Johannes Schindelin
2007-12-17 17:58                                     ` Junio C Hamano
2007-12-17 18:05                                       ` Johannes Schindelin
2007-12-19 14:18                 ` Charles Bailey
2007-12-19 14:27                   ` Jeff King
2007-12-19 14:37                     ` Charles Bailey
2007-12-20  0:21                       ` Junio C Hamano
2007-12-20  1:38                         ` Wincent Colaiuta
2007-12-20  9:23                         ` Charles Bailey
2007-12-20  9:40                           ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.