heads-up: git-index-pack in "next" is broken

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* heads-up: git-index-pack in "next" is broken
@ 2006-10-17  4:55 Junio C Hamano
  2006-10-17 15:39 ` Nicolas Pitre
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2006-10-17  4:55 UTC (permalink / raw)
  To: git; +Cc: Nicolas Pitre

I'm still a bit under the weather and do not have enough
concentration to dig into the problem tonight, but I noticed
that something in "next", most likely the delta-base-offset
patchset, broke git-index-pack:

$ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd
$ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack
$ ~/git-master/bin/git-index-pack pack-$X.pack
ec0c3491753e115e1775256f6b7bd1bce4dea7cd
$ git-index-pack pack-$X.pack
fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17  4:55 heads-up: git-index-pack in "next" is broken Junio C Hamano
@ 2006-10-17 15:39 ` Nicolas Pitre
  2006-10-17 16:07   ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-17 15:39 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Mon, 16 Oct 2006, Junio C Hamano wrote:

> I'm still a bit under the weather and do not have enough
> concentration to dig into the problem tonight, but I noticed
> that something in "next", most likely the delta-base-offset
> patchset, broke git-index-pack:
> 
> $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd
> $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack
> $ ~/git-master/bin/git-index-pack pack-$X.pack
> ec0c3491753e115e1775256f6b7bd1bce4dea7cd
> $ git-index-pack pack-$X.pack
> fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas

Using the tip of the "next" branch (git version 1.4.2.4.gf9fe) I just 
cannot reproduce this problem at all.  I always get a good index and 
ec0c3491753e115e1775256f6b7bd1bce4dea7cd back.


Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 15:39 ` Nicolas Pitre
@ 2006-10-17 16:07   ` Junio C Hamano
  2006-10-17 17:00     ` Nicolas Pitre
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2006-10-17 16:07 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> On Mon, 16 Oct 2006, Junio C Hamano wrote:
>
>> I'm still a bit under the weather and do not have enough
>> concentration to dig into the problem tonight, but I noticed
>> that something in "next", most likely the delta-base-offset
>> patchset, broke git-index-pack:
>> 
>> $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd
>> $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack
>> $ ~/git-master/bin/git-index-pack pack-$X.pack
>> ec0c3491753e115e1775256f6b7bd1bce4dea7cd
>> $ git-index-pack pack-$X.pack
>> fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas
>
> Using the tip of the "next" branch (git version 1.4.2.4.gf9fe) I just 
> cannot reproduce this problem at all.  I always get a good index and 
> ec0c3491753e115e1775256f6b7bd1bce4dea7cd back.

Hmph.  I just got exactly the same breakage; could this be
another 64-bit breakage?  My breakage was on x86-64.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 16:07   ` Junio C Hamano
@ 2006-10-17 17:00     ` Nicolas Pitre
  2006-10-17 18:11       ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-17 17:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 17 Oct 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > On Mon, 16 Oct 2006, Junio C Hamano wrote:
> >
> >> I'm still a bit under the weather and do not have enough
> >> concentration to dig into the problem tonight, but I noticed
> >> that something in "next", most likely the delta-base-offset
> >> patchset, broke git-index-pack:
> >> 
> >> $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd
> >> $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack
> >> $ ~/git-master/bin/git-index-pack pack-$X.pack
> >> ec0c3491753e115e1775256f6b7bd1bce4dea7cd
> >> $ git-index-pack pack-$X.pack
> >> fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas
> >
> > Using the tip of the "next" branch (git version 1.4.2.4.gf9fe) I just 
> > cannot reproduce this problem at all.  I always get a good index and 
> > ec0c3491753e115e1775256f6b7bd1bce4dea7cd back.
> 
> Hmph.  I just got exactly the same breakage; could this be
> another 64-bit breakage?  My breakage was on x86-64.

I've been suspecting that since then as well.  I indeed tested on i386.
But reviewing the code I just can't find any obvious spot where 64-bit 
would be an issue, especially since your pack does not have any 
OFS_DELTA objects.

Could you instrument the code at the end of 
index-pack.c:parse_pack_objects() to display how many deltas were 
actually resolved and how many were not?  IOW is it a case of all or 
nothing, or is there an isolated case of corruption lurking somewhere?


Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 17:00     ` Nicolas Pitre
@ 2006-10-17 18:11       ` Junio C Hamano
  2006-10-17 18:47         ` Nicolas Pitre
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2006-10-17 18:11 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> On Tue, 17 Oct 2006, Junio C Hamano wrote:
>
>> Nicolas Pitre <nico@cam.org> writes:
>> 
>> > On Mon, 16 Oct 2006, Junio C Hamano wrote:
>> >
>> >> I'm still a bit under the weather and do not have enough
>> >> concentration to dig into the problem tonight, but I noticed
>> >> that something in "next", most likely the delta-base-offset
>> >> patchset, broke git-index-pack:
>> >> 
>> >> $ X=ec0c3491753e115e1775256f6b7bd1bce4dea7cd
>> >> $ wget http://www.kernel.org/pub/scm/git/git.git/objects/pack/pack-$X.pack
>> >> $ ~/git-master/bin/git-index-pack pack-$X.pack
>> >> ec0c3491753e115e1775256f6b7bd1bce4dea7cd
>> >> $ git-index-pack pack-$X.pack
>> >> fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has unresolved deltas
>> >
>> > Using the tip of the "next" branch (git version 1.4.2.4.gf9fe) I just 
>> > cannot reproduce this problem at all.  I always get a good index and 
>> > ec0c3491753e115e1775256f6b7bd1bce4dea7cd back.
>> 
>> Hmph.  I just got exactly the same breakage; could this be
>> another 64-bit breakage?  My breakage was on x86-64.
>
> I've been suspecting that since then as well.  I indeed tested on i386.
> But reviewing the code I just can't find any obvious spot where 64-bit 
> would be an issue, especially since your pack does not have any 
> OFS_DELTA objects.
>
> Could you instrument the code at the end of 
> index-pack.c:parse_pack_objects() to display how many deltas were 
> actually resolved and how many were not?  IOW is it a case of all or 
> nothing, or is there an isolated case of corruption lurking somewhere?

fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has
18915 unresolved ref-deltas and 0 ofs-deltas among 21205

By the way, "Gaaaah".  Is this find_delta() called from
find_delta_children() doing the right thing?  I wonder if this
is open to accidental collisions?. If you have an object name
whose last 12-bytes are all NUL and you have a pack offset whose
bytes happens to be a good prefix for an object, what happens?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 18:11       ` Junio C Hamano
@ 2006-10-17 18:47         ` Nicolas Pitre
  2006-10-17 19:36           ` Sergey Vlasov
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-17 18:47 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 17 Oct 2006, Junio C Hamano wrote:

> Nicolas Pitre <nico@cam.org> writes:
> 
> > Could you instrument the code at the end of 
> > index-pack.c:parse_pack_objects() to display how many deltas were 
> > actually resolved and how many were not?  IOW is it a case of all or 
> > nothing, or is there an isolated case of corruption lurking somewhere?
> 
> fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has
> 18915 unresolved ref-deltas and 0 ofs-deltas among 21205

Hmmm....  Interesting.

Is it possible that sizeof(union delta_base) might not be equal to 20 
for you?

> By the way, "Gaaaah".  Is this find_delta() called from
> find_delta_children() doing the right thing?  I wonder if this
> is open to accidental collisions?. If you have an object name
> whose last 12-bytes are all NUL and you have a pack offset whose
> bytes happens to be a good prefix for an object, what happens?

It is filtered out later thusly:

	...
	for (j = ref_first; j <= ref_last; j++)
		if (deltas[j].obj->type == OBJ_REF_DELTA)
			resolve_delta(&deltas[j], data, ...);

So if a collision happens the object won't be of the right type and it 
is simply skipped.


Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 18:47         ` Nicolas Pitre
@ 2006-10-17 19:36           ` Sergey Vlasov
  2006-10-17 20:10             ` Junio C Hamano
  2006-10-17 20:23             ` Nicolas Pitre
  0 siblings, 2 replies; 33+ messages in thread
From: Sergey Vlasov @ 2006-10-17 19:36 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git

[-- Attachment #1: Type: text/plain, Size: 1054 bytes --]

On Tue, 17 Oct 2006 14:47:16 -0400 (EDT) Nicolas Pitre wrote:

> On Tue, 17 Oct 2006, Junio C Hamano wrote:
> 
> > Nicolas Pitre <nico@cam.org> writes:
> > 
> > > Could you instrument the code at the end of 
> > > index-pack.c:parse_pack_objects() to display how many deltas were 
> > > actually resolved and how many were not?  IOW is it a case of all or 
> > > nothing, or is there an isolated case of corruption lurking somewhere?
> > 
> > fatal: packfile 'pack-ec0c3491753e115e1775256f6b7bd1bce4dea7cd.pack' has
> > 18915 unresolved ref-deltas and 0 ofs-deltas among 21205
> 
> Hmmm....  Interesting.
> 
> Is it possible that sizeof(union delta_base) might not be equal to 20 
> for you?

Yes, on x86_64 this is 24 because of 8-byte alignment for longs:


$ echo 'unsigned x = sizeof(union{unsigned char sha1[20]; long offset;});' | gcc -S -x c -m64 -o - - | grep long
        .long   24

$ echo 'unsigned x = sizeof(union{unsigned char sha1[20]; long offset;});' | gcc -S -x c -m32 -o - - | grep long
        .long   20


[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 19:36           ` Sergey Vlasov
@ 2006-10-17 20:10             ` Junio C Hamano
  2006-10-17 20:25               ` Nicolas Pitre
  2006-10-17 20:23             ` Nicolas Pitre
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2006-10-17 20:10 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git, Sergey Vlasov

Sergey Vlasov <vsu@altlinux.ru> writes:

> On Tue, 17 Oct 2006 14:47:16 -0400 (EDT) Nicolas Pitre wrote:
>
>> Is it possible that sizeof(union delta_base) might not be equal to 20 
>> for you?
>
> Yes, on x86_64 this is 24 because of 8-byte alignment for longs:

Enough eyeballs made this bug shallow ;-)  Thanks.

diff --git a/index-pack.c b/index-pack.c
index fffddd2..49b6efe 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -166,6 +166,7 @@ static void *unpack_raw_entry(unsigned l
 	case OBJ_REF_DELTA:
 		if (pos + 20 >= pack_limit)
 			bad_object(offset, "object extends past end of pack");
+		memset(delta_base, 0, sizeof(*delta_base));
 		hashcpy(delta_base->sha1, pack_base + pos);
 		pos += 20;
 		break;
@@ -290,6 +291,7 @@ static void resolve_delta(struct delta_e
 		bad_object(obj->offset, "failed to apply delta");
 	sha1_object(result, result_size, type, obj->sha1);
 
+	memset(&delta_base, 0, sizeof(delta_base));
 	hashcpy(delta_base.sha1, obj->sha1);
 	if (!find_delta_childs(&delta_base, &first, &last)) {
 		for (j = first; j <= last; j++)
@@ -365,6 +367,7 @@ static void parse_pack_objects(void)
 
 		if (obj->type == OBJ_REF_DELTA || obj->type == OBJ_OFS_DELTA)
 			continue;
+		memset(&base, 0, sizeof(base));
 		hashcpy(base.sha1, obj->sha1);
 		ref = !find_delta_childs(&base, &ref_first, &ref_last);
 		memset(&base, 0, sizeof(base));

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 19:36           ` Sergey Vlasov
  2006-10-17 20:10             ` Junio C Hamano
@ 2006-10-17 20:23             ` Nicolas Pitre
  2006-10-17 20:51               ` Linus Torvalds
  1 sibling, 1 reply; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-17 20:23 UTC (permalink / raw)
  To: Sergey Vlasov; +Cc: Junio C Hamano, git

On Tue, 17 Oct 2006, Sergey Vlasov wrote:

> On Tue, 17 Oct 2006 14:47:16 -0400 (EDT) Nicolas Pitre wrote:
> 
> > Is it possible that sizeof(union delta_base) might not be equal to 20 
> > for you?
> 
> Yes, on x86_64 this is 24 because of 8-byte alignment for longs:

Ah bummer.  Then this is most likely the cause.  And here's a simple 
fix (Junio please confirm):

diff --git a/index-pack.c b/index-pack.c
index fffddd2..56c590e 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -23,6 +23,12 @@ union delta_base {
 	unsigned long offset;
 };
 
+/*
+ * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
+ * to memcmp() only the first 20 bytes.
+ */
+#define UNION_BASE_SZ	20
+
 struct delta_entry
 {
 	struct object_entry *obj;
@@ -211,7 +217,7 @@ static int find_delta(const union delta_
                 struct delta_entry *delta = &deltas[next];
                 int cmp;
 
-                cmp = memcmp(base, &delta->base, sizeof(*base));
+                cmp = memcmp(base, &delta->base, UNION_BASE_SZ);
                 if (!cmp)
                         return next;
                 if (cmp < 0) {
@@ -232,9 +238,9 @@ static int find_delta_childs(const union
 
 	if (first < 0)
 		return -1;
-	while (first > 0 && !memcmp(&deltas[first - 1].base, base, sizeof(*base)))
+	while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ))
 		--first;
-	while (last < end && !memcmp(&deltas[last + 1].base, base, sizeof(*base)))
+	while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ))
 		++last;
 	*first_index = first;
 	*last_index = last;
@@ -312,7 +318,7 @@ static int compare_delta_entry(const voi
 {
 	const struct delta_entry *delta_a = a;
 	const struct delta_entry *delta_b = b;
-	return memcmp(&delta_a->base, &delta_b->base, sizeof(union delta_base));
+	return memcmp(&delta_a->base, &delta_b->base, UNION_BASE_SZ);
 }
 
 static void parse_pack_objects(void)


Nicolas

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 20:10             ` Junio C Hamano
@ 2006-10-17 20:25               ` Nicolas Pitre
  0 siblings, 0 replies; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-17 20:25 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Sergey Vlasov

On Tue, 17 Oct 2006, Junio C Hamano wrote:

> Sergey Vlasov <vsu@altlinux.ru> writes:
> 
> > On Tue, 17 Oct 2006 14:47:16 -0400 (EDT) Nicolas Pitre wrote:
> >
> >> Is it possible that sizeof(union delta_base) might not be equal to 20 
> >> for you?
> >
> > Yes, on x86_64 this is 24 because of 8-byte alignment for longs:
> 
> Enough eyeballs made this bug shallow ;-)  Thanks.
> 
> diff --git a/index-pack.c b/index-pack.c
> index fffddd2..49b6efe 100644
> --- a/index-pack.c
> +++ b/index-pack.c
> @@ -166,6 +166,7 @@ static void *unpack_raw_entry(unsigned l
>  	case OBJ_REF_DELTA:
>  		if (pos + 20 >= pack_limit)
>  			bad_object(offset, "object extends past end of pack");
> +		memset(delta_base, 0, sizeof(*delta_base));
>  		hashcpy(delta_base->sha1, pack_base + pos);

Yeah.... but this is a bit wasteful since (especially on 32-bit archs) 
you just write to the same memory twice in a row.


Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 20:23             ` Nicolas Pitre
@ 2006-10-17 20:51               ` Linus Torvalds
  2006-10-17 21:21                 ` Nicolas Pitre
  2006-10-17 21:54                 ` Junio C Hamano
  0 siblings, 2 replies; 33+ messages in thread
From: Linus Torvalds @ 2006-10-17 20:51 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Sergey Vlasov, Junio C Hamano, git

On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> On Tue, 17 Oct 2006, Sergey Vlasov wrote:
> > 
> > Yes, on x86_64 this is 24 because of 8-byte alignment for longs:
> 
> Ah bummer.  Then this is most likely the cause.  And here's a simple 
> fix (Junio please confirm):

Why do you use "unsigned long" in the first place?

For some structure like this, it sounds positively wrong. Pack-files 
should be architecture-neutral, which means that they shouldn't depend on 
word-size, and they should be in some neutral byte-order.

Quite frankly, this all makes me go "Eww..". The original pack-file (well, 
v2) format was well-defined and had none of these issues. In contrast, the 
new code in 'next' is just _ugly_.

And the thing about "ugly" is that it also tends to mean "fragile" and 
"buggy" and "hard to extend later".

And maybe it's just me, but I consider unions to be bug-prone on their 
own. The "master" branch has exactly two unions: the "grep_expr" structure 
contains one (where the union member is clearly defined by the node type 
in that structure), and object.c has a "union any_object" that _literally_ 
exists as purely an allocation size issue (ie it is used _only_ to 
allocate the maximum size of any of the possible structures).

In contrast, the new union introduced in "next" is just horrid. There's 
not even any way to know which member to use, except apparently that it 
expects that a SHA1 is never zero in the last 12 bytes. Which is probably 
true, but still - that's some ugly stuff.

Is this something you want to bet a big project on?

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 20:51               ` Linus Torvalds
@ 2006-10-17 21:21                 ` Nicolas Pitre
  2006-10-17 21:46                   ` Linus Torvalds
  2006-10-17 21:54                 ` Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-17 21:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sergey Vlasov, Junio C Hamano, git

On Tue, 17 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> > On Tue, 17 Oct 2006, Sergey Vlasov wrote:
> > > 
> > > Yes, on x86_64 this is 24 because of 8-byte alignment for longs:
> > 
> > Ah bummer.  Then this is most likely the cause.  And here's a simple 
> > fix (Junio please confirm):
> 
> Why do you use "unsigned long" in the first place?

Because offsets into packs are expressed as unsigned long everywhere 
else (except in the current pack index on-disk format).

> For some structure like this, it sounds positively wrong. Pack-files 
> should be architecture-neutral, which means that they shouldn't depend on 
> word-size, and they should be in some neutral byte-order.

But they do.  Please consider this code:

	case OBJ_OFS_DELTA:
		memset(delta_base, 0, sizeof(*delta_base));
		c = pack_base[pos++];
		base_offset = c & 127;
		while (c & 128) {
			base_offset += 1;
			if (!base_offset || base_offset & ~(~0UL >> 7))
				bad_object(offset, "offset value overflow for delta base object");
			if (pos >= pack_limit)
				bad_object(offset, "object extends past end of pack");
			c = pack_base[pos++];
			base_offset = (base_offset << 7) + (c & 127);
		}
		delta_base->offset = offset - base_offset;
		if (delta_base->offset >= offset)
			bad_object(offset, "delta base offset is out of bound");
		break;

Do you see anything inerently wrong in this code?  The above is already 
64-bit ready such that it'll just work on 64-bit archs and will display 
a sensible message if a 32-bit arch encounter a pack larger than 4GB.  
But the on-disk pack format has no limitation what so ever.

> Quite frankly, this all makes me go "Eww..". The original pack-file (well, 
> v2) format was well-defined and had none of these issues. In contrast, the 
> new code in 'next' is just _ugly_.

I beg to differ.  Please reconsider in light of the above.

> And maybe it's just me, but I consider unions to be bug-prone on their 
> own. The "master" branch has exactly two unions: the "grep_expr" structure 
> contains one (where the union member is clearly defined by the node type 
> in that structure), and object.c has a "union any_object" that _literally_ 
> exists as purely an allocation size issue (ie it is used _only_ to 
> allocate the maximum size of any of the possible structures).
> 
> In contrast, the new union introduced in "next" is just horrid. There's 
> not even any way to know which member to use, except apparently that it 
> expects that a SHA1 is never zero in the last 12 bytes. Which is probably 
> true, but still - that's some ugly stuff.

This union should be looked at just like a sortable hash pointing to a 
base object so that deltas with the same base object can be sorted 
together.  And the field to use is well defined of course: deltas with 
sha1 to base use the sha1 member, deltas with offset to base use the 
offset member.  This hash, together with the delta type, constitute a 
tuple guaranteed to be unique so there can't be any confusion.

> Is this something you want to bet a big project on?

I don't see why not.


Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 21:21                 ` Nicolas Pitre
@ 2006-10-17 21:46                   ` Linus Torvalds
  2006-10-18  0:20                     ` Nicolas Pitre
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2006-10-17 21:46 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Sergey Vlasov, Junio C Hamano, git



On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> 
> Because offsets into packs are expressed as unsigned long everywhere 
> else (except in the current pack index on-disk format).

Until your work, that "unsigned long" was totally just an internal thing 
that didn't actually bleed into anything else.

> > For some structure like this, it sounds positively wrong. Pack-files 
> > should be architecture-neutral, which means that they shouldn't depend on 
> > word-size, and they should be in some neutral byte-order.
> 
> But they do.  Please consider this code:

Right. The pack-file itself. But the code that actually _generates_ it 
mixes things in alarming ways.

> > In contrast, the new union introduced in "next" is just horrid. There's 
> > not even any way to know which member to use, except apparently that it 
> > expects that a SHA1 is never zero in the last 12 bytes. Which is probably 
> > true, but still - that's some ugly stuff.
> 
> This union should be looked at just like a sortable hash pointing to a 
> base object so that deltas with the same base object can be sorted 
> together.

.. and it sorts _differently_ on a big-endian vs little-endian thing, 
doesn't it?

So now the sort order depends on endianness and/or wordsize. That just 
sounds really really wrong.

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 20:51               ` Linus Torvalds
  2006-10-17 21:21                 ` Nicolas Pitre
@ 2006-10-17 21:54                 ` Junio C Hamano
  2006-10-18  1:38                   ` Nicolas Pitre
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2006-10-17 21:54 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, git

Linus Torvalds <torvalds@osdl.org> writes:

> On Tue, 17 Oct 2006, Nicolas Pitre wrote:
>> On Tue, 17 Oct 2006, Sergey Vlasov wrote:
>> > 
>> > Yes, on x86_64 this is 24 because of 8-byte alignment for longs:
>> 
>> Ah bummer.  Then this is most likely the cause.  And here's a simple 
>> fix (Junio please confirm):
>
> Why do you use "unsigned long" in the first place?
>
> For some structure like this, it sounds positively wrong. Pack-files 
> should be architecture-neutral, which means that they shouldn't depend on 
> word-size, and they should be in some neutral byte-order.

This is an in-core structure if I am not mistaken, and is never
written out in the resulting pack file.  The program is reading
from .pack and building .idx that corresponds to it.  I agree
that it is ugly, but I do not hink using neutral byte-order in
this part of the processing buys us anything in this particular
case.

> In contrast, the new union introduced in "next" is just horrid. There's 
> not even any way to know which member to use, except apparently that it 
> expects that a SHA1 is never zero in the last 12 bytes. Which is probably 
> true, but still - that's some ugly stuff.

Again I agree with the ugliness objection, and I raised the
"expecting no collision" issue which Nico refuted later.  The
code is probably safe (not just safe in practice but safe in
theory as well), although that would not change the fact that it
is ugly X-<.

Nico, could we have an un-uglied version please?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 21:46                   ` Linus Torvalds
@ 2006-10-18  0:20                     ` Nicolas Pitre
  2006-10-18  0:57                       ` Linus Torvalds
  2006-10-18  1:30                       ` Junio C Hamano
  0 siblings, 2 replies; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-18  0:20 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sergey Vlasov, Junio C Hamano, git

On Tue, 17 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> > 
> > Because offsets into packs are expressed as unsigned long everywhere 
> > else (except in the current pack index on-disk format).
> 
> Until your work, that "unsigned long" was totally just an internal thing 
> that didn't actually bleed into anything else.

And would you please explain how my work changes that state of affairs?
Sorry but I don't follow you here.  Still _I_ wrote that code.

> > > For some structure like this, it sounds positively wrong. Pack-files 
> > > should be architecture-neutral, which means that they shouldn't depend on 
> > > word-size, and they should be in some neutral byte-order.
> > 
> > But they do.  Please consider this code:
> 
> Right. The pack-file itself. But the code that actually _generates_ it 
> mixes things in alarming ways.

???

> > > In contrast, the new union introduced in "next" is just horrid. There's 
> > > not even any way to know which member to use, except apparently that it 
> > > expects that a SHA1 is never zero in the last 12 bytes. Which is probably 
> > > true, but still - that's some ugly stuff.
> > 
> > This union should be looked at just like a sortable hash pointing to a 
> > base object so that deltas with the same base object can be sorted 
> > together.
> 
> .. and it sorts _differently_ on a big-endian vs little-endian thing, 
> doesn't it?

Sure.  But who cares?  The sorting is just there to 1) perform binary 
searches on the list of deltas based from a given object, and 2) find a 
list of all deltas with the same base object.

> So now the sort order depends on endianness and/or wordsize. That just 
> sounds really really wrong.

Again, who cares?  That ordering doesn't influence any data produced by 
the tool.  It is an internal and private strategy to speed up the 
_local_ _searching_ process.  It could be replaced by a dumb linear 
list walk if you wish and the end result i.e. 
the produced pack index would be exactly the same (with a significant 
slowdown notwitstanding).

So let me summarize:

 - the union is a hash.

 - the hash is either an offset value or a sha1 digest.

 - this hash is used for fast object lookup _only_.

 - it does sort differently on big vs little endian machines.

 - but we don't care at all because

 - it is a private algorithmic thing that doesn't "bleed" into any 
   _real_ data structure, and

 - it doesn't have any influence on the format of the end result.

 - it is only a runtime abstraction and nothing else.

 - It never gets into the pack nor the pack index themselves.

Do you still have issues with that?


Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  0:20                     ` Nicolas Pitre
@ 2006-10-18  0:57                       ` Linus Torvalds
  2006-10-18  2:08                         ` Nicolas Pitre
  2006-10-18  1:30                       ` Junio C Hamano
  1 sibling, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2006-10-18  0:57 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Sergey Vlasov, Junio C Hamano, git

On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> > 
> > .. and it sorts _differently_ on a big-endian vs little-endian thing, 
> > doesn't it?
> 
> Sure.  But who cares?  The sorting is just there to 1) perform binary 
> searches on the list of deltas based from a given object, and 2) find a 
> list of all deltas with the same base object.

_I_ care.

The new code is messy. It's fragile, and already showed one very 
fundamental bug which depended on architectures.

These things matter. We have had very few bugs in git, and one of the 
reasons is (I believe) that we haven't had ad-hoc code. I get _very_ 
nervous when you mix up SHA1 names with somethign totally different 
without even a flag to say which one it is. That's just nasty. The fact 
that the code then behaves (and behave_d_) differently on different 
architectures is just a sign of the problem.

"Who cares?" is not a good question to ask for a SCM. 

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  0:20                     ` Nicolas Pitre
  2006-10-18  0:57                       ` Linus Torvalds
@ 2006-10-18  1:30                       ` Junio C Hamano
  2006-10-18  2:23                         ` Nicolas Pitre
  1 sibling, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2006-10-18  1:30 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> So let me summarize:
>
>  - the union is a hash.
>
>  - the hash is either an offset value or a sha1 digest.
>
>  - this hash is used for fast object lookup _only_.
>
>  - it does sort differently on big vs little endian machines.
>
>  - but we don't care at all because
>
>  - it is a private algorithmic thing that doesn't "bleed" into any 
>    _real_ data structure, and
>
>  - it doesn't have any influence on the format of the end result.
>
>  - it is only a runtime abstraction and nothing else.
>
>  - It never gets into the pack nor the pack index themselves.
>
> Do you still have issues with that?

The part you pointed out to me about "accidental collision"
still bothers me somewhat.

Right now we do not produce ref-delta and ofs-delta in the same
stream, but if somebody did so then it would mean a disaster to
have an accidental collision of an 8-byte offset value plus
12-byte traiing NUL and another base object whose object name
happens to match that pattern.

I am actually Ok if we say the code assumes one stream has only
ref-delta or ofs-delta and never both.

But then I suspect the first pass of parse_pack_objects() should
make sure that assumption holds true for the pack being
inspected and barf if it is not.  Also the second pass do not
have to run two find_delta_childs() calls per delta object
because by that time we know which kind would never appear in
the packfile.

By the way can we call that find_delta_children() pretty please?

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-17 21:54                 ` Junio C Hamano
@ 2006-10-18  1:38                   ` Nicolas Pitre
  0 siblings, 0 replies; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-18  1:38 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Sergey Vlasov, git

On Tue, 17 Oct 2006, Junio C Hamano wrote:

> Linus Torvalds <torvalds@osdl.org> writes:
> 
> > On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> >> On Tue, 17 Oct 2006, Sergey Vlasov wrote:
> >> > 
> >> > Yes, on x86_64 this is 24 because of 8-byte alignment for longs:
> >> 
> >> Ah bummer.  Then this is most likely the cause.  And here's a simple 
> >> fix (Junio please confirm):
> >
> > Why do you use "unsigned long" in the first place?
> >
> > For some structure like this, it sounds positively wrong. Pack-files 
> > should be architecture-neutral, which means that they shouldn't depend on 
> > word-size, and they should be in some neutral byte-order.
> 
> This is an in-core structure if I am not mistaken, and is never
> written out in the resulting pack file.  The program is reading
> from .pack and building .idx that corresponds to it.  I agree
> that it is ugly, but I do not hink using neutral byte-order in
> this part of the processing buys us anything in this particular
> case.

Exact.

> > In contrast, the new union introduced in "next" is just horrid. There's 
> > not even any way to know which member to use, except apparently that it 
> > expects that a SHA1 is never zero in the last 12 bytes. Which is probably 
> > true, but still - that's some ugly stuff.
> 
> Again I agree with the ugliness objection, and I raised the
> "expecting no collision" issue which Nico refuted later.  The
> code is probably safe (not just safe in practice but safe in
> theory as well), although that would not change the fact that it
> is ugly X-<.

The problem is that I just don't see what you find "ugly" in the thing.
Maybe it is just a matter of few missing comments?

Please let me walk you through bits of the code. I hope you'll 
understand better why it is like it is after that.

In parse_pack_objects() you can read this:

        /*
         * First pass:
         * - find locations of all objects;
         * - calculate SHA1 of all non-delta objects;
         * - remember base SHA1 for all deltas.
         */

So we reads through the whole pack to find where objects are located. 
This information is stored in the objects[] array for all objects.  
Entries for that array are:

struct object_entry
{
        unsigned long offset;
        enum object_type type;
        enum object_type real_type;
        unsigned char sha1[20];
};

ON that first pass only the offset and type can be determined for all 
objects.  The sha1 can be determined for non delta objects only.

There is a second array, deltas[], to record information about deltas as 
they are found:

struct delta_entry
{
        struct object_entry *obj;
        union delta_base base;
};

The "obj" member points to the entry in the objects[] array for given 
delta entry.  And the "base" member, as you may guess, is a reference to 
the base object for that delta.

So here we are with that dreaded union.

Now why isn't it a second struct object_entry pointer instead?  Because 
one kind of delta allows for its base object to appear later in the pack 
and all we've got for identifying that base object is a sha1 reference 
since we cannot predict where the base object will be.  With the other 
object type we _could_ find out about the location of the base object 
right away.  But that would mean adding a special case for those objects 
through a different code path unnecessarily.  Instead, the base offset 
is treated just like a very special sha1 value and the same array is 
used.  So maybe the comment that says "remember base SHA1 for all 
deltas" could be adjusted to mention "base sha1 or offset" to be exact.

Now what next:

        /* Sort deltas by base SHA1/offset for fast searching */
        qsort(deltas, nr_deltas, sizeof(struct delta_entry),
              compare_delta_entry);

This comment, though, is extremely accurate.  We sort delta entries "for 
fast searching".  There is no other use for the union blob.  To avoid 
the union there could have been two separate delta arrays: one for 
OBJ_REF_DELTA and another for OBJ_OFS_DELTA.  Then two 
compare_delta_entry functions whould have been needed instead of only 
one.  But because we only use the union blob for searching given another 
such union blob for target, we actually don't care if the union content 
is a sha1 or an offset, and we don't care if that offset is 32-bit, 
64-bit, little endian or big endian.  All that we care about is that the 
delta entries are sorted according to that union blob for fast binary 
search.  That is really all there is about it and therefore a single 
qsort() does the trick.

Next:

        /*
         * Second pass:
         * - for all non-delta objects, look if it is used as a base for
         *   deltas;
         * - if used as a base, uncompress the object and apply all deltas,
         *   recursively checking if the resulting object is used as a base
         *   for some more deltas.
         */

Here we want to fill out the blanks in the objects[] array for delta 
objects.  It is of course much more efficient to start from a given base 
object, inflate it, and use it successively on all delta objects that 
depend on it.  And let's do it recursively if the delta is also a 
base object itself while at it.

So how does the dreaded union comes into play?  It is simple.  For each 
non-delta objects, we create two union blobs.  One with the sha1 member, 
the other with the offset member.  Those blobs are used as the key to 
search for a range of deltas that have that non-delta object for base.  
This has the possibility to find two ranges: one for deltas with base as 
sha1, and the other range for deltas with offsets as sha1.  Once the 
delta ranges are known, the deltas are applied on that base object, the 
resulting object's sha1 is computed and 
the whole thing is repeated for those resulting objects as potential 
base objects for more deltas.

Junio mentioned the possibility for collision between a union value that 
should be treated as a sha1 and a union value that should be treated as 
a delta base offset.  Although such a collision is extremely improbable, 
it is still possible and if so the wrong delta type could be applied on 
top of the wrong base object.  To avoid this issue, a test is made on 
the delta type before it is replayed.  So if the search key 
was created using the sha1 union member, we make sure that it is used 
with OBJ_REF_DELTA objects only.  And ditto for the search key created 
with the offset union member.  This way there just can not be any 
mistake even if there happens to be a collision in the base object 
reference namespace.

Finally, the objects[] array is completely populated with objects sha1 
and offsets. The dreaded union is completely forgotten (the deltas array 
is actually freed).  And only then the pack index is written out to 
disk.

> Nico, could we have an un-uglied version please?

I'm really sorry to say that I don't know how I could make it less 
"ugly".  To me it is beautiful and efficient as is and I don't see why 
it needs to be done another way.  If you still think otherwise then your 
guidance would be highly appreciated.

Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  0:57                       ` Linus Torvalds
@ 2006-10-18  2:08                         ` Nicolas Pitre
  2006-10-18  3:12                           ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-18  2:08 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Sergey Vlasov, Junio C Hamano, git

On Tue, 17 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> > > 
> > > .. and it sorts _differently_ on a big-endian vs little-endian thing, 
> > > doesn't it?
> > 
> > Sure.  But who cares?  The sorting is just there to 1) perform binary 
> > searches on the list of deltas based from a given object, and 2) find a 
> > list of all deltas with the same base object.
> 
> _I_ care.

OK, So I do.

> The new code is messy. It's fragile, and already showed one very 
> fundamental bug which depended on architectures.

My stance is that it is not fragile.  Sure it had one bug that depended 
on an architecture difference, but so was commit ac58c7b18e about.

The code also has many consistency checks all over so that it doesn't 
write out garbage if such bugs arise.  And that fundamental bug was 
actually a trivial one that was caught right away by such consistency 
check.

> These things matter. We have had very few bugs in git, and one of the 
> reasons is (I believe) that we haven't had ad-hoc code. I get _very_ 
> nervous when you mix up SHA1 names with somethign totally different 
> without even a flag to say which one it is. That's just nasty.

But there _is_ a flag for damn sake.  Did you at least try to understand 
the code and not just skim over it from 10000 feet above?

It is really simple:

 - if the found union content matches with a reference union initialized 
   through the sha1 member then deltas[j].obj->type == OBJ_REF_DELTA 
   must be true.

 - if the found union content matches with a reference union initialized 
   through the sha1 member then deltas[j].obj->type == OBJ_OFS_DELTA 
   must be true.

 - For all deltas with deltas[j].obj->type == OBJ_REF_DELTA there can 
   not be more than one of them with the same union value.

 - For all deltas with deltas[j].obj->type == OBJ_OFS_DELTA there can 
   not be more than one of them with the same union value.

There is _no_ confusion possible.

> The fact that the code then behaves (and behave_d_) differently on 
> different architectures is just a sign of the problem.

Does this mean that, with your own change to xdiff that has just been 
committed, you actually created a "problem"?  Because this is a change 
that creates different behaviors whether a 32-bit or 64-bit architecture 
is used, Right?

But of course not.  We want it to behave differently on 64-bit than 
32-bit.  My code is in the _same_ camp since I explicitly want it to 
sort numbers differently whether it is a little endian or big endian 
machine.

So this is not a problem this is a feature, and very by design.

> "Who cares?" is not a good question to ask for a SCM. 

Please just try to understand why I'm claming this is not important in 
this very case.  Please do me this favor.

Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  1:30                       ` Junio C Hamano
@ 2006-10-18  2:23                         ` Nicolas Pitre
  2006-10-18  4:16                           ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-18  2:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 17 Oct 2006, Junio C Hamano wrote:

> The part you pointed out to me about "accidental collision"
> still bothers me somewhat.

I'll try to clear them away.

> Right now we do not produce ref-delta and ofs-delta in the same
> stream, 

It is fully supported nevertheless.

> but if somebody did so then it would mean a disaster to have an 
> accidental collision of an 8-byte offset value plus 12-byte traiing 
> NUL and another base object whose object name happens to match that 
> pattern.

Not really.  The only effect that would have on the sorted list of 
delta entries -- such sorting used to bring all deltas with the same 
base object contigously -- is that those deltas might not be perfectly 
contigous wrt their base object.  This is why there is a test to skip 
deltas if they happen not to be of the expected type.

> I am actually Ok if we say the code assumes one stream has only
> ref-delta or ofs-delta and never both.

I'm perfectly OK with both types completely randomized.

> But then I suspect the first pass of parse_pack_objects() should
> make sure that assumption holds true for the pack being
> inspected and barf if it is not.

This is an unnecessary restriction though.

> Also the second pass do not have to run two find_delta_childs() calls 
> per delta object because by that time we know which kind would never 
> appear in the packfile.

True, but the flexibility is worth having I think.  It makes the thing 
more robust instead of less.

> By the way can we call that find_delta_children() pretty please?

I have no problem with that.

Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  2:08                         ` Nicolas Pitre
@ 2006-10-18  3:12                           ` Linus Torvalds
  2006-10-18  6:09                             ` Davide Libenzi
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2006-10-18  3:12 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Sergey Vlasov, Junio C Hamano, git

On Tue, 17 Oct 2006, Nicolas Pitre wrote:
> 
> But there _is_ a flag for damn sake.  Did you at least try to understand 
> the code and not just skim over it from 10000 feet above?

I only looked at it from the patches, and the actual data structure, 
and they didn't have it, so..

> There is _no_ confusion possible.

Ok. Good.

> Does this mean that, with your own change to xdiff that has just been 
> committed, you actually created a "problem"?  Because this is a change 
> that creates different behaviors whether a 32-bit or 64-bit architecture 
> is used, Right?

If you go back to that discussion, I actually pointed out several times 
that the whole bug _was_ actually introduced exactly because the xdiff 
code used things that behave differently depending on word-size.

My suggestion for a _proper_ fix was to not use "unsigned long" for that, 
and the patch I suggested (and eventually got merged) was to use the _low_ 
bits of the hash, exactly because the low bits are the ones that act the 
same, regardless of wordsize.

> But of course not.  We want it to behave differently on 64-bit than 
> 32-bit.

No, we actually don't. Not for xdiff, at least. The last thing you want is 
for different architectures to get different results. It's horrible. It 
means that bugs are hard to reproduce, and it means that even code that is 
"tested" is actually tested only for a particular architecture.

So the bug in xdiff was _exactly_ that somebody - totally incorrectly - 
thought it should work "better" on 64-bits. 

> Please just try to understand why I'm claming this is not important in 
> this very case.  Please do me this favor.

Maybe the code is fine. Maybe the particular detail wasn't important. But 
the original code didn't have _any_ dependencies on things like structure 
alignment that caused it to do strange things. 

And dammit, the fact is, I think the new format is just worse. I think it 
was a good thing to have the full SHA1 in the pack-file. I think the code 
got less understandable, and had more special cases, just because now we 
have two totally different kinds of deltas. So maybe I'm reacting to the 
fact that I think the bug happened in the first place for a very simple 
reason: the data structure wasn't unambiguous any more. 

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  2:23                         ` Nicolas Pitre
@ 2006-10-18  4:16                           ` Junio C Hamano
  2006-10-18  5:07                             ` Junio C Hamano
  0 siblings, 1 reply; 33+ messages in thread
From: Junio C Hamano @ 2006-10-18  4:16 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

>> but if somebody did so then it would mean a disaster to have an 
>> accidental collision of an 8-byte offset value plus 12-byte traiing 
>> NUL and another base object whose object name happens to match that 
>> pattern.
>
> Not really.  The only effect that would have on the sorted list of 
> delta entries -- such sorting used to bring all deltas with the same 
> base object contigously -- is that those deltas might not be perfectly 
> contigous wrt their base object.  This is why there is a test to skip 
> deltas if they happen not to be of the expected type.

Ah, I misread the code that uses union actually checks the type
in struct delta_entry (which embeds the union).  There won't be
any collision problem and you support both types at the same
time just fine.

And your patch to compare only the first 20-bytes makes sense
(assuming ulong is always shorter than 20-bytes which I think is
safe to assume).

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  4:16                           ` Junio C Hamano
@ 2006-10-18  5:07                             ` Junio C Hamano
  2006-10-18 10:00                               ` Johannes Schindelin
  2006-10-18 13:02                               ` Nicolas Pitre
  0 siblings, 2 replies; 33+ messages in thread
From: Junio C Hamano @ 2006-10-18  5:07 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git, Linus Torvalds

Junio C Hamano <junkio@cox.net> writes:

> Ah, I misread the code that uses union actually checks the type
> in struct delta_entry (which embeds the union).  There won't be
> any collision problem and you support both types at the same
> time just fine.
>
> And your patch to compare only the first 20-bytes makes sense
> (assuming ulong is always shorter than 20-bytes which I think is
> safe to assume).

Does this sound fair (the code is yours, just asking about the
log message)?

If we really wanted to be purist, we could run comparison with
the union and obj->type as two keys, but I do not think it is
worth it.

-- >8 --
From: Nicolas Pitre <nico@cam.org>
Date: Tue, 17 Oct 2006 16:23:26 -0400
Subject: [PATCH] index-pack: compare the first 20-bytes of the key.

The "union delta_base" is a strange beast.  It is a 20-byte
binary blob key to search a binary searchable deltas[] array,
each element of which uses it to represent its base object with
either a full 20-byte SHA-1 or an offset in the pack.  Which
representation is used is determined by another field of the
deltas[] array element, obj->type, so there is no room for
confusion, as long as we make sure we compare the keys for the
same type only with appropriate length.  The code compared the
full union with memcmp().

When storing the in-pack offset, the union was first cleared
before storing an unsigned long, so comparison worked fine.

On 64-bit architectures, however, the union typically is 24-byte
long; the code did not clear the remaining 4-byte alignment
padding when storing a full 20-byte SHA-1 representation.  Using
memcmp() to compare the whole union was wrong.

This fixes the comparison to look at the first 20-bytes of the
union, regardless of the architecture.  As long as ulong is
smaller than 20-bytes this works fine.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---
 index-pack.c |   14 ++++++++++----
 1 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/index-pack.c b/index-pack.c
index fffddd2..56c590e 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -23,6 +23,12 @@ union delta_base {
 	unsigned long offset;
 };

+/*
+ * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
+ * to memcmp() only the first 20 bytes.
+ */
+#define UNION_BASE_SZ	20
+
 struct delta_entry
 {
 	struct object_entry *obj;
@@ -211,7 +217,7 @@ static int find_delta(const union delta_
                 struct delta_entry *delta = &deltas[next];
                 int cmp;

-                cmp = memcmp(base, &delta->base, sizeof(*base));
+                cmp = memcmp(base, &delta->base, UNION_BASE_SZ);
                 if (!cmp)
                         return next;
                 if (cmp < 0) {
@@ -232,9 +238,9 @@ static int find_delta_childs(const union

 	if (first < 0)
 		return -1;
-	while (first > 0 && !memcmp(&deltas[first - 1].base, base, sizeof(*base)))
+	while (first > 0 && !memcmp(&deltas[first - 1].base, base, UNION_BASE_SZ))
 		--first;
-	while (last < end && !memcmp(&deltas[last + 1].base, base, sizeof(*base)))
+	while (last < end && !memcmp(&deltas[last + 1].base, base, UNION_BASE_SZ))
 		++last;
 	*first_index = first;
 	*last_index = last;
@@ -312,7 +318,7 @@ static int compare_delta_entry(const voi
 {
 	const struct delta_entry *delta_a = a;
 	const struct delta_entry *delta_b = b;
-	return memcmp(&delta_a->base, &delta_b->base, sizeof(union delta_base));
+	return memcmp(&delta_a->base, &delta_b->base, UNION_BASE_SZ);
 }

 static void parse_pack_objects(void)
-- 
1.4.2.4.gf9fe

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  3:12                           ` Linus Torvalds
@ 2006-10-18  6:09                             ` Davide Libenzi
  2006-10-18 14:56                               ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Davide Libenzi @ 2006-10-18  6:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git

On Tue, 17 Oct 2006, Linus Torvalds wrote:

> > Does this mean that, with your own change to xdiff that has just been 
> > committed, you actually created a "problem"?  Because this is a change 
> > that creates different behaviors whether a 32-bit or 64-bit architecture 
> > is used, Right?
> 
> If you go back to that discussion, I actually pointed out several times 
> that the whole bug _was_ actually introduced exactly because the xdiff 
> code used things that behave differently depending on word-size.

Ehm, I think there's a little bit of confusion. The incorrect golden ratio 
prime selection for 64 bits machines was coalescing hash indexes into a 
very limited number of buckets, hence creating very bad performance on diff 
operations. The result of the diff would have been exacly the same, just 
coming out after the time for a cup of coffee and a croissant ;)

> So the bug in xdiff was _exactly_ that somebody - totally incorrectly - 
> thought it should work "better" on 64-bits. 

Haha, not exactly. I had just forgot about the incoming 64 bits world at 
the time.

- Davide

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  5:07                             ` Junio C Hamano
@ 2006-10-18 10:00                               ` Johannes Schindelin
  2006-10-18 13:13                                 ` Nicolas Pitre
  2006-10-18 13:02                               ` Nicolas Pitre
  1 sibling, 1 reply; 33+ messages in thread
From: Johannes Schindelin @ 2006-10-18 10:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, git, Linus Torvalds

Hi,

On Tue, 17 Oct 2006, Junio C Hamano wrote:

> +/*
> + * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
> + * to memcmp() only the first 20 bytes.
> + */
> +#define UNION_BASE_SZ	20

Excuse me for joining the game, but why don't you just use the 
recently introduced hashcmp() for that purpose? AFAIU you do exactly that, 
you compare hashes.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  5:07                             ` Junio C Hamano
  2006-10-18 10:00                               ` Johannes Schindelin
@ 2006-10-18 13:02                               ` Nicolas Pitre
  1 sibling, 0 replies; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-18 13:02 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Linus Torvalds

On Tue, 17 Oct 2006, Junio C Hamano wrote:

> Junio C Hamano <junkio@cox.net> writes:
> 
> > Ah, I misread the code that uses union actually checks the type
> > in struct delta_entry (which embeds the union).  There won't be
> > any collision problem and you support both types at the same
> > time just fine.
> >
> > And your patch to compare only the first 20-bytes makes sense
> > (assuming ulong is always shorter than 20-bytes which I think is
> > safe to assume).
> 
> Does this sound fair (the code is yours, just asking about the
> log message)?
> 
> If we really wanted to be purist, we could run comparison with
> the union and obj->type as two keys, but I do not think it is
> worth it.
> 
> -- >8 --
> From: Nicolas Pitre <nico@cam.org>
> Date: Tue, 17 Oct 2006 16:23:26 -0400
> Subject: [PATCH] index-pack: compare the first 20-bytes of the key.
> 
> The "union delta_base" is a strange beast.  It is a 20-byte
> binary blob key to search a binary searchable deltas[] array,
> each element of which uses it to represent its base object with
> either a full 20-byte SHA-1 or an offset in the pack.  Which
> representation is used is determined by another field of the
> deltas[] array element, obj->type, so there is no room for
> confusion, as long as we make sure we compare the keys for the
> same type only with appropriate length.  The code compared the
> full union with memcmp().
> 
> When storing the in-pack offset, the union was first cleared
> before storing an unsigned long, so comparison worked fine.
> 
> On 64-bit architectures, however, the union typically is 24-byte
> long; the code did not clear the remaining 4-byte alignment
> padding when storing a full 20-byte SHA-1 representation.  Using
> memcmp() to compare the whole union was wrong.
> 
> This fixes the comparison to look at the first 20-bytes of the
> union, regardless of the architecture.  As long as ulong is
> smaller than 20-bytes this works fine.
> 
> Signed-off-by: Junio C Hamano <junkio@cox.net>

Signed-off-by: Nicolas Pitre <nico@cam.org>

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18 10:00                               ` Johannes Schindelin
@ 2006-10-18 13:13                                 ` Nicolas Pitre
  0 siblings, 0 replies; 33+ messages in thread
From: Nicolas Pitre @ 2006-10-18 13:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, git, Linus Torvalds

On Wed, 18 Oct 2006, Johannes Schindelin wrote:

> Hi,
> 
> On Tue, 17 Oct 2006, Junio C Hamano wrote:
> 
> > +/*
> > + * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
> > + * to memcmp() only the first 20 bytes.
> > + */
> > +#define UNION_BASE_SZ	20
> 
> Excuse me for joining the game, but why don't you just use the 
> recently introduced hashcmp() for that purpose? AFAIU you do exactly that, 
> you compare hashes.

Yes, and that is what I did originally.

But that could lead to false assumptions (and this thread already proved 
this code has its share of false assumption leads already).  The thing 
is that the memory chunk that is being compared is not always the same 
kind of hash as usually used with hashcmp().  Throughout the code 
hashcmp() is always used with a 20-byte sha1 digest.  In this case it 
can be either a 20-byte sha1 digest, or a long offset value.  And by 
using hashcmp() I would be afraid someone else could assume the hash is 
always a sha1 digest which it is not.

Nicolas

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18  6:09                             ` Davide Libenzi
@ 2006-10-18 14:56                               ` Linus Torvalds
  2006-10-18 16:17                                 ` Davide Libenzi
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2006-10-18 14:56 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git

On Tue, 17 Oct 2006, Davide Libenzi wrote:
> 
> Ehm, I think there's a little bit of confusion. The incorrect golden ratio 
> prime selection for 64 bits machines was coalescing hash indexes into a 
> very limited number of buckets, hence creating very bad performance on diff 
> operations. The result of the diff would have been exacly the same, just 
> coming out after the time for a cup of coffee and a croissant ;)

But my point is, you would have been better off _without_ an algorithm 
that cared about the word-size at all, or with just using "uint32_t".

See? Yes, a "unsigned long" has more bits for hashing on a 64-bit 
architecture. But that's totally the wrong way of thinking about it. YOU 
DO NOT WANT MORE BITS! You want the same damn answer regardless of 
architecture!

A diff algorithm that gives different answers on a 32-bit LE architecture 
than on a 64-bit BE architecture is BROKEN. If I run on x86-64, I want the 
same answers I got on x86-32, and the same ones I get on ppc32. Anything 
else is SIMPLY NOT ACCEPTABLE!

So the whole idea that you should have used 64-bit values was broken, 
broken, broken. You should never have had anything that cared, because 
anything that cares is by definition buggy.

This is why we should use the _low_ bits. Never the high bits.

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18 14:56                               ` Linus Torvalds
@ 2006-10-18 16:17                                 ` Davide Libenzi
  2006-10-18 16:52                                   ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Davide Libenzi @ 2006-10-18 16:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> On Tue, 17 Oct 2006, Davide Libenzi wrote:
> > 
> > Ehm, I think there's a little bit of confusion. The incorrect golden ratio 
> > prime selection for 64 bits machines was coalescing hash indexes into a 
> > very limited number of buckets, hence creating very bad performance on diff 
> > operations. The result of the diff would have been exacly the same, just 
> > coming out after the time for a cup of coffee and a croissant ;)
> 
> But my point is, you would have been better off _without_ an algorithm 
> that cared about the word-size at all, or with just using "uint32_t".

Yes. At the time I picked Knuth's hash function because it was simple and 
fast enough. But it needed special handling for different word sizes, 
exactly like kernel's hash_long() does.



> A diff algorithm that gives different answers on a 32-bit LE architecture 
> than on a 64-bit BE architecture is BROKEN. If I run on x86-64, I want the 
> same answers I got on x86-32, and the same ones I get on ppc32. Anything 
> else is SIMPLY NOT ACCEPTABLE!

Speaking in general, seen at the hash function level, of course an interface 
should not give different result for different word sizes or word endianess. 
Considering the diff algorithm as interface, as I said, the output was 
unaffected by the 64 bits word size. It was just very slow.



- Davide

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18 16:17                                 ` Davide Libenzi
@ 2006-10-18 16:52                                   ` Linus Torvalds
  2006-10-18 21:21                                     ` Davide Libenzi
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2006-10-18 16:52 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git

On Wed, 18 Oct 2006, Davide Libenzi wrote:
>
> Speaking in general, seen at the hash function level, of course an interface 
> should not give different result for different word sizes or word endianess. 
> Considering the diff algorithm as interface, as I said, the output was 
> unaffected by the 64 bits word size. It was just very slow.

Well, even the output may actually be affected, in the case of _real_ hash 
collisions (as opposed to just the hash _list_ collision that XDL_HASHLONG 
caused).

So I actually think it would be better to have "uint32_t" as the hash 
value - because that would mean that all diffs (or, in the case of the 
block-algorithm, the deltas) are guaranteed to give the same results 
regardless of architecture.

Right now, we actually generate a 64-bit hash value (BUT: for short lines, 
it's likely only _interesting_ in the low bits, so the high bits tend to 
have a very high likelihood of being zero). So hash collisions are 
different: on a 32-bit architecture, two lines may have the same hash, 
while on a 64-bit one, they are different.

And together with some of the limiters we have (eg XDL_MAX_EQLIMIT) hash 
collisions can sometimes affect the output.

Admittedly, in _practice_ this is really unlikely to affect anything 
(you'd get a valid diff in either case, they'd just possibly be subtly 
different, and the input data must be _really_ strange to even see that 
case), but I do think that the hash algorithm can matter.

NOTE! I'm not talking about XDL_HASHLONG(), I'm talking about the 
xdl_hash_record() hash, which returns differently-sized hash results on 
32-bit and 64-bit. And there are cases where we _only_ compare the hashes, 
and don't actually double-check the contents.

So I think that in _practice_ you can't see differences between a 32-bit 
version and a 64-bit one, but the possibility is there. Using "uint32_t" 
instead of "unsigned long" to keep track of hashes would avoid that 
theoretical problem (and might actually make for better performance on 
64-bit archtiectures, if only because of denser data structures and thus 
better cache behaviour).

			Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18 16:52                                   ` Linus Torvalds
@ 2006-10-18 21:21                                     ` Davide Libenzi
  2006-10-18 21:48                                       ` Linus Torvalds
  0 siblings, 1 reply; 33+ messages in thread
From: Davide Libenzi @ 2006-10-18 21:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> On Wed, 18 Oct 2006, Davide Libenzi wrote:
> >
> > Speaking in general, seen at the hash function level, of course an interface 
> > should not give different result for different word sizes or word endianess. 
> > Considering the diff algorithm as interface, as I said, the output was 
> > unaffected by the 64 bits word size. It was just very slow.
> 
> Well, even the output may actually be affected, in the case of _real_ hash 
> collisions (as opposed to just the hash _list_ collision that XDL_HASHLONG 
> caused).
> 
> So I actually think it would be better to have "uint32_t" as the hash 
> value - because that would mean that all diffs (or, in the case of the 
> block-algorithm, the deltas) are guaranteed to give the same results 
> regardless of architecture.
> 
> Right now, we actually generate a 64-bit hash value (BUT: for short lines, 
> it's likely only _interesting_ in the low bits, so the high bits tend to 
> have a very high likelihood of being zero). So hash collisions are 
> different: on a 32-bit architecture, two lines may have the same hash, 
> while on a 64-bit one, they are different.
> 
> And together with some of the limiters we have (eg XDL_MAX_EQLIMIT) hash 
> collisions can sometimes affect the output.
> 
> Admittedly, in _practice_ this is really unlikely to affect anything 
> (you'd get a valid diff in either case, they'd just possibly be subtly 
> different, and the input data must be _really_ strange to even see that 
> case), but I do think that the hash algorithm can matter.
> 
> NOTE! I'm not talking about XDL_HASHLONG(), I'm talking about the 
> xdl_hash_record() hash, which returns differently-sized hash results on 
> 32-bit and 64-bit. And there are cases where we _only_ compare the hashes, 
> and don't actually double-check the contents.

The hash value (hence the hash bucket index) simply directs you to the 
bucket where a real record-compare loop is performed. Collisions here 
means only performance loss (you don't spread buckets enough), nothing 
more. So the different behaviour does not apply to the diff algo.
But, it would apply if the knowledge of the hash indexing would be 
"exported", for example inside an external file. Think about a trivial DB 
file where you store hash buckets on file an you want to lookup records 
based on the store hash layout. In that case, of course, if the hash 
function that generated the DB bucket layout is different from the one 
that you use to get the bucket index at lookup time, you're in trouble.
IOW if the hash function result is not "exported" is some way, it doesn't 
really matter if it's an 'unsigned long' or a 'uint32_t'. In the same way 
you cannot export binary structures using 'int' or 'long', and expect to 
be compatible over different architectures.

- Davide

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18 21:21                                     ` Davide Libenzi
@ 2006-10-18 21:48                                       ` Linus Torvalds
  2006-10-18 22:34                                         ` Davide Libenzi
  0 siblings, 1 reply; 33+ messages in thread
From: Linus Torvalds @ 2006-10-18 21:48 UTC (permalink / raw)
  To: Davide Libenzi; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git

On Wed, 18 Oct 2006, Davide Libenzi wrote:
> 
> The hash value (hence the hash bucket index) simply directs you to the 
> bucket where a real record-compare loop is performed.

As far as I can tell not all loops do a real "record-compare" thing.

Some of the hash loops _only_ look at the hash, and as such a bad hash 
will do more than just cause bad performance, it will actually degrade the 
diff itself. Isn't that what XDL_MAX_EQLIMIT effectively does?

Btw, the binary delta generator doesn't seem to have this issue at all: it 
uses "unsigned int" for the hash values, so the xdiff delta generation 
will give the same exact results on 32-bit and 64-bit architectures.

Or was that one of the changes by Nico? (I only looked at the git version 
of that code)

		Linus

^ permalink raw reply	[flat|nested] 33+ messages in thread

* Re: heads-up: git-index-pack in "next" is broken
  2006-10-18 21:48                                       ` Linus Torvalds
@ 2006-10-18 22:34                                         ` Davide Libenzi
  0 siblings, 0 replies; 33+ messages in thread
From: Davide Libenzi @ 2006-10-18 22:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Nicolas Pitre, Sergey Vlasov, Junio C Hamano, git

On Wed, 18 Oct 2006, Linus Torvalds wrote:

> 
> 
> On Wed, 18 Oct 2006, Davide Libenzi wrote:
> > 
> > The hash value (hence the hash bucket index) simply directs you to the 
> > bucket where a real record-compare loop is performed.
> 
> As far as I can tell not all loops do a real "record-compare" thing.
> 
> Some of the hash loops _only_ look at the hash, and as such a bad hash 
> will do more than just cause bad performance, it will actually degrade the 
> diff itself. Isn't that what XDL_MAX_EQLIMIT effectively does?

The XDL_MAX_EQLIMIT is used to limit the search for equal records, in the 
record-discard phase. Note though, that at that point that "ha" value is a 
record-class ID (every different record/line in the input has a unique ID).
Look at what xdl_classify_record() does. So in that case, XDL_HASHLONG can 
really simply be a bitmask. So comparing "ha" in the loop in there, does 
actually the right thing in any case (equal "ha" means really equal 
record).

> Btw, the binary delta generator doesn't seem to have this issue at all: it 
> uses "unsigned int" for the hash values, so the xdiff delta generation 
> will give the same exact results on 32-bit and 64-bit architectures.
> 
> Or was that one of the changes by Nico? (I only looked at the git version 
> of that code)

The binary diff in libxdiff uses a chaining hash, so even in that case it 
wouldn't have made a difference. I think Nico changed the hash to be a 
coalesced hash, and in that case it does change the output.

- Davide

^ permalink raw reply	[flat|nested] 33+ messages in thread

end of thread, other threads:[~2006-10-18 22:34 UTC | newest]

Thread overview: 33+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2006-10-17  4:55 heads-up: git-index-pack in "next" is broken Junio C Hamano
2006-10-17 15:39 ` Nicolas Pitre
2006-10-17 16:07   ` Junio C Hamano
2006-10-17 17:00     ` Nicolas Pitre
2006-10-17 18:11       ` Junio C Hamano
2006-10-17 18:47         ` Nicolas Pitre
2006-10-17 19:36           ` Sergey Vlasov
2006-10-17 20:10             ` Junio C Hamano
2006-10-17 20:25               ` Nicolas Pitre
2006-10-17 20:23             ` Nicolas Pitre
2006-10-17 20:51               ` Linus Torvalds
2006-10-17 21:21                 ` Nicolas Pitre
2006-10-17 21:46                   ` Linus Torvalds
2006-10-18  0:20                     ` Nicolas Pitre
2006-10-18  0:57                       ` Linus Torvalds
2006-10-18  2:08                         ` Nicolas Pitre
2006-10-18  3:12                           ` Linus Torvalds
2006-10-18  6:09                             ` Davide Libenzi
2006-10-18 14:56                               ` Linus Torvalds
2006-10-18 16:17                                 ` Davide Libenzi
2006-10-18 16:52                                   ` Linus Torvalds
2006-10-18 21:21                                     ` Davide Libenzi
2006-10-18 21:48                                       ` Linus Torvalds
2006-10-18 22:34                                         ` Davide Libenzi
2006-10-18  1:30                       ` Junio C Hamano
2006-10-18  2:23                         ` Nicolas Pitre
2006-10-18  4:16                           ` Junio C Hamano
2006-10-18  5:07                             ` Junio C Hamano
2006-10-18 10:00                               ` Johannes Schindelin
2006-10-18 13:13                                 ` Nicolas Pitre
2006-10-18 13:02                               ` Nicolas Pitre
2006-10-17 21:54                 ` Junio C Hamano
2006-10-18  1:38                   ` Nicolas Pitre

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).