* [PATCH 13/10] tests for various pack index features
@ 2007-04-10 20:26 Nicolas Pitre
  2007-04-11  2:57 ` Junio C Hamano
  0 siblings, 1 reply; 6+ messages in thread
From: Nicolas Pitre @ 2007-04-10 20:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

This is a fairly complete list of tests for various aspects of pack
index versions 1 and 2.

Tests on index v2 include 32-bit and 64-bit offsets, as well as a nice
demonstration of the flawed repacking integrity checks in index
version 1 that index version 2 is intended to fix with the per-object CRC.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---

OK this should really be the last patch for this topic.

diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
new file mode 100755
index 0000000..3371964
--- /dev/null
+++ b/t/t5302-pack-index.sh
@@ -0,0 +1,147 @@
+#!/bin/sh
+#
+# Copyright (c) 2007 Nicolas Pitre
+#
+
+test_description='pack index with 64-bit offsets and object CRC'
+. ./test-lib.sh
+
+test_expect_success \
+    'setup' \
+    'rm -rf .git
+     git-init &&
+     for i in `seq -w 100`
+     do
+         echo $i >file_$i &&
+         dd if=/dev/urandom bs=8k count=1 >>file_$i &&
+         git-update-index --add file_$i || return 1
+     done &&
+     echo 101 >file_101 && tail -c 8k file_100 >>file_101 &&
+     git-update-index --add file_101 &&
+     tree=`git-write-tree` &&
+     commit=`git-commit-tree $tree </dev/null` && {
+	 echo $tree &&
+	 echo $commit &&
+	 git-ls-tree $tree | sed -e "s/.* \\([0-9a-f]*\\)	.*/\\1/"
+     } >obj-list &&
+     git-update-ref HEAD $commit'
+
+test_expect_success \
+    'pack-objects with index version 1' \
+    'pack1=$(git-pack-objects --index-version=1 test-1 <obj-list) &&
+     git-verify-pack -v "test-1-${pack1}.pack"'
+
+test_expect_success \
+    'pack-objects with index version 2' \
+    'pack2=$(git-pack-objects --index-version=2 test-2 <obj-list) &&
+     git-verify-pack -v "test-2-${pack2}.pack"'
+
+test_expect_success \
+    'both packs should be identical' \
+    'cmp "test-1-${pack1}.pack" "test-2-${pack2}.pack"'
+
+test_expect_failure \
+    'index v1 and index v2 should be different' \
+    'cmp "test-1-${pack1}.idx" "test-2-${pack2}.idx"'
+
+test_expect_success \
+    'index-pack with index version 1' \
+    'git-index-pack --index-version=1 -o 1.idx "test-1-${pack1}.pack"'
+
+test_expect_success \
+    'index-pack with index version 2' \
+    'git-index-pack --index-version=2 -o 2.idx "test-1-${pack1}.pack"'
+
+test_expect_success \
+    'index-pack results should match pack-objects ones' \
+    'cmp "test-1-${pack1}.idx" "1.idx" &&
+     cmp "test-2-${pack2}.idx" "2.idx"'
+
+test_expect_success \
+    'index v2: force some 64-bit offsets with pack-objects' \
+    'pack3=$(git-pack-objects --index-version=2,0x40000 test-3 <obj-list) &&
+     git-verify-pack -v "test-3-${pack3}.pack"'
+     
+test_expect_failure \
+    '64-bit offsets: should be different from previous index v2 results' \
+    'cmp "test-2-${pack2}.idx" "test-3-${pack3}.idx"'
+
+test_expect_success \
+    'index v2: force some 64-bit offsets with index-pack' \
+    'git-index-pack --index-version=2,0x40000 -o 3.idx "test-1-${pack1}.pack"'
+
+test_expect_success \
+    '64-bit offsets: index-pack result should match pack-objects one' \
+    'cmp "test-3-${pack3}.idx" "3.idx"'
+
+test_expect_success \
+    '[index v1] 1) stream pack to repository' \
+    'git-index-pack --index-version=1 --stdin < "test-1-${pack1}.pack" &&
+     git-prune-packed &&
+     test "`git-count-objects`" = "0 objects, 0 kilobytes" &&
+     cmp "test-1-${pack1}.pack" ".git/objects/pack/pack-${pack1}.pack" &&
+     cmp "test-1-${pack1}.idx"  ".git/objects/pack/pack-${pack1}.idx"'
+
+test_expect_success \
+    '[index v1] 2) create a stealth corruption in a delta base reference' \
+    '# this test assumes a delta smaller than 16 bytes at the end of the pack
+     git-show-index <1.idx | sort -n | tail -n 1 | (
+       read delta_offs delta_sha1 &&
+       git-cat-file blob "$delta_sha1" > blob_1 &&
+       chmod +w ".git/objects/pack/pack-${pack1}.pack" &&
+       dd of=".git/objects/pack/pack-${pack1}.pack" seek=$(($delta_offs + 1)) \
+	  if=".git/objects/pack/pack-${pack1}.idx" skip=$((256 * 4 + 4)) \
+	  bs=1 count=20 conv=notrunc &&
+       git-cat-file blob "$delta_sha1" > blob_2 )'
+
+test_expect_failure \
+    '[index v1] 3) corrupted delta happily returned wrong data' \
+    'cmp blob_1 blob_2'
+
+test_expect_failure \
+    '[index v1] 4) confirm that the pack is actually corrupted' \
+    'git-fsck --full $commit'
+
+test_expect_success \
+    '[index v1] 5) pack-objects happily reuses corrupted data' \
+    'pack4=$(git-pack-objects test-4 <obj-list) &&
+     test -f "test-4-${pack1}.pack"'
+
+test_expect_failure \
+    '[index v1] 6) newly created pack is BAD !' \
+    'git-verify-pack -v "test-4-${pack1}.pack"'
+
+test_expect_success \
+    '[index v2] 1) stream pack to repository' \
+    'rm -f .git/objects/pack/* &&
+     git-index-pack --index-version=2,0x40000 --stdin < "test-1-${pack1}.pack" &&
+     git-prune-packed &&
+     test "`git-count-objects`" = "0 objects, 0 kilobytes" &&
+     cmp "test-1-${pack1}.pack" ".git/objects/pack/pack-${pack1}.pack" &&
+     cmp "test-3-${pack1}.idx"  ".git/objects/pack/pack-${pack1}.idx"'
+
+test_expect_success \
+    '[index v2] 2) create a stealth corruption in a delta base reference' \
+    '# this test assumes a delta smaller than 16 bytes at the end of the pack
+     git-show-index <1.idx | sort -n | tail -n 1 | (
+       read delta_offs delta_sha1 delta_crc &&
+       git-cat-file blob "$delta_sha1" > blob_3 &&
+       chmod +w ".git/objects/pack/pack-${pack1}.pack" &&
+       dd of=".git/objects/pack/pack-${pack1}.pack" seek=$(($delta_offs + 1)) \
+	  if=".git/objects/pack/pack-${pack1}.idx" skip=$((8 + 256 * 4)) \
+	  bs=1 count=20 conv=notrunc &&
+       git-cat-file blob "$delta_sha1" > blob_4 )'
+
+test_expect_failure \
+    '[index v2] 3) corrupted delta happily returned wrong data' \
+    'cmp blob_3 blob_4'
+
+test_expect_failure \
+    '[index v2] 4) confirm that the pack is actually corrupted' \
+    'git-fsck --full $commit'
+
+test_expect_failure \
+    '[index v2] 5) pack-objects refuses to reuse corrupted data' \
+    'git-pack-objects test-5 <obj-list'
+
+test_done


* Re: [PATCH 13/10] tests for various pack index features
  2007-04-10 20:26 [PATCH 13/10] tests for various pack index features Nicolas Pitre
@ 2007-04-11  2:57 ` Junio C Hamano
  2007-04-11 12:57   ` Nicolas Pitre
  0 siblings, 1 reply; 6+ messages in thread
From: Junio C Hamano @ 2007-04-11  2:57 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: git

Nicolas Pitre <nico@cam.org> writes:

> This is a fairly complete list of tests for various aspects of pack 
> index versions 1 and  2.
>
> Tests on index v2 include 32-bit and 64-bit offsets, as well as a nice 
> demonstration of the flawed repacking integrity checks that index 
> version 2 intend to solve over index version 1 with the per object CRC.
>
> Signed-off-by: Nicolas Pitre <nico@cam.org>
> ---
>
> OK this should really be the last patch for this topic.
>
> diff --git a/t/t5302-pack-index.sh b/t/t5302-pack-index.sh
> new file mode 100755
> index 0000000..3371964
> --- /dev/null
> +++ b/t/t5302-pack-index.sh
> @@ -0,0 +1,147 @@
> +#!/bin/sh
> +#
> +# Copyright (c) 2007 Nicolas Pitre
> +#
> +
> +test_description='pack index with 64-bit offsets and object CRC'
> +. ./test-lib.sh
> +
> +test_expect_success \
> +    'setup' \
> +    'rm -rf .git
> +     git-init &&
> +     for i in `seq -w 100`
> +     do
> +         echo $i >file_$i &&
> +         dd if=/dev/urandom bs=8k count=1 >>file_$i &&
> +         git-update-index --add file_$i || return 1
> +     done &&

Is there a way to make our tests a bit more stable than reading from
urandom?  On the first run fsck was OOM-killed, but on the second and
subsequent runs it was not.  That makes it a bit hard to diagnose.


* Re: [PATCH 13/10] tests for various pack index features
  2007-04-11  2:57 ` Junio C Hamano
@ 2007-04-11 12:57   ` Nicolas Pitre
  2007-04-11 13:09     ` Olivier Galibert
  0 siblings, 1 reply; 6+ messages in thread
From: Nicolas Pitre @ 2007-04-11 12:57 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

On Tue, 10 Apr 2007, Junio C Hamano wrote:

> > +     for i in `seq -w 100`
> > +     do
> > +         echo $i >file_$i &&
> > +         dd if=/dev/urandom bs=8k count=1 >>file_$i &&
> > +         git-update-index --add file_$i || return 1
> > +     done &&
> 
> Is there a way for our tests to be a bit more stable than
> urandom?  I saw on the first run fsck was OOM-killed, but the
> second and subsequent run did not.  It's a bit hard to diagnose.

The problem here is that I really need a large amount of random data that
neither compresses nor deltas between objects.

Hmmm what we need is a random data generator that always produces the 
same thing.  I'll hack something to replace urandom.


Nicolas


* Re: [PATCH 13/10] tests for various pack index features
  2007-04-11 12:57   ` Nicolas Pitre
@ 2007-04-11 13:09     ` Olivier Galibert
  2007-04-11 14:51       ` Shawn O. Pearce
  0 siblings, 1 reply; 6+ messages in thread
From: Olivier Galibert @ 2007-04-11 13:09 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Junio C Hamano, git

On Wed, Apr 11, 2007 at 08:57:09AM -0400, Nicolas Pitre wrote:
> Hmmm what we need is a random data generator that always produces the 
> same thing.  I'll hack something to replace urandom.

Don't hack something, use the standard reference, the Mersenne Twister.

  http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html

PRNGs are like cryptosystems: it's very easy to hack up
something and get it very, very wrong.  And it's unnecessary, since
there are very good ones available.

  OG.


* Re: [PATCH 13/10] tests for various pack index features
  2007-04-11 13:09     ` Olivier Galibert
@ 2007-04-11 14:51       ` Shawn O. Pearce
  2007-04-11 17:29         ` Nicolas Pitre
  0 siblings, 1 reply; 6+ messages in thread
From: Shawn O. Pearce @ 2007-04-11 14:51 UTC (permalink / raw)
  To: Olivier Galibert; +Cc: Nicolas Pitre, Junio C Hamano, git

Olivier Galibert <galibert@pobox.com> wrote:
> On Wed, Apr 11, 2007 at 08:57:09AM -0400, Nicolas Pitre wrote:
> > Hmmm what we need is a random data generator that always produces the 
> > same thing.  I'll hack something to replace urandom.
> 
> Don't hack something, use the standard reference, the Mersenne Twister.
> 
>   http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
> 
> PRNGs are the same as cryptosystems, it's very easy to hack up
> something and get it very, very wrong.  And it's unnecessary, since
> there are very good ones available.

Indeed.  But Mersenne Twister doesn't have code to produce a random
file of size X given an initial constant seed of Y, does it?
A small program to produce X random bytes starting with seed Y
still needs to be hacked up.

Probably the smart thing to do here is to embed a copy of MT with
constant seeds so we always get the same data file produced on
every system, no matter what the implementation of the C library's
rand routine is.

Although MT is not GPL. It has its own license, one with a small
advertising clause...

-- 
Shawn.


* Re: [PATCH 13/10] tests for various pack index features
  2007-04-11 14:51       ` Shawn O. Pearce
@ 2007-04-11 17:29         ` Nicolas Pitre
  0 siblings, 0 replies; 6+ messages in thread
From: Nicolas Pitre @ 2007-04-11 17:29 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Olivier Galibert, Junio C Hamano, git

On Wed, 11 Apr 2007, Shawn O. Pearce wrote:

> Olivier Galibert <galibert@pobox.com> wrote:
> > On Wed, Apr 11, 2007 at 08:57:09AM -0400, Nicolas Pitre wrote:
> > > Hmmm what we need is a random data generator that always produces the 
> > > same thing.  I'll hack something to replace urandom.
> > 
> > Don't hack something, use the standard reference, the Mersenne Twister.
> > 
> >   http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html
> > 
> > PRNGs are the same as cryptosystems, it's very easy to hack up
> > something and get it very, very wrong.  And it's unnecessary, since
> > there are very good ones available.
> 
> Indeed.

Please don't get too excited.

We don't want a full-fledged random number generator with a period of
2^30000 that is faster than light and impossible to predict, etc.

The _only_ thing we want is a convenient way to produce large files of
garbage that is neither compressible nor deltifiable, but still
reproducible.  For that purpose the Mersenne Twister algorithm is _way_
too heavy.

The one that I just implemented basically boils down to:

	while (count--) {
		next = next * 1103515245 + 12345;
		putchar((next >> 16) & 0xff);
	}

and that does the job perfectly well.


Nicolas
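
For illustration only, the loop quoted in the message above can be wrapped
into two small helpers, as a minimal sketch.  The names gen_bytes() and
write_stream(), the explicit seed parameter, and the use of a fixed-width
uint32_t state are assumptions made for this sketch; they are not taken
from the thread, and this is not necessarily how the actual helper was
written.  Only the multiplier, the increment and the ">> 16" byte
extraction come from the message.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Fill buf with n reproducible pseudo-random bytes.
 * *state carries the generator state between calls; the same starting
 * value always yields the same byte sequence on every platform.
 * (uint32_t is an assumption; the message does not give the type.)
 */
void gen_bytes(uint32_t *state, unsigned char *buf, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		/* linear congruential step with the constants from the thread */
		*state = *state * 1103515245u + 12345u;
		/* the low bits of an LCG are weak, so emit bits 16..23 */
		buf[i] = (unsigned char)((*state >> 16) & 0xff);
	}
}

/*
 * Write count generated bytes to out, starting from seed.
 * Returns 0 on success and -1 on a write error.
 */
int write_stream(FILE *out, uint32_t seed, unsigned long count)
{
	uint32_t state = seed;
	unsigned char c;

	while (count--) {
		gen_bytes(&state, &c, 1);
		if (putc(c, out) == EOF)
			return -1;
	}
	return 0;
}
```

A test script could then call something like write_stream(stdout, seed,
8192) from a tiny main() and redirect the output into a file, in place
of reading 8k from /dev/urandom.  Using a different seed per file should
give data that differs from file to file yet stays identical from one
test run to the next.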
