git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* git hang with corrupted .pack
@ 2009-10-14  4:22 Andy Isaacson
  2009-10-14 14:23 ` Shawn O. Pearce
  0 siblings, 1 reply; 23+ messages in thread
From: Andy Isaacson @ 2009-10-14  4:22 UTC (permalink / raw)
  To: git

I have a clone of torvalds/linux-2.6.git that got corrupted; many git
commands hang with a backtrace like

% git cat-file -p 60fa3f769f7651a60125a0f44e3ffe3246d7cf39
...

#0  use_pack (p=0x1dcf310, w_cursor=0x7fffb4c0e430, offset=43544957, 
    left=0x7fffb4c0e348) at sha1_file.c:792
#1  0x00000000004990ed in unpack_compressed_entry (p=0x1dcf310, 
    w_curs=0x7fffb4c0e430, curpos=43544957, size=728) at sha1_file.c:1594
#2  0x000000000049b089 in unpack_entry (p=0x1dcf310, obj_offset=43544534, 
    type=0x7fffb4c0e7b4, sizep=0x7fffb4c0e7a8) at sha1_file.c:1821
#3  0x000000000049b5f2 in read_packed_sha1 (
    sha1=0x7fffb4c0e780 "`xxxxxxxxxxxxxxxxxxxxxxxxxxx", 
    type=0x7fffb4c0e7b4, size=0x7fffb4c0e7a8) at sha1_file.c:2054
#4  0x000000000049b6d6 in read_object (
    sha1=0x7fffb4c0e780 "`xxxxxxxxxxxxxxxxxxxxxxxxxxx", 
    type=0x7fffb4c0e7b4, size=0x7fffb4c0e7a8) at sha1_file.c:2142
#5  0x000000000049b765 in read_sha1_file (sha1=0x1dcf310 "", 
    type=0x7fffb4c0e430, size=0x0) at sha1_file.c:2158
#6  0x00000000004128da in cmd_cat_file (argc=<value optimized out>, 
    argv=<value optimized out>, prefix=<value optimized out>)
    at builtin-cat-file.c:125
#7  0x0000000000404293 in handle_internal_command (argc=3, argv=0x7fffb4c0ea30)
    at git.c:246
#8  0x0000000000404454 in main (argc=3, argv=0x7fffb4c0ea30) at git.c:438

(I overwrote the UTF8 that gdb helpfully spewed with 'x'.)

We're looping in unpack_compressed_entry, adding a fprintf immediately
after the call to git_inflate() shows:

git_inflate returned -5 next_in=0x7f5e602bc17d curpos=43544536 avail_in=293462652 avail_out=0
git_inflate returned -5 next_in=0x7f5e602bc17d curpos=43544957 avail_in=293462652 avail_out=0
git_inflate returned -5 next_in=0x7f5e602bc17d curpos=43544957 avail_in=293462652 avail_out=0
git_inflate returned -5 next_in=0x7f5e602bc17d curpos=43544957 avail_in=293462652 avail_out=0
git_inflate returned -5 next_in=0x7f5e602bc17d curpos=43544957 avail_in=293462652 avail_out=0

The pack is corrupt:

% hd -s $((0x2986fd0)) .git/objects/pack/pack-9836f9e49a483ad95e7de546d775a4f247e6ac74.pack
02986fd0  f6 17 44 74 4e d5 98 2d  78 9c 95 51 5d 8b db 30  |..DtN..-x..Q]..0|
02986fe0  10 7c d7 af d8 1f d0 18  cb df 0e a1 04 72 77 10  |.|...........rw.|
02986ff0  b8 f4 a0 97 f7 20 4b ab  58 77 8a 64 24 f9 92 fc  |..... K.Xw.d$...|
02987000  fb ca 4e 08 a5 b4 0f 7d  d3 ce 8e 66 76 77 82 43  |..N....}...fvw.C|
02987010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
02987020  00 00 00 00 00 00 00 00  0f 00 00 00 00 00 00 00  |................|
02987030  0f 00 00 00 33 a4 6d db  94 35 4d 5b 51 74 59 d9  |....3.m..5M[QtY.|
02987040  b5 b4 ca 51 ca a2 2a aa  28 83 88 b9 20 6c 0c bd  |...Q..*.(... l..|
02987050  75 b0 77 d6 08 d8 5d 3f  35 76 a3 0f b0 9a 81 e4  |u.w...]?5v......|
02987060  01 ac 0d 06 36 0c 09 b7  a7 ef 40 69 5d 95 6d 96  |....6.....@i].m.|
02987070  d3 0c 16 69 91 a6 24 a2  27 15 02 3a 78 55 66 f4  |...i..$.'..:xUf.|
02987080  b0 b7 ee 8b 69 e1 61 15  ee af f5 d9 5a 71 4d 74  |....i.a.....ZqMt|
02987090  6c 5f 16 d2 8e 46 b0 a0  ac 49 ac 3b de f4 2a 9a  |l_...F...I.;..*.|
029870a0  15 69 13 f5 ea a8 47 7e  bc bc 2f e1 45 5d 20 9c  |.i....G~../.E] .|
029870b0  2d 74 e3 d1 83 32 10 7a  84 b7 c3 d3 f6 e7 f3 66  |-t...2.z.......f|

and that corruption is almost certainly a kernel bug or hardware
failure, but git shouldn't lock up.

Ilari on #git suggested the following, which does resolve the hangs in
my case.

diff --git a/sha1_file.c b/sha1_file.c
index 4ea0b18..838df82 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1594,6 +1594,8 @@ static void *unpack_compressed_entry(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
+		if (stream.next_in == in)
+			break;
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);

If anyone has a suggestion for how 36 bytes of my .pack got overwritten,
I'm interested in that too; it's a Core 2 Duo (Thinkpad x200) running
Ubuntu Karmic beta with 2.6.31-13 (upgraded from -10 it looks like),
Intel GMA graphics, CONFIG_DMAR=n, ext4 on the internal HDD, which now
that I look at it actually does have a fair number of non-zero SMART
counters:

Device Model:     FUJITSU MHZ2160BH G1
Serial Number:    K60WT8C2HHRS
Firmware Version: 0084000A
User Capacity:    160,041,885,696 bytes
...
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   100   100   046    Pre-fail  Always       -       219593
  2 Throughput_Performance  0x0005   100   100   030    Pre-fail  Offline      -       27721728
  3 Spin_Up_Time            0x0003   100   100   025    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   099   099   000    Old_age   Always       -       406
  5 Reallocated_Sector_Ct   0x0033   100   100   024    Pre-fail  Always       -       8589934592000
  7 Seek_Error_Rate         0x000f   100   100   047    Pre-fail  Always       -       112
  8 Seek_Time_Performance   0x0005   100   100   019    Pre-fail  Offline      -       0
  9 Power_On_Hours          0x0032   097   097   000    Old_age   Always       -       1598
 10 Spin_Retry_Count        0x0013   100   100   020    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       284
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       78
193 Load_Cycle_Count        0x0032   100   100   000    Old_age   Always       -       1216
194 Temperature_Celsius     0x0022   100   100   000    Old_age   Always       -       38 (Lifetime Min/Max 21/46)
195 Hardware_ECC_Recovered  0x001a   100   100   000    Old_age   Always       -       247
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       -       457965568
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   253   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x000f   100   100   060    Pre-fail  Always       -       10448
203 Run_Out_Cancel          0x0002   100   100   000    Old_age   Always       -       1529011503750
240 Head_Flying_Hours       0x003e   200   200   000    Old_age   Always       -       0

-andy

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-14  4:22 git hang with corrupted .pack Andy Isaacson
@ 2009-10-14 14:23 ` Shawn O. Pearce
  2009-10-14 16:09   ` Nicolas Pitre
  2009-10-20 16:52   ` Junio C Hamano
  0 siblings, 2 replies; 23+ messages in thread
From: Shawn O. Pearce @ 2009-10-14 14:23 UTC (permalink / raw)
  To: Andy Isaacson, Junio C Hamano; +Cc: git, Nicolas Pitre

Andy Isaacson <adi@hexapodia.org> wrote:
> We're looping in unpack_compressed_entry, adding a fprintf immediately
> after the call to git_inflate() shows:

Thanks, that was really quite helpful.  Junio/Nico, I think we can
just apply this patch to maint and include it in the next release:

--8<--
[PATCH] sha1_file: Fix infinite loop when pack is corrupted

Some types of corruption to a pack may confuse the deflate stream
which stores an object.  In Andy's reported case a 36 byte region
of the pack was overwritten, leading to what appeared to be a valid
deflate stream that was trying to produce a result larger than our
allocated output buffer could accept.

Z_BUF_ERROR is returned from inflate() if either the input buffer
needs more input bytes, or the output buffer has run out of space.
Previously we only considered the former case, as it meant we needed
to move the stream's input buffer to the next window in the pack.

We now abort the loop if inflate() returns Z_BUF_ERROR without
consuming the entire input buffer it was given, or has filled
the entire output buffer but has not yet returned Z_STREAM_END.
Either state is a clear indicator that this loop is not working
as expected, and should not continue.

This problem cannot occur with loose objects as we open the entire
loose object as a single buffer and treat Z_BUF_ERROR as an error.

Reported-by: Andy Isaacson <adi@hexapodia.org>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
---
 sha1_file.c |    4 ++++
 1 files changed, 4 insertions(+), 0 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4ea0b18..4cc8939 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1357,6 +1357,8 @@ unsigned long get_size_from_delta(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
+		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
+			break;
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1594,6 +1596,8 @@ static void *unpack_compressed_entry(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
+		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
+			break;
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);
-- 
1.6.5.52.g0ff2e

-- 
Shawn.

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-14 14:23 ` Shawn O. Pearce
@ 2009-10-14 16:09   ` Nicolas Pitre
  2009-10-14 16:12     ` Shawn O. Pearce
  2009-10-20 16:52   ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Nicolas Pitre @ 2009-10-14 16:09 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Andy Isaacson, Junio C Hamano, git

On Wed, 14 Oct 2009, Shawn O. Pearce wrote:

> Andy Isaacson <adi@hexapodia.org> wrote:
> > We're looping in unpack_compressed_entry, adding a fprintf immediately
> > after the call to git_inflate() shows:
> 
> Thanks, that was really quite helpful.  Junio/Nico, I think we can
> just apply this patch to maint and include it in the next release:
> 
> --8<--
> [PATCH] sha1_file: Fix infinite loop when pack is corrupted
> 
> Some types of corruption to a pack may confuse the deflate stream
> which stores an object.  In Andy's reported case a 36 byte region
> of the pack was overwritten, leading to what appeared to be a valid
> deflate stream that was trying to produce a result larger than our
> allocated output buffer could accept.
> 
> Z_BUF_ERROR is returned from inflate() if either the input buffer
> needs more input bytes, or the output buffer has run out of space.
> Previously we only considered the former case, as it meant we needed
> to move the stream's input buffer to the next window in the pack.
> 
> We now abort the loop if inflate() returns Z_BUF_ERROR without
> consuming the entire input buffer it was given, or has filled
> the entire output buffer but has not yet returned Z_STREAM_END.
> Either state is a clear indicator that this loop is not working
> as expected, and should not continue.
> 
> This problem cannot occur with loose objects as we open the entire
> loose object as a single buffer and treat Z_BUF_ERROR as an error.
> 
> Reported-by: Andy Isaacson <adi@hexapodia.org>
> Signed-off-by: Shawn O. Pearce <spearce@spearce.org>

Acked-by: Nicolas Pitre <nico@fluxnic.net>

This is unfortunate that making a test case for this isn't exactly 
trivial.


> ---
>  sha1_file.c |    4 ++++
>  1 files changed, 4 insertions(+), 0 deletions(-)
> 
> diff --git a/sha1_file.c b/sha1_file.c
> index 4ea0b18..4cc8939 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1357,6 +1357,8 @@ unsigned long get_size_from_delta(struct packed_git *p,
>  		in = use_pack(p, w_curs, curpos, &stream.avail_in);
>  		stream.next_in = in;
>  		st = git_inflate(&stream, Z_FINISH);
> +		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
> +			break;
>  		curpos += stream.next_in - in;
>  	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
>  		 stream.total_out < sizeof(delta_head));
> @@ -1594,6 +1596,8 @@ static void *unpack_compressed_entry(struct packed_git *p,
>  		in = use_pack(p, w_curs, curpos, &stream.avail_in);
>  		stream.next_in = in;
>  		st = git_inflate(&stream, Z_FINISH);
> +		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
> +			break;
>  		curpos += stream.next_in - in;
>  	} while (st == Z_OK || st == Z_BUF_ERROR);
>  	git_inflate_end(&stream);
> -- 
> 1.6.5.52.g0ff2e
> 
> -- 
> Shawn.
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-14 16:09   ` Nicolas Pitre
@ 2009-10-14 16:12     ` Shawn O. Pearce
  2009-10-14 16:42       ` Nicolas Pitre
  0 siblings, 1 reply; 23+ messages in thread
From: Shawn O. Pearce @ 2009-10-14 16:12 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Andy Isaacson, Junio C Hamano, git

Nicolas Pitre <nico@fluxnic.net> wrote:
> > Some types of corruption to a pack may confuse the deflate stream
> > which stores an object.  In Andy's reported case a 36 byte region
> > of the pack was overwritten, leading to what appeared to be a valid
> > deflate stream that was trying to produce a result larger than our
> > allocated output buffer could accept.
...
> This is unfortunate that making a test case for this isn't exactly 
> trivial.

Hmmm.  We could do something like manually create a pack file of
one non-delta blob whose pack header length is 16, but use a zlib
stream whose result body is 64.  Prior to this fix, we'd be stuck
in the infinite loop.  :-)

Its a PITA to create though, you have to hand-craft the test vector
and save it in the repository, we can't produce such a pack with
any real code we ship.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-14 16:12     ` Shawn O. Pearce
@ 2009-10-14 16:42       ` Nicolas Pitre
  2009-10-14 18:03         ` Shawn O. Pearce
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Pitre @ 2009-10-14 16:42 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Andy Isaacson, Junio C Hamano, git

On Wed, 14 Oct 2009, Shawn O. Pearce wrote:

> Nicolas Pitre <nico@fluxnic.net> wrote:
> > > Some types of corruption to a pack may confuse the deflate stream
> > > which stores an object.  In Andy's reported case a 36 byte region
> > > of the pack was overwritten, leading to what appeared to be a valid
> > > deflate stream that was trying to produce a result larger than our
> > > allocated output buffer could accept.
> ...
> > This is unfortunate that making a test case for this isn't exactly 
> > trivial.
> 
> Hmmm.  We could do something like manually create a pack file of
> one non-delta blob whose pack header length is 16, but use a zlib
> stream whose result body is 64.  Prior to this fix, we'd be stuck
> in the infinite loop.  :-)

Ah, of course.

> Its a PITA to create though, you have to hand-craft the test vector
> and save it in the repository, we can't produce such a pack with
> any real code we ship.

Can be done easily with dd though, see do_corrupt_object() in t5303 for 
example.


Nicolas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-14 16:42       ` Nicolas Pitre
@ 2009-10-14 18:03         ` Shawn O. Pearce
  2009-10-14 18:39           ` Nicolas Pitre
  0 siblings, 1 reply; 23+ messages in thread
From: Shawn O. Pearce @ 2009-10-14 18:03 UTC (permalink / raw)
  To: Nicolas Pitre, Junio C Hamano; +Cc: Andy Isaacson, git

> Nicolas Pitre <nico@fluxnic.net> wrote:
> ...
> > This is unfortunate that making a test case for this isn't exactly 
> > trivial.

Junio, can you please squash this in?

diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
index 5132d41..5f6cd4f 100755
--- a/t/t5303-pack-corruption-resilience.sh
+++ b/t/t5303-pack-corruption-resilience.sh
@@ -275,4 +275,13 @@ test_expect_success \
      git cat-file blob $blob_2 > /dev/null &&
      git cat-file blob $blob_3 > /dev/null'
 
+test_expect_success \
+    'corrupting header to have too small output buffer fails unpack' \
+    'create_new_pack &&
+     git prune-packed &&
+     printf "\262\001" | do_corrupt_object $blob_1 0 &&
+     test_must_fail git cat-file blob $blob_1 > /dev/null &&
+     test_must_fail git cat-file blob $blob_2 > /dev/null &&
+     test_must_fail git cat-file blob $blob_3 > /dev/null'
+
 test_done

-- 
Shawn.

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-14 18:03         ` Shawn O. Pearce
@ 2009-10-14 18:39           ` Nicolas Pitre
  2009-10-15  7:39             ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Pitre @ 2009-10-14 18:39 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Junio C Hamano, Andy Isaacson, git

On Wed, 14 Oct 2009, Shawn O. Pearce wrote:

> Junio, can you please squash this in?
> 
> diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh
> index 5132d41..5f6cd4f 100755
> --- a/t/t5303-pack-corruption-resilience.sh
> +++ b/t/t5303-pack-corruption-resilience.sh
> @@ -275,4 +275,13 @@ test_expect_success \
>       git cat-file blob $blob_2 > /dev/null &&
>       git cat-file blob $blob_3 > /dev/null'
>  
> +test_expect_success \
> +    'corrupting header to have too small output buffer fails unpack' \
> +    'create_new_pack &&
> +     git prune-packed &&
> +     printf "\262\001" | do_corrupt_object $blob_1 0 &&
> +     test_must_fail git cat-file blob $blob_1 > /dev/null &&
> +     test_must_fail git cat-file blob $blob_2 > /dev/null &&
> +     test_must_fail git cat-file blob $blob_3 > /dev/null'
> +
>  test_done
> 
> -- 

I confirm this test without the fix reproduces the infinite loop (and 
does stall the test suite).


Nicolas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-14 18:39           ` Nicolas Pitre
@ 2009-10-15  7:39             ` Junio C Hamano
  2009-10-20 15:14               ` Alex Riesen
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2009-10-15  7:39 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Shawn O. Pearce, Junio C Hamano, Andy Isaacson, git

Nicolas Pitre <nico@fluxnic.net> writes:

> I confirm this test without the fix reproduces the infinite loop (and 
> does stall the test suite).

Thanks, both of you.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-15  7:39             ` Junio C Hamano
@ 2009-10-20 15:14               ` Alex Riesen
  2009-10-20 15:23                 ` Sverre Rabbelier
                                   ` (2 more replies)
  0 siblings, 3 replies; 23+ messages in thread
From: Alex Riesen @ 2009-10-20 15:14 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, Shawn O. Pearce, Andy Isaacson, git

[-- Attachment #1: Type: text/plain, Size: 1963 bytes --]

On Thu, Oct 15, 2009 at 09:39, Junio C Hamano <gitster@pobox.com> wrote:
> Nicolas Pitre <nico@fluxnic.net> writes:
>
>> I confirm this test without the fix reproduces the infinite loop (and
>> does stall the test suite).
>
> Thanks, both of you.

I seem to have problems with this change (on Cygwin). Sometimes
accessing an object in a pack fails in unpack_compressed_entry.
When it happens, both avail_in and avail_out of the stream are 0,
and the reported status is Z_BUF_ERROR.
Output with the second attached patch:

error: *** inflate error: 0x862380 size=1256, avail_in=0 (was 697),
avail_out=0 (was 1256)
error: *** unpack_compressed_entry failed
error: failed to read object 3296766eb5531ef051ae392114de5d75556f5613
at offset 2620741 from
.git/objects/pack/pack-996206790aaefbf4d34c86b3ff546bb924546b7c.pack
fatal: object 3296766eb5531ef051ae392114de5d75556f5613 is corrupted

I cannot reproduce the problem on a normal system (a 64bit, reasonably modern
Linux in my case). An attempt to use an upgraded zlib on this cygwin system was
not successful, there was an updated library, but a clean recompile didn't
change anything.

I worked the case around by allocating a bit more than uncompressed data need.
In case someone else also sees the problem, below is how. The size of the
overallocation is arbitrary.

diff --git a/sha1_file.c b/sha1_file.c
index 4cc8939..66c2519 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1585,11 +1585,11 @@ static void *unpack_compressed_entry(struct
packed_git *p,
 	z_stream stream;
 	unsigned char *buffer, *in;

-	buffer = xmalloc(size + 1);
-	buffer[size] = 0;
+	buffer = xmalloc(size + 8);
+	memset(buffer + size, 0, 8);
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size;
+	stream.avail_out = size + 8;

 	git_inflate_init(&stream);
 	do {
-- 
1.6.5.59.g000dd

The problematic repo is a little big to post (it's my cygwin-git repo),
I'll have to find hosting for it first.

[-- Attachment #2: 0001-Workaround-inflate-sometimes-failing-to-unpack-data.diff --]
[-- Type: application/octet-stream, Size: 1341 bytes --]

From 643fdf50e4343c3ce3a4b99183dba8d35a10c877 Mon Sep 17 00:00:00 2001
From: Alex Riesen <raa.lkml@gmail.com>
Date: Tue, 20 Oct 2009 16:42:26 +0200
Subject: [PATCH 1/2] Workaround inflate sometimes fails to unpack data

When it happens, zlib stream's avail_in and avail_out are 0, the returned
status is Z_BUF_ERROR. The values weren't 0 before calling inflate.

error: *** inflate error: 0x862380 size=1256, avail_in=0 (was 697), avail_out=0 (was 1256)
error: *** unpack_compressed_entry failed
error: failed to read object 3296766eb5531ef051ae392114de5d75556f5613 at offset 2620741 from .git/objects/pack/pack-996206790aaefbf4d34c86b3ff546bb924546b7c.pack
fatal: object 3296766eb5531ef051ae392114de5d75556f5613 is corrupted
---
 sha1_file.c |    6 +++---
 1 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4cc8939..66c2519 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1585,11 +1585,11 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	z_stream stream;
 	unsigned char *buffer, *in;
 
-	buffer = xmalloc(size + 1);
-	buffer[size] = 0;
+	buffer = xmalloc(size + 8);
+	memset(buffer + size, 0, 8);
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size;
+	stream.avail_out = size + 8;
 
 	git_inflate_init(&stream);
 	do {
-- 
1.6.5.59.g000dd


[-- Attachment #3: 0002-Debugging-strange-failure-in-zlibs-inflate.diff --]
[-- Type: application/octet-stream, Size: 1910 bytes --]

From fc5b47d953f171d7591349acad9a23d3ec683ce9 Mon Sep 17 00:00:00 2001
From: Alex Riesen <raa.lkml@gmail.com>
Date: Tue, 20 Oct 2009 15:53:33 +0200
Subject: [PATCH 2/2] Debugging strange failure in zlibs inflate

---
 sha1_file.c |   18 ++++++++++++++++++
 1 files changed, 18 insertions(+), 0 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 66c2519..1bf240e 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1594,10 +1594,24 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	git_inflate_init(&stream);
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
+		unsigned prev_ain = stream.avail_in;
+		unsigned prev_aout = stream.avail_out;
+		if (!stream.avail_in)
+			error("*** avail_in=0, inflate will fail!");
+		if (!stream.avail_out)
+			error("*** avail_out=0, inflate will fail!");
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
 		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
+                {
+			error("*** inflate error: %p size=%lu, "
+			      "avail_in=%d (was %u), "
+			      "avail_out=%d (was %u)",
+			      buffer, size,
+			      stream.avail_in, prev_ain,
+			      stream.avail_out, prev_aout);
 			break;
+                }
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);
@@ -1654,6 +1668,8 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
 		delta_base_cached -= ent->size;
 	} else {
 		ret = xmemdupz(ent->data, ent->size);
+		if (!ret)
+			fprintf(stderr, "*** no memory\n");
 	}
 	*type = ent->type;
 	*base_size = ent->size;
@@ -1815,6 +1831,8 @@ void *unpack_entry(struct packed_git *p, off_t obj_offset,
 	case OBJ_BLOB:
 	case OBJ_TAG:
 		data = unpack_compressed_entry(p, &w_curs, curpos, *sizep);
+		if (!data)
+			error("*** unpack_compressed_entry failed");
 		break;
 	default:
 		data = NULL;
-- 
1.6.5.59.g000dd


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 15:14               ` Alex Riesen
@ 2009-10-20 15:23                 ` Sverre Rabbelier
  2009-10-20 15:36                   ` Alex Riesen
  2009-10-26  2:35                 ` Junio C Hamano
  2009-11-03 21:31                 ` Pascal Obry
  2 siblings, 1 reply; 23+ messages in thread
From: Sverre Rabbelier @ 2009-10-20 15:23 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, Nicolas Pitre, Shawn O. Pearce, Andy Isaacson, git

Heya,

On Tue, Oct 20, 2009 at 10:14, Alex Riesen <raa.lkml@gmail.com> wrote:
> -       buffer = xmalloc(size + 1);
> +       buffer = xmalloc(size + 8);

> -       stream.avail_out = size;
> +       stream.avail_out = size + 8;

That seems wrong at first glance, you go from '+1' to '+8' on the
first part, and then from '+0' to '+8' in the second part, am I
missing something obvious?

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 15:23                 ` Sverre Rabbelier
@ 2009-10-20 15:36                   ` Alex Riesen
  0 siblings, 0 replies; 23+ messages in thread
From: Alex Riesen @ 2009-10-20 15:36 UTC (permalink / raw)
  To: Sverre Rabbelier
  Cc: Junio C Hamano, Nicolas Pitre, Shawn O. Pearce, Andy Isaacson, git

On Tue, Oct 20, 2009 at 17:23, Sverre Rabbelier <srabbelier@gmail.com> wrote:
> Heya,
>
> On Tue, Oct 20, 2009 at 10:14, Alex Riesen <raa.lkml@gmail.com> wrote:
>> -       buffer = xmalloc(size + 1);
>> +       buffer = xmalloc(size + 8);
>
>> -       stream.avail_out = size;
>> +       stream.avail_out = size + 8;
>
> That seems wrong at first glance, you go from '+1' to '+8' on the
> first part, and then from '+0' to '+8' in the second part, am I
> missing something obvious?

Yes. The "size" (the variable) is never changed. It is just the output
buffer size, not how many data expected or something like that.
IOW, I just wanted the buffer to be obviously (to zlib) more than enough.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-14 14:23 ` Shawn O. Pearce
  2009-10-14 16:09   ` Nicolas Pitre
@ 2009-10-20 16:52   ` Junio C Hamano
  2009-10-20 17:13     ` Junio C Hamano
  1 sibling, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2009-10-20 16:52 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Andy Isaacson, git, Nicolas Pitre, Alex Riesen

"Shawn O. Pearce" <spearce@spearce.org> writes:

> Z_BUF_ERROR is returned from inflate() if either the input buffer
> needs more input bytes, or the output buffer has run out of space.
> Previously we only considered the former case, as it meant we needed
> to move the stream's input buffer to the next window in the pack.
>
> We now abort the loop if inflate() returns Z_BUF_ERROR without
> consuming the entire input buffer it was given, or has filled
> the entire output buffer but has not yet returned Z_STREAM_END.
> Either state is a clear indicator that this loop is not working
> as expected, and should not continue.

When the inflated contents is of size 0, avail_out would be 0 and avail_in
would still have something because the input stream needs to have the end
of stream marker that is more than zero byte long.

If that is more than one-byte long, and your avail_in originally fed only
the first byte from that sequence, what happens?  Wouldn't inflate eat all
what was given (now avail_in is zero), updated its internal state but it
still hasn't produced anything (avail_out is zero)?

I am not saying the end-of-stream is more than one-byte long (I didn't
check), but we had a similar bug arising from confusing "no more output
data" and "fully consumed input stream" (e.g. 456cdf6 (Fix loose object
uncompression check., 2007-03-19).

Something like that may be what is happening to cause problem Alex is
seeing.

I think the corrupt packdata detection needs an output buffer at least
one-byte larger than what is needed to store the correct result.  Then
when we get BUF_ERROR:

 - We never look at avail to see if it is zero; !avail_out is not the same
   as "it stopped because it ran out of output space".  It might mean
   "there is nothing more to come but the input stream ended before
   signalling that fact to the inflate engine fully".

 - We do look at avail_out to find how much data we ended up getting.  If
   inflate has consumed more buffer than we planned to give it, the stream
   is corrupt (at least it is not what we expected to see);

>  		st = git_inflate(&stream, Z_FINISH);
> +		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
> +			break;
>  		curpos += stream.next_in - in;
>  	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
>  		 stream.total_out < sizeof(delta_head));
> @@ -1594,6 +1596,8 @@ static void *unpack_compressed_entry(struct packed_git *p,
>  		in = use_pack(p, w_curs, curpos, &stream.avail_in);
>  		stream.next_in = in;
>  		st = git_inflate(&stream, Z_FINISH);
> +		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
> +			break;
>  		curpos += stream.next_in - in;
>  	} while (st == Z_OK || st == Z_BUF_ERROR);
>  	git_inflate_end(&stream);
> -- 
> 1.6.5.52.g0ff2e
>
> -- 
> Shawn.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 16:52   ` Junio C Hamano
@ 2009-10-20 17:13     ` Junio C Hamano
  2009-10-20 19:33       ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2009-10-20 17:13 UTC (permalink / raw)
  To: Shawn O. Pearce; +Cc: Andy Isaacson, git, Nicolas Pitre, Alex Riesen

Junio C Hamano <gitster@pobox.com> writes:

>> We now abort the loop if inflate() returns Z_BUF_ERROR without
>> consuming the entire input buffer it was given, or has filled
>> the entire output buffer but has not yet returned Z_STREAM_END.
>> Either state is a clear indicator that this loop is not working
>> as expected, and should not continue.
>
> When the inflated contents is of size 0, avail_out would be 0 and avail_in
> would still have something because the input stream needs to have the end
> of stream marker that is more than zero byte long.

After thinking about this a bit more, I am reasonably sure that this is
it.  The contents does not have to be a 0-length string, but you would hit
this if the pure-data portion of the deflated stream aligns at the end of
your (un)pack window and it happens to require another use_pack() to move
the window to read the end-of-stream signal.  In that situation, the
output buffer has already been filled, but you haven't read the input
stream fully.  Would't the new check incorrectly trigger in such a case?

>>  		st = git_inflate(&stream, Z_FINISH);
>> +		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
>> +			break;

We won't see this on 64-bit platforms because we use larger (un)pack
window and the condition is much less likely to be met.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 17:13     ` Junio C Hamano
@ 2009-10-20 19:33       ` Junio C Hamano
  2009-10-20 19:46         ` Nicolas Pitre
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2009-10-20 19:33 UTC (permalink / raw)
  To: Shawn O. Pearce, Nicolas Pitre; +Cc: Andy Isaacson, git, Alex Riesen

Junio C Hamano <gitster@pobox.com> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>>> We now abort the loop if inflate() returns Z_BUF_ERROR without
>>> consuming the entire input buffer it was given, or has filled
>>> the entire output buffer but has not yet returned Z_STREAM_END.
>>> Either state is a clear indicator that this loop is not working
>>> as expected, and should not continue.
>>
>> When the inflated contents is of size 0, avail_out would be 0 and avail_in
>> would still have something because the input stream needs to have the end
>> of stream marker that is more than zero byte long.
>
> After thinking about this a bit more, I am reasonably sure that this is
> it.  The contents does not have to be a 0-length string, but you would hit
> this if the pure-data portion of the deflated stream aligns at the end of
> your (un)pack window and it happens to require another use_pack() to move
> the window to read the end-of-stream signal.  In that situation, the
> output buffer has already been filled, but you haven't read the input
> stream fully.  Would't the new check incorrectly trigger in such a case?
>
>>>  		st = git_inflate(&stream, Z_FINISH);
>>> +		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
>>> +			break;
>
> We won't see this on 64-bit platforms because we use larger (un)pack
> window and the condition is much less likely to be met.

Perhaps it would be as simple as this?

We deliberately give one byte more than what we expect to see and error
out when we do get that extra byte filled.

 sha1_file.c |   17 +++++++++--------
 1 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4cc8939..8c9f392 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1344,7 +1344,7 @@ unsigned long get_size_from_delta(struct packed_git *p,
 			          off_t curpos)
 {
 	const unsigned char *data;
-	unsigned char delta_head[20], *in;
+	unsigned char delta_head[21], *in;
 	z_stream stream;
 	int st;
 
@@ -1357,13 +1357,14 @@ unsigned long get_size_from_delta(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be! */
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
-		 stream.total_out < sizeof(delta_head));
+		 stream.total_out < (sizeof(delta_head) - 1));
 	git_inflate_end(&stream);
-	if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) {
+	if ((st != Z_STREAM_END) &&
+	    stream.total_out != sizeof(delta_head) - 1) {
 		error("delta data unpack-initial failed");
 		return 0;
 	}
@@ -1589,15 +1590,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	buffer[size] = 0;
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size;
+	stream.avail_out = size + 1;
 
 	git_inflate_init(&stream);
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be! */
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 19:33       ` Junio C Hamano
@ 2009-10-20 19:46         ` Nicolas Pitre
  2009-10-20 20:50           ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Nicolas Pitre @ 2009-10-20 19:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Shawn O. Pearce, Andy Isaacson, git, Alex Riesen

On Tue, 20 Oct 2009, Junio C Hamano wrote:

> Perhaps it would be as simple as this?
> 
> We deliberately give one byte more than what we expect to see and error
> out when we do get that extra byte filled.
> 
>  sha1_file.c |   17 +++++++++--------
>  1 files changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/sha1_file.c b/sha1_file.c
> index 4cc8939..8c9f392 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1344,7 +1344,7 @@ unsigned long get_size_from_delta(struct packed_git *p,
>  			          off_t curpos)
>  {
>  	const unsigned char *data;
> -	unsigned char delta_head[20], *in;
> +	unsigned char delta_head[21], *in;

I didn't spend the time needed to think about this issue and your 
proposed fix yet.  However I think that using sizeof(delta_head)-1 
makes the code a bit confusing.  At this point i'd use:

	int size = sizeof(delta_head) - 1;

and use that variable instead just like it is done in 
unpack_compressed_entry() to have the same code pattern.


Nicolas

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 19:46         ` Nicolas Pitre
@ 2009-10-20 20:50           ` Junio C Hamano
  2009-10-22  6:06             ` Junio C Hamano
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2009-10-20 20:50 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Shawn O. Pearce, Andy Isaacson, git, Alex Riesen

Nicolas Pitre <nico@fluxnic.net> writes:

> I didn't spend the time needed to think about this issue and your 
> proposed fix yet.  However I think that using sizeof(delta_head)-1 
> makes the code a bit confusing.  At this point i'd use:
>
> 	int size = sizeof(delta_head) - 1;
>
> and use that variable instead just like it is done in 
> unpack_compressed_entry() to have the same code pattern.

Sounds good.  Here is a reroll with a bit more explanation.

-- >8 --
From: Junio C Hamano <gitster@pobox.com>
Date: Tue, 20 Oct 2009 12:40:02 -0700
Subject: [PATCH] Fix "corrupt input stream" check while reading from packfiles

An ealier "fix" made us break out of the loop when we get Z_BUF_ERROR back
from inflate(), and either the input stream still had some data to
consume, or we have already got the full output we expected.

This is the same kind of mistake as we corrected with 456cdf6 (Fix loose
object uncompression check., 2007-03-19); it is valid for inflate() to
produce full output before it consumes the input stream fully; e.g.
immediately before reading the end of stream marker.

Instead, detect corrupt input stream by feeding the input as long as
inflate() wants to without detecting a real error, and giving it an output
buffer that is one byte longer than necessary.  If it touches the extra
byte, we know that the input stream is corrupt; otherwise inflate() will
notice the broken input stream by itself.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
 sha1_file.c |   18 ++++++++++--------
 1 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4cc8939..f0907b8 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1344,7 +1344,8 @@ unsigned long get_size_from_delta(struct packed_git *p,
 			          off_t curpos)
 {
 	const unsigned char *data;
-	unsigned char delta_head[20], *in;
+	unsigned char delta_head[21], *in;
+	unsigned long expected_size = sizeof(delta_head) - 1;
 	z_stream stream;
 	int st;
 
@@ -1357,13 +1358,14 @@ unsigned long get_size_from_delta(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be! */
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
-		 stream.total_out < sizeof(delta_head));
+		 stream.total_out < expected_size);
 	git_inflate_end(&stream);
-	if ((st != Z_STREAM_END) && stream.total_out != sizeof(delta_head)) {
+	if ((st != Z_STREAM_END) &&
+	    stream.total_out != expected_size) {
 		error("delta data unpack-initial failed");
 		return 0;
 	}
@@ -1589,15 +1591,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	buffer[size] = 0;
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size;
+	stream.avail_out = size + 1;
 
 	git_inflate_init(&stream);
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be! */
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);
-- 
1.6.5.1.107.gba912

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 20:50           ` Junio C Hamano
@ 2009-10-22  6:06             ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2009-10-22  6:06 UTC (permalink / raw)
  To: Nicolas Pitre, Shawn O. Pearce; +Cc: Andy Isaacson, git, Alex Riesen

Junio C Hamano <gitster@pobox.com> writes:

> Nicolas Pitre <nico@fluxnic.net> writes:
>
>> I didn't spend the time needed to think about this issue and your 
>> proposed fix yet.

I looked at the issue again, and came to the conclusion that we need quite
different fix for the two inflate callsites, as they do quite different
things.

-- >8 --
Subject: Fix incorrect error check while reading deflated pack data

The loop in get_size_from_delta() feeds a deflated delta data from the
pack stream _until_ we get inflated result of 20 bytes[*] or we reach the
end of stream.

    Side note. This magic number 20 does not have anything to do with the
    size of the hash we use, but comes from 1a3b55c (reduce delta head
    inflated size, 2006-10-18).

The loop reads like this:

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
        curpos += stream.next_in - in;
    } while ((st == Z_OK || st == Z_BUF_ERROR) &&
             stream.total_out < sizeof(delta_head));

This git_inflate() can return:

 - Z_STREAM_END, if use_pack() fed it enough input and the delta itself
   was smaller than 20 bytes;

 - Z_OK, when some progress has been made;

 - Z_BUF_ERROR, if no progress is possible, because we either ran out of
   input (due to corrupt pack), or we ran out of output before we saw the
   end of the stream.

The fix b3118bd (sha1_file: Fix infinite loop when pack is corrupted,
2009-10-14) attempted was against a corruption that appears to be a valid
stream that produces a result larger than the output buffer, but we are
not even trying to read the stream to the end in this loop.  If avail_out
becomes zero, total_out will be the same as sizeof(delta_head) so the loop
will terminate without the "fix".  There is no fix from b3118bd needed for
this loop, in other words.

The loop in unpack_compressed_entry() is quite a different story.  It
feeds a deflated stream (either delta or base) and allows the stream to
produce output up to what we expect but no more.

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
        curpos += stream.next_in - in;
    } while (st == Z_OK || st == Z_BUF_ERROR)

This _does_ risk falling into an endless interation, as we can exhaust
avail_out if the length we expect is smaller than what the stream wants to
produce (due to pack corruption).  In such a case, avail_out will become
zero and inflate() will return Z_BUF_ERROR, while avail_in may (or may
not) be zero.

But this is not a right fix:

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
+       if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out)
+               break; /* wants more input??? */
        curpos += stream.next_in - in;
    } while (st == Z_OK || st == Z_BUF_ERROR)

as Z_BUF_ERROR from inflate() may be telling us that avail_in has also run
out before reading the end of stream marker.  In such a case, both avail_in
and avail_out would be zero, and the loop should iterate to allow the end
of stream marker to be seen by inflate from the input stream.

The right fix for this loop is likely to be to increment the initial
avail_out by one (we allocate one extra byte to terminate it with NUL
anyway, so there is no risk to overrun the buffer), and break out if we
see that avail_out has become zero, in order to detect that the stream
wants to produce more than what we expect.  After the loop, we have a
check that exactly tests this condition:

    if ((st != Z_STREAM_END) || stream.total_out != size) {
        free(buffer);
        return NULL;
    }

So here is a patch (without my previous botched attempts) to fix this
issue.  The first hunk reverts the corresponding hunk from b3118bd, and
the second hunk is the same fix proposed earlier. 

---
 sha1_file.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4cc8939..63981fb 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1357,8 +1357,6 @@ unsigned long get_size_from_delta(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1589,15 +1587,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	buffer[size] = 0;
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size;
+	stream.avail_out = size + 1;
 
 	git_inflate_init(&stream);
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 15:14               ` Alex Riesen
  2009-10-20 15:23                 ` Sverre Rabbelier
@ 2009-10-26  2:35                 ` Junio C Hamano
  2009-10-26  7:07                   ` Alex Riesen
  2009-10-26 14:23                   ` Shawn O. Pearce
  2009-11-03 21:31                 ` Pascal Obry
  2 siblings, 2 replies; 23+ messages in thread
From: Junio C Hamano @ 2009-10-26  2:35 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, Nicolas Pitre, Shawn O. Pearce, Andy Isaacson, git

Alex Riesen <raa.lkml@gmail.com> writes:

> On Thu, Oct 15, 2009 at 09:39, Junio C Hamano <gitster@pobox.com> wrote:
>> Nicolas Pitre <nico@fluxnic.net> writes:
>>
>>> I confirm this test without the fix reproduces the infinite loop (and
>>> does stall the test suite).
>>
>> Thanks, both of you.
>
> I seem to have problems with this change (on Cygwin). Sometimes
> accessing an object in a pack fails in unpack_compressed_entry.
> When it happens, both avail_in and avail_out of the stream are 0,
> and the reported status is Z_BUF_ERROR.

Could you test the patch in

    http://mid.gmane.org/7vd44gm089.fsf@alter.siamese.dyndns.org

without your workaround?

-- >8 --
Subject: Fix incorrect error check while reading deflated pack data

I looked at the issue again, and came to the conclusion that we need quite
different fix for the two inflate callsites, as they do quite different
things.

-- >8 --
Subject: Fix incorrect error check while reading deflated pack data

The loop in get_size_from_delta() feeds a deflated delta data from the
pack stream _until_ we get inflated result of 20 bytes[*] or we reach the
end of stream.

    Side note. This magic number 20 does not have anything to do with the
    size of the hash we use, but comes from 1a3b55c (reduce delta head
    inflated size, 2006-10-18).

The loop reads like this:

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
        curpos += stream.next_in - in;
    } while ((st == Z_OK || st == Z_BUF_ERROR) &&
             stream.total_out < sizeof(delta_head));

This git_inflate() can return:

 - Z_STREAM_END, if use_pack() fed it enough input and the delta itself
   was smaller than 20 bytes;

 - Z_OK, when some progress has been made;

 - Z_BUF_ERROR, if no progress is possible, because we either ran out of
   input (due to corrupt pack), or we ran out of output before we saw the
   end of the stream.

The fix b3118bd (sha1_file: Fix infinite loop when pack is corrupted,
2009-10-14) attempted was against a corruption that appears to be a valid
stream that produces a result larger than the output buffer, but we are
not even trying to read the stream to the end in this loop.  If avail_out
becomes zero, total_out will be the same as sizeof(delta_head) so the loop
will terminate without the "fix".  There is no fix from b3118bd needed for
this loop, in other words.

The loop in unpack_compressed_entry() is quite a different story.  It
feeds a deflated stream (either delta or base) and allows the stream to
produce output up to what we expect but no more.

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
        curpos += stream.next_in - in;
    } while (st == Z_OK || st == Z_BUF_ERROR)

This _does_ risk falling into an endless interation, as we can exhaust
avail_out if the length we expect is smaller than what the stream wants to
produce (due to pack corruption).  In such a case, avail_out will become
zero and inflate() will return Z_BUF_ERROR, while avail_in may (or may
not) be zero.

But this is not a right fix:

    do {
        in = use_pack();
        stream.next_in = in;
        st = git_inflate(&stream, Z_FINISH);
+       if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out)
+               break; /* wants more input??? */
        curpos += stream.next_in - in;
    } while (st == Z_OK || st == Z_BUF_ERROR)

as Z_BUF_ERROR from inflate() may be telling us that avail_in has also run
out before reading the end of stream marker.  In such a case, both avail_in
and avail_out would be zero, and the loop should iterate to allow the end
of stream marker to be seen by inflate from the input stream.

The right fix for this loop is likely to be to increment the initial
avail_out by one (we allocate one extra byte to terminate it with NUL
anyway, so there is no risk to overrun the buffer), and break out if we
see that avail_out has become zero, in order to detect that the stream
wants to produce more than what we expect.  After the loop, we have a
check that exactly tests this condition:

    if ((st != Z_STREAM_END) || stream.total_out != size) {
        free(buffer);
        return NULL;
    }

So here is a patch (without my previous botched attempts) to fix this
issue.  The first hunk reverts the corresponding hunk from b3118bd, and
the second hunk is the same fix proposed earlier. 

---
 sha1_file.c |    8 +++-----
 1 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index 4cc8939..63981fb 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1357,8 +1357,6 @@ unsigned long get_size_from_delta(struct packed_git *p,
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
 		curpos += stream.next_in - in;
 	} while ((st == Z_OK || st == Z_BUF_ERROR) &&
 		 stream.total_out < sizeof(delta_head));
@@ -1589,15 +1587,15 @@ static void *unpack_compressed_entry(struct packed_git *p,
 	buffer[size] = 0;
 	memset(&stream, 0, sizeof(stream));
 	stream.next_out = buffer;
-	stream.avail_out = size;
+	stream.avail_out = size + 1;
 
 	git_inflate_init(&stream);
 	do {
 		in = use_pack(p, w_curs, curpos, &stream.avail_in);
 		stream.next_in = in;
 		st = git_inflate(&stream, Z_FINISH);
-		if (st == Z_BUF_ERROR && (stream.avail_in || !stream.avail_out))
-			break;
+		if (!stream.avail_out)
+			break; /* the payload is larger than it should be */
 		curpos += stream.next_in - in;
 	} while (st == Z_OK || st == Z_BUF_ERROR);
 	git_inflate_end(&stream);

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-26  2:35                 ` Junio C Hamano
@ 2009-10-26  7:07                   ` Alex Riesen
  2009-10-26 14:23                   ` Shawn O. Pearce
  1 sibling, 0 replies; 23+ messages in thread
From: Alex Riesen @ 2009-10-26  7:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Nicolas Pitre, Shawn O. Pearce, Andy Isaacson, git

On Mon, Oct 26, 2009 at 03:35, Junio C Hamano <gitster@pobox.com> wrote:
>
> Could you test the patch in
>
>    http://mid.gmane.org/7vd44gm089.fsf@alter.siamese.dyndns.org
>
> without your workaround?
>
> -- >8 --
> Subject: Fix incorrect error check while reading deflated pack data
>

I did. It works as intended since about 3 days. It also helped my home
server (a 32bit Linux system), which also experienced the problem.

Thanks!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-26  2:35                 ` Junio C Hamano
  2009-10-26  7:07                   ` Alex Riesen
@ 2009-10-26 14:23                   ` Shawn O. Pearce
  1 sibling, 0 replies; 23+ messages in thread
From: Shawn O. Pearce @ 2009-10-26 14:23 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Alex Riesen, Nicolas Pitre, Andy Isaacson, git

Junio C Hamano <gitster@pobox.com> wrote:
> Alex Riesen <raa.lkml@gmail.com> writes:
> > I seem to have problems with this change (on Cygwin). Sometimes
> > accessing an object in a pack fails in unpack_compressed_entry.
> > When it happens, both avail_in and avail_out of the stream are 0,
> > and the reported status is Z_BUF_ERROR.
...
> Subject: Fix incorrect error check while reading deflated pack data

Wow.  
 
> The right fix for this loop is likely to be to increment the initial
> avail_out by one (we allocate one extra byte to terminate it with NUL
> anyway, so there is no risk to overrun the buffer), and break out if we
> see that avail_out has become zero, in order to detect that the stream
> wants to produce more than what we expect.  After the loop, we have a
> check that exactly tests this condition:
> 
>     if ((st != Z_STREAM_END) || stream.total_out != size) {
>         free(buffer);
>         return NULL;
>     }
> 
> So here is a patch (without my previous botched attempts) to fix this
> issue.  The first hunk reverts the corresponding hunk from b3118bd, and
> the second hunk is the same fix proposed earlier. 

ACK.  This looks right to me too.  I forgot about that end-of-stream
marker on the input buffer, and my testing failed to have a stream
where the end-of-stream marker was in the next back window, so this
"fix" wasn't triggering.

*sigh*  Thanks for digging into this and fixing it while I was away.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-10-20 15:14               ` Alex Riesen
  2009-10-20 15:23                 ` Sverre Rabbelier
  2009-10-26  2:35                 ` Junio C Hamano
@ 2009-11-03 21:31                 ` Pascal Obry
  2009-11-03 22:28                   ` Shawn O. Pearce
  2 siblings, 1 reply; 23+ messages in thread
From: Pascal Obry @ 2009-11-03 21:31 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, Nicolas Pitre, Shawn O. Pearce, Andy Isaacson, git

Le 20/10/2009 17:14, Alex Riesen a écrit :
> I seem to have problems with this change (on Cygwin). Sometimes
> accessing an object in a pack fails in unpack_compressed_entry.
> When it happens, both avail_in and avail_out of the stream are 0,
> and the reported status is Z_BUF_ERROR.
> Output with the second attached patch:
>
> error: *** inflate error: 0x862380 size=1256, avail_in=0 (was 697),
> avail_out=0 (was 1256)
> error: *** unpack_compressed_entry failed
> error: failed to read object 3296766eb5531ef051ae392114de5d75556f5613
> at offset 2620741 from
> .git/objects/pack/pack-996206790aaefbf4d34c86b3ff546bb924546b7c.pack
> fatal: object 3296766eb5531ef051ae392114de5d75556f5613 is corrupted

I have this problem on some repository on Cygwin too. For now I have 
reverted to Git 1.6.4 which seems to be working fine.

Pascal.

-- 

--|------------------------------------------------------
--| Pascal Obry                           Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--|    http://www.obry.net  -  http://v2p.fr.eu.org
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver keys.gnupg.net --recv-key F949BD3B

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-11-03 21:31                 ` Pascal Obry
@ 2009-11-03 22:28                   ` Shawn O. Pearce
  2009-11-03 22:34                     ` Pascal Obry
  0 siblings, 1 reply; 23+ messages in thread
From: Shawn O. Pearce @ 2009-11-03 22:28 UTC (permalink / raw)
  To: Pascal Obry
  Cc: Alex Riesen, Junio C Hamano, Nicolas Pitre, Andy Isaacson, git

Pascal Obry <pascal@obry.net> wrote:
> Le 20/10/2009 17:14, Alex Riesen a ??crit :
>> I seem to have problems with this change (on Cygwin). Sometimes
>> accessing an object in a pack fails in unpack_compressed_entry.
>> When it happens, both avail_in and avail_out of the stream are 0,
>> and the reported status is Z_BUF_ERROR.
>> Output with the second attached patch:
>>
>> error: *** inflate error: 0x862380 size=1256, avail_in=0 (was 697),
>> avail_out=0 (was 1256)
>> error: *** unpack_compressed_entry failed
>> error: failed to read object 3296766eb5531ef051ae392114de5d75556f5613
>> at offset 2620741 from
>> .git/objects/pack/pack-996206790aaefbf4d34c86b3ff546bb924546b7c.pack
>> fatal: object 3296766eb5531ef051ae392114de5d75556f5613 is corrupted
>
> I have this problem on some repository on Cygwin too. For now I have  
> reverted to Git 1.6.4 which seems to be working fine.

It was fixed in 1.6.5.2 by Junio.

-- 
Shawn.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: git hang with corrupted .pack
  2009-11-03 22:28                   ` Shawn O. Pearce
@ 2009-11-03 22:34                     ` Pascal Obry
  0 siblings, 0 replies; 23+ messages in thread
From: Pascal Obry @ 2009-11-03 22:34 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Alex Riesen, Junio C Hamano, Nicolas Pitre, Andy Isaacson, git

Shawn,

> It was fixed in 1.6.5.2 by Junio.

Hum... I was using today Git from master. Just after I send my message I 
tried 1.6.5 and it was working... I went back to master and it is now 
working... Really strange... Maybe a file did not get properly 
recompiled. I doubt it... but for sure it was failing and I build git 
every day from master! Strange!

Anyway, it was a false alarm.

Pascal.

-- 

--|------------------------------------------------------
--| Pascal Obry                           Team-Ada Member
--| 45, rue Gabriel Peri - 78114 Magny Les Hameaux FRANCE
--|------------------------------------------------------
--|    http://www.obry.net  -  http://v2p.fr.eu.org
--| "The best way to travel is by means of imagination"
--|
--| gpg --keyserver keys.gnupg.net --recv-key F949BD3B

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2009-11-03 22:34 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-14  4:22 git hang with corrupted .pack Andy Isaacson
2009-10-14 14:23 ` Shawn O. Pearce
2009-10-14 16:09   ` Nicolas Pitre
2009-10-14 16:12     ` Shawn O. Pearce
2009-10-14 16:42       ` Nicolas Pitre
2009-10-14 18:03         ` Shawn O. Pearce
2009-10-14 18:39           ` Nicolas Pitre
2009-10-15  7:39             ` Junio C Hamano
2009-10-20 15:14               ` Alex Riesen
2009-10-20 15:23                 ` Sverre Rabbelier
2009-10-20 15:36                   ` Alex Riesen
2009-10-26  2:35                 ` Junio C Hamano
2009-10-26  7:07                   ` Alex Riesen
2009-10-26 14:23                   ` Shawn O. Pearce
2009-11-03 21:31                 ` Pascal Obry
2009-11-03 22:28                   ` Shawn O. Pearce
2009-11-03 22:34                     ` Pascal Obry
2009-10-20 16:52   ` Junio C Hamano
2009-10-20 17:13     ` Junio C Hamano
2009-10-20 19:33       ` Junio C Hamano
2009-10-20 19:46         ` Nicolas Pitre
2009-10-20 20:50           ` Junio C Hamano
2009-10-22  6:06             ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).