* "failed to read delta base object at..." @ 2008-08-25 16:46 J. Bruce Fields 2008-08-25 18:58 ` Nicolas Pitre 2008-08-25 19:01 ` Linus Torvalds 0 siblings, 2 replies; 32+ messages in thread From: J. Bruce Fields @ 2008-08-25 16:46 UTC (permalink / raw) To: git Today I got this: fatal: failed to read delta base object at 3025976 from /home/bfields/local/linux-2.6/.git/objects/pack/pack-f7261d96cf1161b1b0a1593f673a67d0f2469e9b.pack This has happened once before recently, I believe with a pack that had just been created on a recent fetch. (If I remember correctly, this was soon after a failed suspend/resume cycle that might have interrupted an in-progress fetch; could that possible explain the error?) In that case I reset origin/master, deleted a tag or two, and fetched, and the problem seemed to be fixed. --b. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 16:46 "failed to read delta base object at..." J. Bruce Fields @ 2008-08-25 18:58 ` Nicolas Pitre 2008-08-25 21:18 ` J. Bruce Fields 2008-08-25 19:01 ` Linus Torvalds 1 sibling, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-25 18:58 UTC (permalink / raw) To: J. Bruce Fields; +Cc: git On Mon, 25 Aug 2008, J. Bruce Fields wrote: > Today I got this: > > fatal: failed to read delta base object at 3025976 from > /home/bfields/local/linux-2.6/.git/objects/pack/pack-f7261d96cf1161b1b0a1593f673a67d0f2469e9b.pack > > This has happened once before recently, I believe with a pack that had > just been created on a recent fetch. (If I remember correctly, this was > soon after a failed suspend/resume cycle that might have interrupted an > in-progress fetch; could that possible explain the error?) In that case > I reset origin/master, deleted a tag or two, and fetched, and the > problem seemed to be fixed. The above error is indicative of a corrupted pack on disk. To confirm it you could use 'git verify-pack' with the given pack file. With a sufficiently recent git, you only need to copy over another pack containing the corrupted object, or the object itself in loose form, into your object store to "fix" it. As to the source of disk corruptions... that's up to you to find the cause amongst many (including a failed suspend). Nicolas ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 18:58 ` Nicolas Pitre @ 2008-08-25 21:18 ` J. Bruce Fields 0 siblings, 0 replies; 32+ messages in thread From: J. Bruce Fields @ 2008-08-25 21:18 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git On Mon, Aug 25, 2008 at 02:58:05PM -0400, Nicolas Pitre wrote: > On Mon, 25 Aug 2008, J. Bruce Fields wrote: > > > Today I got this: > > > > fatal: failed to read delta base object at 3025976 from > > /home/bfields/local/linux-2.6/.git/objects/pack/pack-f7261d96cf1161b1b0a1593f673a67d0f2469e9b.pack > > > > This has happened once before recently, I believe with a pack that had > > just been created on a recent fetch. (If I remember correctly, this was > > soon after a failed suspend/resume cycle that might have interrupted an > > in-progress fetch; could that possible explain the error?) In that case > > I reset origin/master, deleted a tag or two, and fetched, and the > > problem seemed to be fixed. > > The above error is indicative of a corrupted pack on disk. To confirm > it you could use 'git verify-pack' with the given pack file. That gives: error: Packfile .git/objects/pack/pack-f7261d96cf1161b1b0a1593f673a67d0f2469e9b.pack SHA1 mismatch with itself No surprise, I assume. > With a sufficiently recent git, you only need to copy over another pack > containing the corrupted object, or the object itself in loose form, > into your object store to "fix" it. Yeah, I just did a git repack -a -d in a known good repository and copied the resulting pack over. Seems OK. Thanks. --b. > As to the source of disk corruptions... that's up to you to find the > cause amongst many (including a failed suspend). ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 16:46 "failed to read delta base object at..." J. Bruce Fields 2008-08-25 18:58 ` Nicolas Pitre @ 2008-08-25 19:01 ` Linus Torvalds 2008-08-25 21:31 ` J. Bruce Fields 1 sibling, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2008-08-25 19:01 UTC (permalink / raw) To: J. Bruce Fields; +Cc: git On Mon, 25 Aug 2008, J. Bruce Fields wrote: > > Today I got this: > > fatal: failed to read delta base object at 3025976 from > /home/bfields/local/linux-2.6/.git/objects/pack/pack-f7261d96cf1161b1b0a1593f673a67d0f2469e9b.pack This is almost certainly due to some corruption. Basically, the call to "cache_or_unpack_entry()" failed, which in turn is because 'unpack_entry()' will have failed. And since you didn't see any other error, that failure is almost certainly due to unpack_compressed_entry() having failed. We don't print out _why_ (which is a bit sad), but the only thing that unpack_compressed_entry() does is to just "inflate()" the data at that offset. So it probably got a zlib data error, or an adler32 crc failure. > This has happened once before recently, I believe with a pack that had > just been created on a recent fetch. (If I remember correctly, this was > soon after a failed suspend/resume cycle that might have interrupted an > in-progress fetch; could that possible explain the error?) In that case > I reset origin/master, deleted a tag or two, and fetched, and the > problem seemed to be fixed. An interrupted fetch shouldn't have caused this, it really should only happen if you have some actual filesystem data error. Something didn't get written back correctly, or the page cache isn't coherent (or it got corrupted by something else like a wild kernel pointer, of course). Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 19:01 ` Linus Torvalds @ 2008-08-25 21:31 ` J. Bruce Fields 2008-08-25 21:37 ` Linus Torvalds 0 siblings, 1 reply; 32+ messages in thread From: J. Bruce Fields @ 2008-08-25 21:31 UTC (permalink / raw) To: Linus Torvalds; +Cc: git Thanks to you and Nicolas for the responses. On Mon, Aug 25, 2008 at 12:01:42PM -0700, Linus Torvalds wrote: > On Mon, 25 Aug 2008, J. Bruce Fields wrote: > > > > Today I got this: > > > > fatal: failed to read delta base object at 3025976 from > > /home/bfields/local/linux-2.6/.git/objects/pack/pack-f7261d96cf1161b1b0a1593f673a67d0f2469e9b.pack > > This is almost certainly due to some corruption. Basically, the call to > "cache_or_unpack_entry()" failed, which in turn is because > 'unpack_entry()' will have failed. > > And since you didn't see any other error, that failure is almost certainly > due to unpack_compressed_entry() having failed. We don't print out _why_ > (which is a bit sad), but the only thing that unpack_compressed_entry() > does is to just "inflate()" the data at that offset. > > So it probably got a zlib data error, or an adler32 crc failure. > > > This has happened once before recently, I believe with a pack that had > > just been created on a recent fetch. (If I remember correctly, this was > > soon after a failed suspend/resume cycle that might have interrupted an > > in-progress fetch; could that possible explain the error?) In that case > > I reset origin/master, deleted a tag or two, and fetched, and the > > problem seemed to be fixed. > > An interrupted fetch shouldn't have caused this, it really should only > happen if you have some actual filesystem data error. Something didn't get > written back correctly, or the page cache isn't coherent (or it got > corrupted by something else like a wild kernel pointer, of course). OK. I seem to recall these pack files are created with something like open write sync close rename ? 
This is just ext3 with data=writeback on a local laptop disk, ubuntu's 2.6.24-21-generic. Would it be any use trying to look more closely at the pack in connection for any hints? (But with my git repo back I'm happy enough to just forget this for now if there's not anything obvious to try.) --b. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 21:31 ` J. Bruce Fields @ 2008-08-25 21:37 ` Linus Torvalds 2008-08-25 22:13 ` J. Bruce Fields 0 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2008-08-25 21:37 UTC (permalink / raw) To: J. Bruce Fields; +Cc: git On Mon, 25 Aug 2008, J. Bruce Fields wrote: > > OK. I seem to recall these pack files are created with something like > > open > write > sync > close > rename > > ? Yes. We're trying to be _extremely_ safe and only do things that should work for everything. > This is just ext3 with data=writeback on a local laptop disk, > ubuntu's 2.6.24-21-generic. Would it be any use trying to look more > closely at the pack in connection for any hints? You still have the packfile that caused problems available somewhere? If so, absolutely yes. If you have the corrupt pack, please make it available. > (But with my git repo back I'm happy enough to just forget this for now > if there's not anything obvious to try.) With the actual corrupt pack, we can make a fairly intelligent guess about exactly what the corruption was. Was it a flipped bit, or what? So if you have it, please do send it over. Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 21:37 ` Linus Torvalds @ 2008-08-25 22:13 ` J. Bruce Fields 2008-08-25 23:59 ` Linus Torvalds 0 siblings, 1 reply; 32+ messages in thread From: J. Bruce Fields @ 2008-08-25 22:13 UTC (permalink / raw) To: Linus Torvalds; +Cc: git On Mon, Aug 25, 2008 at 02:37:39PM -0700, Linus Torvalds wrote: > > > On Mon, 25 Aug 2008, J. Bruce Fields wrote: > > > > OK. I seem to recall these pack files are created with something like > > > > open > > write > > sync > > close > > rename > > > > ? > > Yes. We're trying to be _extremely_ safe and only do things that should > work for everything. > > > This is just ext3 with data=writeback on a local laptop disk, > > ubuntu's 2.6.24-21-generic. Would it be any use trying to look more > > closely at the pack in connection for any hints? > > You still have the packfile that caused problems available somewhere? If > so, absolutely yes. If you have the corrupt pack, please make it > available. > > > (But with my git repo back I'm happy enough to just forget this for now > > if there's not anything obvious to try.) > > With the actual corrupt pack, we can make a fairly intelligent guess about > exactly what the corruption was. Was it a flipped bit, or what? So if you > have it, please do send it over. OK! It's in: http://www.citi.umich.edu/u/bfields/bad-pack/ I assume the .idx file isn't interesting, but it's there anyway in case. --b. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 22:13 ` J. Bruce Fields @ 2008-08-25 23:59 ` Linus Torvalds 2008-08-26 20:43 ` Jason McMullan ` (2 more replies) 0 siblings, 3 replies; 32+ messages in thread From: Linus Torvalds @ 2008-08-25 23:59 UTC (permalink / raw) To: J. Bruce Fields; +Cc: git On Mon, 25 Aug 2008, J. Bruce Fields wrote: > > OK! It's in: > > http://www.citi.umich.edu/u/bfields/bad-pack/ > > I assume the .idx file isn't interesting, but it's there anyway in case. It makes things slightly easier. But yes, the file is corrupt. The most obvious corruption is simply the comparison that git-fsck does with the SHA1 that the pack-file itself contains: all pack-files have at the end the SHA1 of the contents of the pack-file, and git fsck says: error: .git/objects/pack/pack-f7261d96cf1161b1b0a1593f673a67d0f2469e9b.pack SHA1 checksum mismatch so it's definitely not a valid pack-file, and perhaps more importantly, since git calculates the SHA1 as it writes the file out, the data really doesn't match what git wrote, and it got corrupted at some point in between the generation of the data and the reading back. (It _could_ still be some git corruption where git itself corrupted it after it had done the SHA1 writing, but I doubt it). As to the exact error: zlib reports "invalid distance too far back" as the error message when unpacking, which doesn't really tell me much. It's just a common data error. But since I actually _have_ that corrupt object in my pack too, I can actually look at what it should be. It's object 784716eb7fc5735a3e6b6209223508984013819f (it's a blob encoding for arch/arm/mach-pxa/sleep.S from earlier this year), and in my pack-file it's at byte offset 48094431-48098049. So I can actually extract just those bytes from my pack, and compare them against the bytes in your corrupt pack. It's interesting, because the differences aren't all that big, but they aren't a single word either. 
Here's a "diff -u" of "od -t x1" output: --- good-object.hex 2008-08-25 16:33:41.000000000 -0700 +++ bad-object.hex 2008-08-25 16:33:50.000000000 -0700 @@ -113,16 +113,16 @@ 0003400 e1 b1 dc ef 45 fa b1 ec 0f a0 2c 72 1b 25 55 f7 0003420 5a d6 f7 5c 72 dd 8b b2 44 dd 25 e6 c6 33 09 98 0003440 03 9b d1 49 8e 30 e6 7a df 76 76 bb cc c9 fb 44 -0003460 a0 14 8d 7d cd 12 75 81 2b c4 d9 71 27 62 ae b5 -0003500 87 a9 4f 76 c5 90 82 04 2b 15 53 47 ad 97 d1 02 -0003520 98 14 df 45 7c 5f 93 b1 08 fd b6 bf e1 62 1b 00 -0003540 b5 f6 ee 6c 70 30 78 73 74 38 3c 7f f3 e3 e5 90 -0003560 39 7e 2e df e5 e1 c5 c1 29 62 02 fd 61 36 61 7d -0003600 c0 83 a0 4f 17 04 02 c6 ec 59 2e c1 7d 20 41 b1 -0003620 fd 51 56 d0 b1 74 b9 76 81 4c 47 63 f0 10 76 15 -0003640 db a7 0b b8 d1 6a f8 59 73 8c c4 e5 c9 ab 8b cb -0003660 37 32 5d 4a 91 c3 c1 c9 77 33 b4 2c b7 61 a0 37 -0003700 47 f7 9b 3d b6 a5 99 5a cf 49 ef e0 3f 54 08 f4 +0003460 a0 14 8d 7d 22 00 00 00 6b 57 fe ff 55 57 fe ff +0003500 aa 57 fe ff b0 00 0e 00 d9 66 22 00 00 00 00 e0 +0003520 19 9f fe ff ff ff d1 43 fe ff d1 43 fe ff 00 c0 +0003540 bf fe f0 33 69 bf b2 33 69 bf 00 00 00 10 01 fe +0003560 bf fe 66 01 00 00 01 1f 37 17 2e 59 fe ff 00 00 +0003600 00 00 c2 41 fe ff 00 00 01 1f 37 17 44 42 fe ff +0003620 00 01 fe ff 01 1f 37 07 d4 42 fe ff 02 01 a8 44 +0003640 00 f8 bf fe 6b 57 fe ff 55 57 fe ff 97 57 fe ff +0003660 00 f8 bf fe 04 00 fe ff ab 45 fe ff 02 00 fe ff +0003700 ae 41 49 50 aa 40 fe ff cf 49 ef e0 3f 54 08 f4 0003720 1e 2b 6f 59 e5 47 06 f4 80 a7 76 3a 38 a7 88 0a 0003740 30 72 a2 0a ed e1 b0 d5 6d 6f 9e ca 96 c7 28 d0 0003760 2c 19 b6 db 51 97 a9 f6 c3 31 8a 50 22 87 7a 97 and the difference literally seems to be just a block of less than 160 bytes. But it's certainly not a single-bit error, and it's also not a disk block or anything like that. 
Looking at the first line that differs, they are: -0003460 a0 14 8d 7d cd 12 75 81 2b c4 d9 71 27 62 ae b5 +0003460 a0 14 8d 7d 22 00 00 00 6b 57 fe ff 55 57 fe ff so the differences start at offset 03464. The last line is: -0003700 47 f7 9b 3d b6 a5 99 5a cf 49 ef e0 3f 54 08 f4 +0003700 ae 41 49 50 aa 40 fe ff cf 49 ef e0 3f 54 08 f4 so they re-join at 03708 (octal offsets because of 'od' behavior). But in between there seems to be nothing in common. So the corrupt data looks like 22 00 00 00 6b 57 fe ff 55 57 fe ff 0003500 aa 57 fe ff b0 00 0e 00 d9 66 22 00 00 00 00 e0 0003520 19 9f fe ff ff ff d1 43 fe ff d1 43 fe ff 00 c0 0003540 bf fe f0 33 69 bf b2 33 69 bf 00 00 00 10 01 fe 0003560 bf fe 66 01 00 00 01 1f 37 17 2e 59 fe ff 00 00 0003600 00 00 c2 41 fe ff 00 00 01 1f 37 17 44 42 fe ff 0003620 00 01 fe ff 01 1f 37 07 d4 42 fe ff 02 01 a8 44 0003640 00 f8 bf fe 6b 57 fe ff 55 57 fe ff 97 57 fe ff 0003660 00 f8 bf fe 04 00 fe ff ab 45 fe ff 02 00 fe ff 0003700 ae 41 49 50 aa 40 fe ff and I don't see what the patern is, except that there's a lot of "fe ff" in there. 148 bytes, no obvious source. It definitely does _not_ look like compressed input (it's too regular to be something zlib spits out), nor is it text. Nor is it valid x86 assembly, so it's not some code-segment data either. And as far as I can tell, that's the _only_ corruption in the whole file, but I didn't really double-check. Does anybody see a pattern? Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
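For what it's worth, the eyeballing above can be mechanized. Here is a throwaway helper (not from git; purely illustrative) that counts how often a given little-endian 16-bit value appears at any byte offset in a buffer, which makes the density of "fe ff" words in the corrupt run easy to quantify:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Count occurrences of a little-endian 16-bit value at any byte offset
 * in a buffer -- a crude way of quantifying patterns like the "fe ff"
 * runs that stand out in the corrupted region.
 */
static size_t count_le16(const unsigned char *buf, size_t len, uint16_t val)
{
	size_t i, hits = 0;

	for (i = 0; i + 1 < len; i++)
		if ((uint16_t)(buf[i] | (buf[i + 1] << 8)) == val)
			hits++;
	return hits;
}
```

Run over the 148 corrupt bytes, counting 0xfffe would show just how over-represented that word is compared to the compressed data it replaced.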
* Re: "failed to read delta base object at..." 2008-08-25 23:59 ` Linus Torvalds @ 2008-08-26 20:43 ` Jason McMullan 2008-08-26 21:01 ` Jason McMullan 2008-08-26 20:55 ` "failed to read delta base object at..." J. Bruce Fields 2008-08-27 20:14 ` Junio C Hamano 2 siblings, 1 reply; 32+ messages in thread From: Jason McMullan @ 2008-08-26 20:43 UTC (permalink / raw) To: git; +Cc: bfields Linus Torvalds wrote: > So the corrupt data looks like > > [snip] > > And as far as I can tell, that's the _only_ corruption in the whole file, > but I didn't really double-check. > > Does anybody see a pattern? > Was this pack created on a journaled file system? Reiserfs? Ext3? If there's journal corruption in a commonly used filesystem, That Would Be Bad. I would suspect the Reiserfs 'file tail' behavour and journalling have something to do with it, only because I still use and like ReiserV3+LVM (dynamic grow and offline shrink baby!). Only something like silently knifing files in the back would cause me to leave my beloved RedrumFS. Which I probably should. Eventually. Once ZFS changes it's license. HA! - Jason McMullan ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-26 20:43 ` Jason McMullan @ 2008-08-26 21:01 ` Jason McMullan 2008-08-27 17:05 ` Linus Torvalds 0 siblings, 1 reply; 32+ messages in thread From: Jason McMullan @ 2008-08-26 21:01 UTC (permalink / raw) To: git; +Cc: bfields Jason McMullan wrote: > > Was this pack created on a journaled file system? Reiserfs? Ext3? > > If there's journal corruption in a commonly used filesystem, > That Would Be Bad. In private mail, it was indicated that this was an ext3 with the data=writeback option. From mount(1), man page, ext3 section: writeback Data ordering is not preserved - data may be written into the main file system after its metadata has been commit‐ ted to the journal. This is rumoured to be the highest- throughput option. It guarantees internal file system integrity, however it can allow old data to appear in files after a crash and journal recovery. All bets are off when data=writeback. - Jason McMullan ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-26 21:01 ` Jason McMullan @ 2008-08-27 17:05 ` Linus Torvalds 2008-08-27 19:17 ` Nicolas Pitre 0 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2008-08-27 17:05 UTC (permalink / raw) To: Jason McMullan; +Cc: git, bfields On Tue, 26 Aug 2008, Jason McMullan wrote: > > All bets are off when data=writeback. Not the way git writes pack-files. It does a fsync() before moving them into place (at least newer git versions do), so the data is stable. I do worry about wild pointers. I can't recognize the data, and it definitely doesn't look like any git internal data structures, but 16-bit data _is_ what zlib internally uses for things like the decoding tables. So if there is some use-after-free issue, I could imagine things like this happening inside of git. People do occasionally run valgrind on git, though, and it's been clean in the past, but I don't know if that has ever been done on the threaded packing, for example. For example, the corrupting data had patterns like this: 00 f8 bf fe 6b 57 fe ff 55 57 fe ff 97 57 fe ff where the pattern _could_ be something like { 00 f8 febf }, { 6b 57 fffe }, { 55 57 fffe }, { 97 57 fffe }, assuming that the "fe ff" pattern really is meaningful and is a 16-bit little-endian word. And the thign is, zlib "code" tables look exactly like that: typedef struct { unsigned char op; /* operation, extra bits, table bits */ unsigned char bits; /* bits in this part of the code */ unsigned short val; /* offset in table or code value */ } code; /* op values as set by inflate_table(): 00000000 - literal 0000tttt - table link, tttt != 0 is the number of table index bits 0001eeee - length or distance, eeee is the number of extra bits 01100000 - end of block 01000000 - invalid code */ but those particular op/val things don't make sense in that context either. But I don't know zlib that well, maybe the deflate routines use some other model. 
Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-27 17:05 ` Linus Torvalds @ 2008-08-27 19:17 ` Nicolas Pitre 2008-08-27 19:48 ` Linus Torvalds 0 siblings, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-27 19:17 UTC (permalink / raw) To: Linus Torvalds; +Cc: Jason McMullan, git, bfields On Wed, 27 Aug 2008, Linus Torvalds wrote: > > > On Tue, 26 Aug 2008, Jason McMullan wrote: > > > > All bets are off when data=writeback. > > Not the way git writes pack-files. It does a fsync() before moving them > into place (at least newer git versions do), so the data is stable. And isn't the bad data block size and alignment a bit odd for a filesystem crash corruption? > I do worry about wild pointers. I can't recognize the data, and it > definitely doesn't look like any git internal data structures, but 16-bit > data _is_ what zlib internally uses for things like the decoding tables. > > > So if there is some use-after-free issue, I could imagine things like this > happening inside of git. People do occasionally run valgrind on git, > though, and it's been clean in the past, but I don't know if that has ever > been done on the threaded packing, for example. However, in the pack-objects case, it is almost impossible to have such a corruption since the data is SHA1 summed immediately before being written out. In the index-pack case, it is even less likely since the data is SHA1 summed immediately after being written out. So there is no window for random pointer access corrupting the data without also influencing the pack checksum outcome. Therefore, if the pack content doesn't match its checksum, then the corruption must have occurred outside of git, otherwise the checksum would have included corrupted data already and the pack checksum would match. Nicolas ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-27 19:17 ` Nicolas Pitre @ 2008-08-27 19:48 ` Linus Torvalds 2008-08-27 20:46 ` Nicolas Pitre 0 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2008-08-27 19:48 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Jason McMullan, git, bfields On Wed, 27 Aug 2008, Nicolas Pitre wrote: > > And isn't the bad data block size and alignment a bit odd for a > filesystem crash corruption? Yes. If it was a filesystem issue, I'd expect it to be at least disk block aligned (512 bytes, most of the time) and more likely filesystem block aligned (ie mostly 4kB). However, if we were to re-write the file afterwards, it could still get non-block-aligned corruption - simply because there was a non-block-aligned rewrite that got lost. But we don't actually ever do that, except for the header and the SHA1 at the end in some unusual cases. > However, in the pack-objects case, it is almost impossible to have such > a corruption since the data is SHA1 summed immediately before being > written out. Yes. Anything that uses the "sha1write()" model (which includes the regular pack-file _and_ the index) should generally be pretty safe. However, we do have this odd case of fixing up the pack after-the-fact when we receive it from somebody else (because we get a thin pack and don't know how many objects the final result will have). And that case seems to be not as safe, because it - re-reads the file to recompute the SHA1 This is understandable, and it's fairly ok, but it does mean that there is a bigger chance of the SHA1 matching if something has corrupted the file in the meantime! (That was not the case of this corruption, obviously, since the SHA1 didn't match) - but it also forgets to fsync the result, because it only did that in one path rather in all cases of fixup. 
Again, this wasn't actually the cause of this corruption, because the corruption wasn't near the header or tail, so if it had been due to a missed write due to missing an fsync, the pattern would have been different. Anyway, we should fix the latter problem regardless, even if it's (a) damn unlikely and (b) definietly not the case in this thing. The fix is trivial - just move the "fsync_or_die()" into the fixup routine rather than doing it in one of the callers. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Linus --- builtin-pack-objects.c | 1 - pack-write.c | 1 + 2 files changed, 1 insertions(+), 1 deletions(-) diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index 2dadec1..d394c49 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -499,7 +499,6 @@ static void write_pack_file(void) } else { int fd = sha1close(f, NULL, 0); fixup_pack_header_footer(fd, sha1, pack_tmp_name, nr_written); - fsync_or_die(fd, pack_tmp_name); close(fd); } diff --git a/pack-write.c b/pack-write.c index a8f0269..ddcfd37 100644 --- a/pack-write.c +++ b/pack-write.c @@ -179,6 +179,7 @@ void fixup_pack_header_footer(int pack_fd, SHA1_Final(pack_file_sha1, &c); write_or_die(pack_fd, pack_file_sha1, 20); + fsync_or_die(pack_fd, pack_name); } char *index_pack_lockfile(int ip_out) ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-27 19:48 ` Linus Torvalds @ 2008-08-27 20:46 ` Nicolas Pitre 2008-08-29 2:05 ` [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum Nicolas Pitre 0 siblings, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-27 20:46 UTC (permalink / raw) To: Linus Torvalds; +Cc: Jason McMullan, git, bfields [-- Attachment #1: Type: TEXT/PLAIN, Size: 1500 bytes --] On Wed, 27 Aug 2008, Linus Torvalds wrote: > On Wed, 27 Aug 2008, Nicolas Pitre wrote: > > > However, in the pack-objects case, it is almost impossible to have such > > a corruption since the data is SHA1 summed immediately before being > > written out. > > Yes. Anything that uses the "sha1write()" model (which includes the > regular pack-file _and_ the index) should generally be pretty safe. What that means is that if git was the cause of the corruption itself then the pack would still match its checksum (verify-pâck would still fail nevertheless). > However, we do have this odd case of fixing up the pack after-the-fact > when we receive it from somebody else (because we get a thin pack and > don't know how many objects the final result will have). And that case > seems to be not as safe, because it > > - re-reads the file to recompute the SHA1 > > This is understandable, and it's fairly ok, but it does mean that there > is a bigger chance of the SHA1 matching if something has corrupted the > file in the meantime! I think that can be fixed. When reading the file back, it is possible to compute 2 sha1s: one to compare with the recieved one using original pack header, and the second which would be the final one. FRom a certain offset, new objects were added, so that first sha1 is validated against the received one and reset, and at the end, it should correspond to the sha1 of added objects that we should compute when writing them. Nicolas ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum
  2008-08-27 20:46 ` Nicolas Pitre
@ 2008-08-29  2:05   ` Nicolas Pitre
  2008-08-29  2:07     ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Nicolas Pitre
  2008-08-29  4:55     ` [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum Shawn O. Pearce
  0 siblings, 2 replies; 32+ messages in thread
From: Nicolas Pitre @ 2008-08-29 2:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Linus Torvalds, Shawn O. Pearce, git

On Wed, 27 Aug 2008, Nicolas Pitre wrote:

> On Wed, 27 Aug 2008, Linus Torvalds wrote:
>
> > However, we do have this odd case of fixing up the pack after-the-fact
> > when we receive it from somebody else (because we get a thin pack and
> > don't know how many objects the final result will have).  And that case
> > seems to be not as safe, because it
> >
> >  - re-reads the file to recompute the SHA1
> >
> >    This is understandable, and it's fairly ok, but it does mean that
> >    there is a bigger chance of the SHA1 matching if something has
> >    corrupted the file in the meantime!
>
> I think that can be fixed.  When reading the file back, it is possible
> to compute 2 SHA1s: one to compare with the received one using the
> original pack header, and the second which would be the final one.  From
> a certain offset, new objects were added, so that first SHA1 is
> validated against the received one and reset, and at the end, it should
> correspond to the SHA1 of added objects that we should compute when
> writing them.

So here are 3 patches for this in C git (Shawn already did it for jgit).

There are 3 spots where this needs to be plugged: in pack-objects,
index-pack and fast-import.  I did the first two.

I don't think this is that pertinent in fast-import because:

 - we want fast-import to be fast

 - after fast-import is done, a full repack is typically done which
   would revalidate everything (maybe fast-import should use index v2
   with its per object CRC32 though)

But maybe I'm wrong about that.  In any case, I am leaving the
fast-import part to someone more knowledgeable about its code than I am.

Nicolas Pitre (3):
  improve reliability of fixup_pack_header_footer()
  pack-objects: use fixup_pack_header_footer()'s validation mode
  index-pack: use fixup_pack_header_footer()'s validation mode

Nicolas

^ permalink raw reply	[flat|nested] 32+ messages in thread
* [PATCH 1/3] improve reliability of fixup_pack_header_footer()
  2008-08-29  2:05 ` [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum Nicolas Pitre
@ 2008-08-29  2:07   ` Nicolas Pitre
  2008-08-29  2:07     ` [PATCH 2/3] pack-objects: use fixup_pack_header_footer()'s validation mode Nicolas Pitre
  2008-08-29  4:44     ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Shawn O. Pearce
  2008-08-29  4:55   ` [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum Shawn O. Pearce
  1 sibling, 2 replies; 32+ messages in thread
From: Nicolas Pitre @ 2008-08-29 2:07 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Currently, this function has the potential to read corrupted pack data
from disk and give it a valid SHA1 checksum.  Let's add the ability to
validate the SHA1 checksum of existing data along the way, including
before and after any arbitrary point in the pack.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---
 builtin-pack-objects.c |    3 +-
 fast-import.c          |    3 +-
 index-pack.c           |    3 +-
 pack-write.c           |   78 ++++++++++++++++++++++++++++++++++++------
 pack.h                 |    2 +-
 5 files changed, 73 insertions(+), 16 deletions(-)

diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c
index ef3befe..ec80f14 100644
--- a/builtin-pack-objects.c
+++ b/builtin-pack-objects.c
@@ -498,7 +498,8 @@ static void write_pack_file(void)
 			sha1close(f, sha1, CSUM_FSYNC);
 		} else {
 			int fd = sha1close(f, NULL, 0);
-			fixup_pack_header_footer(fd, sha1, pack_tmp_name, nr_written);
+			fixup_pack_header_footer(fd, sha1, pack_tmp_name,
+						 nr_written, NULL, 0);
 			close(fd);
 		}
 
diff --git a/fast-import.c b/fast-import.c
index 7089e6f..d85b3a5 100644
--- a/fast-import.c
+++ b/fast-import.c
@@ -951,7 +951,8 @@ static void end_packfile(void)
 
 		close_pack_windows(pack_data);
 		fixup_pack_header_footer(pack_data->pack_fd, pack_data->sha1,
-				    pack_data->pack_name, object_count);
+				    pack_data->pack_name, object_count,
+				    NULL, 0);
 		close(pack_data->pack_fd);
 		idx_name = keep_pack(create_index());
 
diff --git a/index-pack.c b/index-pack.c
index 728af7d..411b80d 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -982,7 +982,8 @@ int main(int argc, char **argv)
 					nr_objects - nr_objects_initial);
 			stop_progress_msg(&progress, msg);
 			fixup_pack_header_footer(output_fd, sha1,
-						 curr_pack, nr_objects);
+						 curr_pack, nr_objects,
+						 NULL, 0);
 		}
 		if (nr_deltas != nr_resolved_deltas)
 			die("pack has %d unresolved deltas",
diff --git a/pack-write.c b/pack-write.c
index ddcfd37..0c0abce 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -144,41 +144,95 @@ char *write_idx_file(char *index_name, struct pack_idx_entry **objects,
 	return index_name;
 }
 
+/*
+ * Update pack header with object_count and compute new SHA1 for pack data
+ * associated to pack_fd, and write that SHA1 at the end.  That new SHA1
+ * is also returned in new_pack_sha1.
+ *
+ * If partial_pack_sha1 is non null, then the SHA1 of the existing pack
+ * (without the header update) is computed and validated against the one
+ * provided in partial_pack_sha1.  The validation is performed at
+ * partial_pack_offset bytes in the pack file, or at the end of the pack
+ * file if partial_pack_offset is zero.  Also, when partial_pack_offset is
+ * non zero, the SHA1 of the remaining data (i.e. from partial_pack_offset
+ * to the end) is returned in partial_pack_sha1.
+ *
+ * Note that new_pack_sha1 is updated last, so both new_pack_sha1 and
+ * partial_pack_sha1 can refer to the same buffer if the caller is not
+ * interested in the resulting SHA1 of pack data above partial_pack_offset.
+ */
 void fixup_pack_header_footer(int pack_fd,
-			 unsigned char *pack_file_sha1,
+			 unsigned char *new_pack_sha1,
 			 const char *pack_name,
-			 uint32_t object_count)
+			 uint32_t object_count,
+			 unsigned char *partial_pack_sha1,
+			 off_t partial_pack_offset)
 {
 	static const int buf_sz = 128 * 1024;
-	SHA_CTX c;
+	SHA_CTX old_sha1_ctx, new_sha1_ctx;
 	struct pack_header hdr;
 	char *buf;
 
+	SHA1_Init(&old_sha1_ctx);
+	SHA1_Init(&new_sha1_ctx);
+
+	if (partial_pack_sha1 && !partial_pack_offset) {
+		partial_pack_offset = lseek(pack_fd, 0, SEEK_CUR);
+		if (partial_pack_offset == (off_t)-1)
+			die("Can't get size of %s: %s", pack_name, strerror(errno));
+	}
+
 	if (lseek(pack_fd, 0, SEEK_SET) != 0)
-		die("Failed seeking to start: %s", strerror(errno));
+		die("Failed seeking to start of %s: %s", pack_name, strerror(errno));
 	if (read_in_full(pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr))
 		die("Unable to reread header of %s: %s", pack_name, strerror(errno));
 	if (lseek(pack_fd, 0, SEEK_SET) != 0)
-		die("Failed seeking to start: %s", strerror(errno));
+		die("Failed seeking to start of %s: %s", pack_name, strerror(errno));
+	SHA1_Update(&old_sha1_ctx, &hdr, sizeof(hdr));
 	hdr.hdr_entries = htonl(object_count);
+	SHA1_Update(&new_sha1_ctx, &hdr, sizeof(hdr));
 	write_or_die(pack_fd, &hdr, sizeof(hdr));
-
-	SHA1_Init(&c);
-	SHA1_Update(&c, &hdr, sizeof(hdr));
+	partial_pack_offset -= sizeof(hdr);
 
 	buf = xmalloc(buf_sz);
 	for (;;) {
-		ssize_t n = xread(pack_fd, buf, buf_sz);
+		ssize_t m, n;
+		m = (partial_pack_sha1 && partial_pack_offset < buf_sz) ?
+			partial_pack_offset : buf_sz;
+		n = xread(pack_fd, buf, m);
 		if (!n)
 			break;
 		if (n < 0)
 			die("Failed to checksum %s: %s", pack_name, strerror(errno));
-		SHA1_Update(&c, buf, n);
+		SHA1_Update(&new_sha1_ctx, buf, n);
+
+		if (!partial_pack_sha1)
+			continue;
+
+		SHA1_Update(&old_sha1_ctx, buf, n);
+		partial_pack_offset -= n;
+		if (partial_pack_offset == 0) {
+			unsigned char sha1[20];
+			SHA1_Final(sha1, &old_sha1_ctx);
+			if (hashcmp(sha1, partial_pack_sha1) != 0)
+				die("Unexpected checksum for %s "
+				    "(disk corruption?)", pack_name);
+			/*
+			 * Now let's compute the SHA1 of the remainder of the
+			 * pack, which also means making partial_pack_offset
+			 * big enough not to matter anymore.
+			 */
+			SHA1_Init(&old_sha1_ctx);
+			partial_pack_offset = ~partial_pack_offset;
+			partial_pack_offset -= MSB(partial_pack_offset, 1);
+		}
 	}
 	free(buf);
 
-	SHA1_Final(pack_file_sha1, &c);
-	write_or_die(pack_fd, pack_file_sha1, 20);
+	if (partial_pack_sha1)
+		SHA1_Final(partial_pack_sha1, &old_sha1_ctx);
+	SHA1_Final(new_pack_sha1, &new_sha1_ctx);
+	write_or_die(pack_fd, new_pack_sha1, 20);
 	fsync_or_die(pack_fd, pack_name);
 }
 
diff --git a/pack.h b/pack.h
index 76e6aa2..a883334 100644
--- a/pack.h
+++ b/pack.h
@@ -58,7 +58,7 @@ struct pack_idx_entry {
 extern char *write_idx_file(char *index_name, struct pack_idx_entry **objects, int nr_objects, unsigned char *sha1);
 extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr);
 extern int verify_pack(struct packed_git *);
-extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t);
+extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t);
 extern char *index_pack_lockfile(int fd);
 
 #define PH_ERROR_EOF	(-1)
-- 
1.6.0.1.174.g97d7e.dirty

^ permalink raw reply related	[flat|nested] 32+ messages in thread
* [PATCH 2/3] pack-objects: use fixup_pack_header_footer()'s validation mode 2008-08-29 2:07 ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Nicolas Pitre @ 2008-08-29 2:07 ` Nicolas Pitre 2008-08-29 2:07 ` [PATCH 3/3] index-pack: " Nicolas Pitre 2008-08-29 4:44 ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Shawn O. Pearce 1 sibling, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 2:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: git When limiting the pack size, a new header has to be written to the pack and a new SHA1 computed. Make sure that the SHA1 of what is being read back matches the SHA1 of what was written. Signed-off-by: Nicolas Pitre <nico@cam.org> --- builtin-pack-objects.c | 4 ++-- csum-file.c | 6 +++--- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index ec80f14..487943e 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -497,9 +497,9 @@ static void write_pack_file(void) } else if (nr_written == nr_remaining) { sha1close(f, sha1, CSUM_FSYNC); } else { - int fd = sha1close(f, NULL, 0); + int fd = sha1close(f, sha1, 0); fixup_pack_header_footer(fd, sha1, pack_tmp_name, - nr_written, NULL, 0); + nr_written, sha1, 0); close(fd); } diff --git a/csum-file.c b/csum-file.c index ace64f1..2838954 100644 --- a/csum-file.c +++ b/csum-file.c @@ -42,11 +42,11 @@ int sha1close(struct sha1file *f, unsigned char *result, unsigned int flags) sha1flush(f, offset); f->offset = 0; } + SHA1_Final(f->buffer, &f->ctx); + if (result) + hashcpy(result, f->buffer); if (flags & (CSUM_CLOSE | CSUM_FSYNC)) { /* write checksum and close fd */ - SHA1_Final(f->buffer, &f->ctx); - if (result) - hashcpy(result, f->buffer); sha1flush(f, 20); if (flags & CSUM_FSYNC) fsync_or_die(f->fd, f->name); -- 1.6.0.1.174.g97d7e.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 3/3] index-pack: use fixup_pack_header_footer()'s validation mode 2008-08-29 2:07 ` [PATCH 2/3] pack-objects: use fixup_pack_header_footer()'s validation mode Nicolas Pitre @ 2008-08-29 2:07 ` Nicolas Pitre 0 siblings, 0 replies; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 2:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: git When completing a thin pack, a new header has to be written to the pack and a new SHA1 computed. Make sure that the SHA1 of what is being read back matches the SHA1 of what was written for both: the original pack and the appended objects. To do so, a couple write_or_die() calls were converted to sha1write() which has the advantage of doing some buffering as well as handling SHA1 and CRC32 checksum already. Signed-off-by: Nicolas Pitre <nico@cam.org> --- index-pack.c | 43 +++++++++++++++++++++++++------------------ 1 files changed, 25 insertions(+), 18 deletions(-) diff --git a/index-pack.c b/index-pack.c index 411b80d..a6e91fe 100644 --- a/index-pack.c +++ b/index-pack.c @@ -654,7 +654,7 @@ static void parse_pack_objects(unsigned char *sha1) } } -static int write_compressed(int fd, void *in, unsigned int size, uint32_t *obj_crc) +static int write_compressed(struct sha1file *f, void *in, unsigned int size) { z_stream stream; unsigned long maxsize; @@ -674,13 +674,12 @@ static int write_compressed(int fd, void *in, unsigned int size, uint32_t *obj_c deflateEnd(&stream); size = stream.total_out; - write_or_die(fd, out, size); - *obj_crc = crc32(*obj_crc, out, size); + sha1write(f, out, size); free(out); return size; } -static struct object_entry *append_obj_to_pack( +static struct object_entry *append_obj_to_pack(struct sha1file *f, const unsigned char *sha1, void *buf, unsigned long size, enum object_type type) { @@ -696,15 +695,15 @@ static struct object_entry *append_obj_to_pack( s >>= 7; } header[n++] = c; - write_or_die(output_fd, header, n); - obj[0].idx.crc32 = crc32(0, Z_NULL, 0); - obj[0].idx.crc32 = 
crc32(obj[0].idx.crc32, header, n); + crc32_begin(f); + sha1write(f, header, n); obj[0].size = size; obj[0].hdr_size = n; obj[0].type = type; obj[0].real_type = type; obj[1].idx.offset = obj[0].idx.offset + n; - obj[1].idx.offset += write_compressed(output_fd, buf, size, &obj[0].idx.crc32); + obj[1].idx.offset += write_compressed(f, buf, size); + obj[0].idx.crc32 = crc32_end(f); hashcpy(obj->idx.sha1, sha1); return obj; } @@ -716,7 +715,7 @@ static int delta_pos_compare(const void *_a, const void *_b) return a->obj_no - b->obj_no; } -static void fix_unresolved_deltas(int nr_unresolved) +static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved) { struct delta_entry **sorted_by_pos; int i, n = 0; @@ -754,8 +753,8 @@ static void fix_unresolved_deltas(int nr_unresolved) if (check_sha1_signature(d->base.sha1, base_obj.data, base_obj.size, typename(type))) die("local object %s is corrupt", sha1_to_hex(d->base.sha1)); - base_obj.obj = append_obj_to_pack(d->base.sha1, base_obj.data, - base_obj.size, type); + base_obj.obj = append_obj_to_pack(f, d->base.sha1, + base_obj.data, base_obj.size, type); link_base_data(NULL, &base_obj); find_delta_children(&d->base, &first, &last); @@ -875,7 +874,7 @@ int main(int argc, char **argv) const char *keep_name = NULL, *keep_msg = NULL; char *index_name_buf = NULL, *keep_name_buf = NULL; struct pack_idx_entry **idx_objects; - unsigned char sha1[20]; + unsigned char pack_sha1[20]; int nongit = 0; setup_git_directory_gently(&nongit); @@ -962,13 +961,15 @@ int main(int argc, char **argv) parse_pack_header(); objects = xmalloc((nr_objects + 1) * sizeof(struct object_entry)); deltas = xmalloc(nr_objects * sizeof(struct delta_entry)); - parse_pack_objects(sha1); + parse_pack_objects(pack_sha1); if (nr_deltas == nr_resolved_deltas) { stop_progress(&progress); /* Flush remaining pack final 20-byte SHA1. 
*/ flush(); } else { if (fix_thin_pack) { + struct sha1file *f; + unsigned char read_sha1[20], tail_sha1[20]; char msg[48]; int nr_unresolved = nr_deltas - nr_resolved_deltas; int nr_objects_initial = nr_objects; @@ -977,13 +978,19 @@ int main(int argc, char **argv) objects = xrealloc(objects, (nr_objects + nr_unresolved + 1) * sizeof(*objects)); - fix_unresolved_deltas(nr_unresolved); + f = sha1fd(output_fd, curr_pack); + fix_unresolved_deltas(f, nr_unresolved); sprintf(msg, "completed with %d local objects", nr_objects - nr_objects_initial); stop_progress_msg(&progress, msg); - fixup_pack_header_footer(output_fd, sha1, + sha1close(f, tail_sha1, 0); + hashcpy(read_sha1, pack_sha1); + fixup_pack_header_footer(output_fd, pack_sha1, curr_pack, nr_objects, - NULL, 0); + read_sha1, consumed_bytes-20); + if (hashcmp(read_sha1, tail_sha1) != 0) + die("Unexpected tail checksum for %s " + "(disk corruption?)", curr_pack); } if (nr_deltas != nr_resolved_deltas) die("pack has %d unresolved deltas", @@ -996,13 +1003,13 @@ int main(int argc, char **argv) idx_objects = xmalloc((nr_objects) * sizeof(struct pack_idx_entry *)); for (i = 0; i < nr_objects; i++) idx_objects[i] = &objects[i].idx; - curr_index = write_idx_file(index_name, idx_objects, nr_objects, sha1); + curr_index = write_idx_file(index_name, idx_objects, nr_objects, pack_sha1); free(idx_objects); final(pack_name, curr_pack, index_name, curr_index, keep_name, keep_msg, - sha1); + pack_sha1); free(objects); free(index_name_buf); free(keep_name_buf); -- 1.6.0.1.174.g97d7e.dirty ^ permalink raw reply related [flat|nested] 32+ messages in thread
* Re: [PATCH 1/3] improve reliability of fixup_pack_header_footer() 2008-08-29 2:07 ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Nicolas Pitre 2008-08-29 2:07 ` [PATCH 2/3] pack-objects: use fixup_pack_header_footer()'s validation mode Nicolas Pitre @ 2008-08-29 4:44 ` Shawn O. Pearce 2008-08-29 13:08 ` Nicolas Pitre 1 sibling, 1 reply; 32+ messages in thread From: Shawn O. Pearce @ 2008-08-29 4:44 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git Nicolas Pitre <nico@cam.org> wrote: > Currently, this function has the potential to read corrupted pack data > from disk and give it a valid SHA1 checksum. Let's add the ability to > validate SHA1 checksum of existing data along the way, including before > and after any arbitrary point in the pack. I found your implementation in fixup_pack_header to be very contorted. Did you read the JGit patch I posted? I think its implementation is more elegant and easier to follow. Oh, and it's BSD licensed... so easy for a GPLv2 project to borrow. ;-) > void fixup_pack_header_footer(int pack_fd, ... > + if (partial_pack_sha1 && !partial_pack_offset) { > + partial_pack_offset = lseek(pack_fd, 0, SEEK_CUR); > + if (partial_pack_offset == (off_t)-1) Eh? I find this to be nonsense. If the caller gave us a SHA-1 but wants us to do fixup then they have increased the size of the pack. Which means they must pass us the original length. Computing it here is relying on the caller leaving the file pointer positioned at the old end. Who the heck does that after fixing a thin pack? > buf = xmalloc(buf_sz); > for (;;) { > - ssize_t n = xread(pack_fd, buf, buf_sz); > + ssize_t m, n; > + m = (partial_pack_sha1 && partial_pack_offset < buf_sz) ? > + partial_pack_offset : buf_sz; > + n = xread(pack_fd, buf, m); Why not read the full buf_sz per round and then use only part of the buffer in SHA1_Update(&old_sha1_ctx)? It took me a while to figure out what the heck you were trying to measure here. 
It also would be a few less CPU instructions if you always just read buf_sz and leave the min test to after the "if (!partial_pack_sha1)" below. > + if (!partial_pack_sha1) > + continue; > + > + SHA1_Update(&old_sha1_ctx, buf, n); > + partial_pack_offset -= n; > + if (partial_pack_offset == 0) { > + unsigned char sha1[20]; > + SHA1_Final(sha1, &old_sha1_ctx); > + if (hashcmp(sha1, partial_pack_sha1) != 0) > + die("Unexpected checksum for %s " > + "(disk corruption?)", pack_name); > + /* > + * Now let's compute the SHA1 of the remainder of the > + * pack, which also means making partial_pack_offset > + * big enough not to matter anymore. > + */ > + SHA1_Init(&old_sha1_ctx); > + partial_pack_offset = ~partial_pack_offset; > + partial_pack_offset -= MSB(partial_pack_offset, 1); That's freaking brilliant. And something I missed in JGit. The way it's implemented is downright difficult to follow though. I'll admit it took me a good 10 minutes to understand what you were doing here, and why. -- Shawn. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/3] improve reliability of fixup_pack_header_footer() 2008-08-29 4:44 ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Shawn O. Pearce @ 2008-08-29 13:08 ` Nicolas Pitre 2008-08-29 14:30 ` Shawn O. Pearce 0 siblings, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 13:08 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Junio C Hamano, git On Thu, 28 Aug 2008, Shawn O. Pearce wrote: > Nicolas Pitre <nico@cam.org> wrote: > > Currently, this function has the potential to read corrupted pack data > > from disk and give it a valid SHA1 checksum. Let's add the ability to > > validate SHA1 checksum of existing data along the way, including before > > and after any arbitrary point in the pack. > > I found your implementation in fixup_pack_header to be very > contorted. Did you read the JGit patch I posted? I think its > implementation is more elegant and easier to follow. Oh, and its > BSD licensed... so easy for a GPLv2 project to borrow. ;-) Well, actually, I don't find the JGit implementation easier to follow at all. Of course I wrote the C version while you wrote the Java version. Maybe without our respective bias then things are just fine in both cases. > > void fixup_pack_header_footer(int pack_fd, > ... > > + if (partial_pack_sha1 && !partial_pack_offset) { > > + partial_pack_offset = lseek(pack_fd, 0, SEEK_CUR); > > + if (partial_pack_offset == (off_t)-1) > > Eh? > > I find this to be nonsense. If the caller gave us a SHA-1 but > wants us to do fixup then they have increased the size of the pack. > Which means they must pass us the original length. Computing it > here is relying on the caller leaving the file pointer positioned > at the old end. Who the heck does that after fixing a thin pack? This is not for thin packs but for split packs in repack-objects. And yeah, the caller has the offset information already in that case too, so this could be removed. It just felt more generic that way originally. 
> > buf = xmalloc(buf_sz); > > for (;;) { > > - ssize_t n = xread(pack_fd, buf, buf_sz); > > + ssize_t m, n; > > + m = (partial_pack_sha1 && partial_pack_offset < buf_sz) ? > > + partial_pack_offset : buf_sz; > > + n = xread(pack_fd, buf, m); > > Why not read the full buf_sz per round and then use only part of > the buffer in SHA1_Update(&old_sha1_ctx)? It took me a while to > figure out what the heck you were trying to measure here. It also > would be a few less CPU instructions if you always just read buf_sz > and leave the min test to after the "if (!partial_pack_sha1)" below. I doubt it. That would only move the offset test from before the read to after it, and actually complicate things for restarting the checksumming after the special offset was crossed. And we're talking about a comparison with a conditional move instructions only. And another thing I had in store (but for which you _again_ beat me to :-) ) is to realign data reads onto filesystem blocks. > > + if (!partial_pack_sha1) > > + continue; > > + > > + SHA1_Update(&old_sha1_ctx, buf, n); > > + partial_pack_offset -= n; > > + if (partial_pack_offset == 0) { > > + unsigned char sha1[20]; > > + SHA1_Final(sha1, &old_sha1_ctx); > > + if (hashcmp(sha1, partial_pack_sha1) != 0) > > + die("Unexpected checksum for %s " > > + "(disk corruption?)", pack_name); > > + /* > > + * Now let's compute the SHA1 of the remainder of the > > + * pack, which also means making partial_pack_offset > > + * big enough not to matter anymore. > > + */ > > + SHA1_Init(&old_sha1_ctx); > > + partial_pack_offset = ~partial_pack_offset; > > + partial_pack_offset -= MSB(partial_pack_offset, 1); > > That's freaking brilliant. And something I missed in JGit. > > The way its implemented is downright difficult to follow though. > I'll admit it took me a good 10 minutes to understand what you were > doing here, and why. Again maybe that's just because you're just too biased by your own implementation. 
I don't consider this code particularly obscure. Nicolas ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [PATCH 1/3] improve reliability of fixup_pack_header_footer() 2008-08-29 13:08 ` Nicolas Pitre @ 2008-08-29 14:30 ` Shawn O. Pearce 2008-08-29 20:07 ` [PATCH 0/5] pack header rewriting improvements Nicolas Pitre 2008-08-29 20:14 ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Nicolas Pitre 0 siblings, 2 replies; 32+ messages in thread From: Shawn O. Pearce @ 2008-08-29 14:30 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git Nicolas Pitre <nico@cam.org> wrote: > On Thu, 28 Aug 2008, Shawn O. Pearce wrote: > > > > I found your implementation in fixup_pack_header to be very > > contorted. Did you read the JGit patch I posted? > > Well, actually, I don't find the JGit implementation easier to follow at > all. Of course I wrote the C version while you wrote the Java version. > Maybe without our respective bias then things are just fine in both > cases. I probably should have gone and edited my earlier comments after I reached the end of the patch and the light dawned about why you are summing the tail. Most of what I found to be complex in your code was just to handle that boundary condition at partial_pack_offset and restart the checksum for the tail. But I honestly cannot see an easier way to write that, so what you have is just fine. > > > void fixup_pack_header_footer(int pack_fd, > > ... > > > + if (partial_pack_sha1 && !partial_pack_offset) { > > > + partial_pack_offset = lseek(pack_fd, 0, SEEK_CUR); > > > + if (partial_pack_offset == (off_t)-1) > > > > Eh? > > > > I find this to be nonsense. > > This is not for thin packs but for split packs in repack-objects. And > yeah, the caller has the offset information already in that case too, so > this could be removed. It just felt more generic that way originally. Oh, yea, that makes sense. It still seems like playing with fire. I'd rather the caller pass in the proper offset than rely on it being the current position of the fd. 
Especially if the caller does actually have it available. If you change anything, I'd like to see this lseek(SEEK_CUR) go away. > And another thing I had in store (but for which you _again_ beat me to :-) ) > is to realign data reads onto filesystem blocks. That _really_ made the JGit code ugly. But I think it's worth it. I also want to try and buffer the whole object appending we do during fixThinPack(), as right now we write the object header in one write and then compressed data bursts in the others. Moving it to at least write a full 4k at a time should remove about 2 write calls per object. > > That's freaking brilliant. And something I missed in JGit. > > The way it's implemented is downright difficult to follow though. > > I'll admit it took me a good 10 minutes to understand what you were > > doing here, and why. > > Again maybe that's just because you're just too biased by your own > implementation. I don't consider this code particularly obscure. My own code in JGit got "obscure" when I added this check too. I take back the remark above. -- Shawn. ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 0/5] pack header rewriting improvements 2008-08-29 14:30 ` Shawn O. Pearce @ 2008-08-29 20:07 ` Nicolas Pitre 2008-08-29 20:07 ` [PATCH 1/5] pack-objects: improve returned information from write_one() Nicolas Pitre 2008-08-29 20:14 ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Nicolas Pitre 1 sibling, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 20:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: git This replaces my earlier 3-patch series called "don't let disk corruptions escape pack SHA1 checksum" with this set. Same purpose but better! Once again I'm leaving fast-import out. I'm sure Shawn will figure it out much faster than I could. Nicolas ^ permalink raw reply [flat|nested] 32+ messages in thread
* [PATCH 1/5] pack-objects: improve returned information from write_one() 2008-08-29 20:07 ` [PATCH 0/5] pack header rewriting improvements Nicolas Pitre @ 2008-08-29 20:07 ` Nicolas Pitre 2008-08-29 20:07 ` [PATCH 2/5] improve reliability of fixup_pack_header_footer() Nicolas Pitre 0 siblings, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 20:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: git This function returns 0 when the current object couldn't be written due to the pack size limit, otherwise the current offset in the pack. There is a problem with this approach however, since current object could be a delta and its delta base might just have been written in the same write_one() call, but those successfully written objects are not accounted in the offset variable tracked by the caller. Currently this is not an issue but a subsequent patch will need this. Signed-off-by: Nicolas Pitre <nico@cam.org> --- builtin-pack-objects.c | 29 ++++++++++++----------------- 1 files changed, 12 insertions(+), 17 deletions(-) diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index ef3befe..a615fcc 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -410,25 +410,22 @@ static unsigned long write_object(struct sha1file *f, return hdrlen + datalen; } -static off_t write_one(struct sha1file *f, +static int write_one(struct sha1file *f, struct object_entry *e, - off_t offset) + off_t *offset) { unsigned long size; /* offset is non zero if object is written already. */ if (e->idx.offset || e->preferred_base) - return offset; + return 1; /* if we are deltified, write out base object first. 
*/ - if (e->delta) { - offset = write_one(f, e->delta, offset); - if (!offset) - return 0; - } + if (e->delta && !write_one(f, e->delta, offset)) + return 0; - e->idx.offset = offset; - size = write_object(f, e, offset); + e->idx.offset = *offset; + size = write_object(f, e, *offset); if (!size) { e->idx.offset = 0; return 0; @@ -436,9 +433,10 @@ static off_t write_one(struct sha1file *f, written_list[nr_written++] = &e->idx; /* make sure off_t is sufficiently large not to wrap */ - if (offset > offset + size) + if (*offset > *offset + size) die("pack too large for current definition of off_t"); - return offset + size; + *offset += size; + return 1; } /* forward declaration for write_pack_file */ @@ -448,7 +446,7 @@ static void write_pack_file(void) { uint32_t i = 0, j; struct sha1file *f; - off_t offset, offset_one, last_obj_offset = 0; + off_t offset; struct pack_header hdr; uint32_t nr_remaining = nr_result; time_t last_mtime = 0; @@ -480,11 +478,8 @@ static void write_pack_file(void) offset = sizeof(hdr); nr_written = 0; for (; i < nr_objects; i++) { - last_obj_offset = offset; - offset_one = write_one(f, objects + i, offset); - if (!offset_one) + if (!write_one(f, objects + i, &offset)) break; - offset = offset_one; display_progress(progress_state, written); } -- 1.6.0.1.212.g35f9f ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 2/5] improve reliability of fixup_pack_header_footer() 2008-08-29 20:07 ` [PATCH 1/5] pack-objects: improve returned information from write_one() Nicolas Pitre @ 2008-08-29 20:07 ` Nicolas Pitre 2008-08-29 20:08 ` [PATCH 3/5] pack-objects: use fixup_pack_header_footer()'s validation mode Nicolas Pitre 0 siblings, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 20:07 UTC (permalink / raw) To: Junio C Hamano; +Cc: git Currently, this function has the potential to read corrupted pack data from disk and give it a valid SHA1 checksum. Let's add the ability to validate SHA1 checksum of existing data along the way, including before and after any arbitrary point in the pack. Signed-off-by: Nicolas Pitre <nico@cam.org> --- builtin-pack-objects.c | 3 +- fast-import.c | 3 +- index-pack.c | 3 +- pack-write.c | 71 +++++++++++++++++++++++++++++++++++++++-------- pack.h | 2 +- 5 files changed, 66 insertions(+), 16 deletions(-) diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index a615fcc..79417cb 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -493,7 +493,8 @@ static void write_pack_file(void) sha1close(f, sha1, CSUM_FSYNC); } else { int fd = sha1close(f, NULL, 0); - fixup_pack_header_footer(fd, sha1, pack_tmp_name, nr_written); + fixup_pack_header_footer(fd, sha1, pack_tmp_name, + nr_written, NULL, 0); close(fd); } diff --git a/fast-import.c b/fast-import.c index 7089e6f..d85b3a5 100644 --- a/fast-import.c +++ b/fast-import.c @@ -951,7 +951,8 @@ static void end_packfile(void) close_pack_windows(pack_data); fixup_pack_header_footer(pack_data->pack_fd, pack_data->sha1, - pack_data->pack_name, object_count); + pack_data->pack_name, object_count, + NULL, 0); close(pack_data->pack_fd); idx_name = keep_pack(create_index()); diff --git a/index-pack.c b/index-pack.c index 728af7d..411b80d 100644 --- a/index-pack.c +++ b/index-pack.c @@ -982,7 +982,8 @@ int main(int argc, char **argv) nr_objects - nr_objects_initial); 
stop_progress_msg(&progress, msg); fixup_pack_header_footer(output_fd, sha1, - curr_pack, nr_objects); + curr_pack, nr_objects, + NULL, 0); } if (nr_deltas != nr_resolved_deltas) die("pack has %d unresolved deltas", diff --git a/pack-write.c b/pack-write.c index ddcfd37..f9776c5 100644 --- a/pack-write.c +++ b/pack-write.c @@ -144,41 +144,88 @@ char *write_idx_file(char *index_name, struct pack_idx_entry **objects, return index_name; } +/* + * Update pack header with object_count and compute new SHA1 for pack data + * associated to pack_fd, and write that SHA1 at the end. That new SHA1 + * is also returned in new_pack_sha1. + * + * If partial_pack_sha1 is non null, then the SHA1 of the existing pack + * (without the header update) is computed and validated against the + * one provided in partial_pack_sha1. The validation is performed at + * partial_pack_offset bytes in the pack file. The SHA1 of the remaining + * data (i.e. from partial_pack_offset to the end) is then computed and + * returned in partial_pack_sha1. + * + * Note that new_pack_sha1 is updated last, so both new_pack_sha1 and + * partial_pack_sha1 can refer to the same buffer if the caller is not + * interested in the resulting SHA1 of pack data above partial_pack_offset. 
+ */ void fixup_pack_header_footer(int pack_fd, - unsigned char *pack_file_sha1, + unsigned char *new_pack_sha1, const char *pack_name, - uint32_t object_count) + uint32_t object_count, + unsigned char *partial_pack_sha1, + off_t partial_pack_offset) { static const int buf_sz = 128 * 1024; - SHA_CTX c; + SHA_CTX old_sha1_ctx, new_sha1_ctx; struct pack_header hdr; char *buf; + SHA1_Init(&old_sha1_ctx); + SHA1_Init(&new_sha1_ctx); + if (lseek(pack_fd, 0, SEEK_SET) != 0) - die("Failed seeking to start: %s", strerror(errno)); + die("Failed seeking to start of %s: %s", pack_name, strerror(errno)); if (read_in_full(pack_fd, &hdr, sizeof(hdr)) != sizeof(hdr)) die("Unable to reread header of %s: %s", pack_name, strerror(errno)); if (lseek(pack_fd, 0, SEEK_SET) != 0) - die("Failed seeking to start: %s", strerror(errno)); + die("Failed seeking to start of %s: %s", pack_name, strerror(errno)); + SHA1_Update(&old_sha1_ctx, &hdr, sizeof(hdr)); hdr.hdr_entries = htonl(object_count); + SHA1_Update(&new_sha1_ctx, &hdr, sizeof(hdr)); write_or_die(pack_fd, &hdr, sizeof(hdr)); - - SHA1_Init(&c); - SHA1_Update(&c, &hdr, sizeof(hdr)); + partial_pack_offset -= sizeof(hdr); buf = xmalloc(buf_sz); for (;;) { - ssize_t n = xread(pack_fd, buf, buf_sz); + ssize_t m, n; + m = (partial_pack_sha1 && partial_pack_offset < buf_sz) ? 
+ partial_pack_offset : buf_sz; + n = xread(pack_fd, buf, m); if (!n) break; if (n < 0) die("Failed to checksum %s: %s", pack_name, strerror(errno)); - SHA1_Update(&c, buf, n); + SHA1_Update(&new_sha1_ctx, buf, n); + + if (!partial_pack_sha1) + continue; + + SHA1_Update(&old_sha1_ctx, buf, n); + partial_pack_offset -= n; + if (partial_pack_offset == 0) { + unsigned char sha1[20]; + SHA1_Final(sha1, &old_sha1_ctx); + if (hashcmp(sha1, partial_pack_sha1) != 0) + die("Unexpected checksum for %s " + "(disk corruption?)", pack_name); + /* + * Now let's compute the SHA1 of the remainder of the + * pack, which also means making partial_pack_offset + * big enough not to matter anymore. + */ + SHA1_Init(&old_sha1_ctx); + partial_pack_offset = ~partial_pack_offset; + partial_pack_offset -= MSB(partial_pack_offset, 1); + } } free(buf); - SHA1_Final(pack_file_sha1, &c); - write_or_die(pack_fd, pack_file_sha1, 20); + if (partial_pack_sha1) + SHA1_Final(partial_pack_sha1, &old_sha1_ctx); + SHA1_Final(new_pack_sha1, &new_sha1_ctx); + write_or_die(pack_fd, new_pack_sha1, 20); fsync_or_die(pack_fd, pack_name); } diff --git a/pack.h b/pack.h index 76e6aa2..a883334 100644 --- a/pack.h +++ b/pack.h @@ -58,7 +58,7 @@ struct pack_idx_entry { extern char *write_idx_file(char *index_name, struct pack_idx_entry **objects, int nr_objects, unsigned char *sha1); extern int check_pack_crc(struct packed_git *p, struct pack_window **w_curs, off_t offset, off_t len, unsigned int nr); extern int verify_pack(struct packed_git *); -extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t); +extern void fixup_pack_header_footer(int, unsigned char *, const char *, uint32_t, unsigned char *, off_t); extern char *index_pack_lockfile(int fd); #define PH_ERROR_EOF (-1) -- 1.6.0.1.212.g35f9f ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 3/5] pack-objects: use fixup_pack_header_footer()'s validation mode 2008-08-29 20:07 ` [PATCH 2/5] improve reliability of fixup_pack_header_footer() Nicolas Pitre @ 2008-08-29 20:08 ` Nicolas Pitre 2008-08-29 20:08 ` [PATCH 4/5] index-pack: " Nicolas Pitre 0 siblings, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 20:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: git When limiting the pack size, a new header has to be written to the pack and a new SHA1 computed. Make sure that the SHA1 of what is being read back matches the SHA1 of what was written. Signed-off-by: Nicolas Pitre <nico@cam.org> --- builtin-pack-objects.c | 4 ++-- csum-file.c | 6 +++--- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index 79417cb..ba2cf00 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -492,9 +492,9 @@ static void write_pack_file(void) } else if (nr_written == nr_remaining) { sha1close(f, sha1, CSUM_FSYNC); } else { - int fd = sha1close(f, NULL, 0); + int fd = sha1close(f, sha1, 0); fixup_pack_header_footer(fd, sha1, pack_tmp_name, - nr_written, NULL, 0); + nr_written, sha1, offset); close(fd); } diff --git a/csum-file.c b/csum-file.c index ace64f1..2838954 100644 --- a/csum-file.c +++ b/csum-file.c @@ -42,11 +42,11 @@ int sha1close(struct sha1file *f, unsigned char *result, unsigned int flags) sha1flush(f, offset); f->offset = 0; } + SHA1_Final(f->buffer, &f->ctx); + if (result) + hashcpy(result, f->buffer); if (flags & (CSUM_CLOSE | CSUM_FSYNC)) { /* write checksum and close fd */ - SHA1_Final(f->buffer, &f->ctx); - if (result) - hashcpy(result, f->buffer); sha1flush(f, 20); if (flags & CSUM_FSYNC) fsync_or_die(f->fd, f->name); -- 1.6.0.1.212.g35f9f ^ permalink raw reply related [flat|nested] 32+ messages in thread
* [PATCH 4/5] index-pack: use fixup_pack_header_footer()'s validation mode 2008-08-29 20:08 ` [PATCH 3/5] pack-objects: use fixup_pack_header_footer()'s validation mode Nicolas Pitre @ 2008-08-29 20:08 ` Nicolas Pitre 2008-08-29 20:08 ` [PATCH 5/5] fixup_pack_header_footer(): use nicely aligned buffer sizes Nicolas Pitre 0 siblings, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 20:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: git

When completing a thin pack, a new header has to be written to the pack and a new SHA1 computed. Make sure that the SHA1 of what is being read back matches the SHA1 of what was written, for both the original pack and the appended objects. To do so, a couple of write_or_die() calls were converted to sha1write(), which has the advantage of providing some buffering as well as already handling the SHA1 and CRC32 checksums.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---
 index-pack.c |   43 +++++++++++++++++++++++++------------------
 1 files changed, 25 insertions(+), 18 deletions(-)

diff --git a/index-pack.c b/index-pack.c
index 411b80d..a6e91fe 100644
--- a/index-pack.c
+++ b/index-pack.c
@@ -654,7 +654,7 @@ static void parse_pack_objects(unsigned char *sha1)
 	}
 }

-static int write_compressed(int fd, void *in, unsigned int size, uint32_t *obj_crc)
+static int write_compressed(struct sha1file *f, void *in, unsigned int size)
 {
 	z_stream stream;
 	unsigned long maxsize;
@@ -674,13 +674,12 @@ static int write_compressed(int fd, void *in, unsigned int size, uint32_t *obj_c
 	deflateEnd(&stream);

 	size = stream.total_out;
-	write_or_die(fd, out, size);
-	*obj_crc = crc32(*obj_crc, out, size);
+	sha1write(f, out, size);
 	free(out);
 	return size;
 }

-static struct object_entry *append_obj_to_pack(
+static struct object_entry *append_obj_to_pack(struct sha1file *f,
 				  const unsigned char *sha1, void *buf,
 				  unsigned long size, enum object_type type)
 {
@@ -696,15 +695,15 @@ static struct object_entry *append_obj_to_pack(
 		s >>= 7;
 	}
 	header[n++] = c;
-	write_or_die(output_fd, header, n);
-	obj[0].idx.crc32 = crc32(0, Z_NULL, 0);
-	obj[0].idx.crc32 = crc32(obj[0].idx.crc32, header, n);
+	crc32_begin(f);
+	sha1write(f, header, n);
 	obj[0].size = size;
 	obj[0].hdr_size = n;
 	obj[0].type = type;
 	obj[0].real_type = type;
 	obj[1].idx.offset = obj[0].idx.offset + n;
-	obj[1].idx.offset += write_compressed(output_fd, buf, size, &obj[0].idx.crc32);
+	obj[1].idx.offset += write_compressed(f, buf, size);
+	obj[0].idx.crc32 = crc32_end(f);
 	hashcpy(obj->idx.sha1, sha1);
 	return obj;
 }
@@ -716,7 +715,7 @@ static int delta_pos_compare(const void *_a, const void *_b)
 	return a->obj_no - b->obj_no;
 }

-static void fix_unresolved_deltas(int nr_unresolved)
+static void fix_unresolved_deltas(struct sha1file *f, int nr_unresolved)
 {
 	struct delta_entry **sorted_by_pos;
 	int i, n = 0;
@@ -754,8 +753,8 @@ static void fix_unresolved_deltas(int nr_unresolved)
 		if (check_sha1_signature(d->base.sha1, base_obj.data,
 				base_obj.size, typename(type)))
 			die("local object %s is corrupt", sha1_to_hex(d->base.sha1));
-		base_obj.obj = append_obj_to_pack(d->base.sha1, base_obj.data,
-			base_obj.size, type);
+		base_obj.obj = append_obj_to_pack(f, d->base.sha1,
+			base_obj.data, base_obj.size, type);
 		link_base_data(NULL, &base_obj);

 		find_delta_children(&d->base, &first, &last);
@@ -875,7 +874,7 @@ int main(int argc, char **argv)
 	const char *keep_name = NULL, *keep_msg = NULL;
 	char *index_name_buf = NULL, *keep_name_buf = NULL;
 	struct pack_idx_entry **idx_objects;
-	unsigned char sha1[20];
+	unsigned char pack_sha1[20];
 	int nongit = 0;

 	setup_git_directory_gently(&nongit);
@@ -962,13 +961,15 @@ int main(int argc, char **argv)
 	parse_pack_header();
 	objects = xmalloc((nr_objects + 1) * sizeof(struct object_entry));
 	deltas = xmalloc(nr_objects * sizeof(struct delta_entry));
-	parse_pack_objects(sha1);
+	parse_pack_objects(pack_sha1);

 	if (nr_deltas == nr_resolved_deltas) {
 		stop_progress(&progress);
 		/* Flush remaining pack final 20-byte SHA1. */
 		flush();
 	} else {
 		if (fix_thin_pack) {
+			struct sha1file *f;
+			unsigned char read_sha1[20], tail_sha1[20];
 			char msg[48];
 			int nr_unresolved = nr_deltas - nr_resolved_deltas;
 			int nr_objects_initial = nr_objects;
@@ -977,13 +978,19 @@ int main(int argc, char **argv)
 			objects = xrealloc(objects,
 					   (nr_objects + nr_unresolved + 1)
 					   * sizeof(*objects));
-			fix_unresolved_deltas(nr_unresolved);
+			f = sha1fd(output_fd, curr_pack);
+			fix_unresolved_deltas(f, nr_unresolved);
 			sprintf(msg, "completed with %d local objects",
 				nr_objects - nr_objects_initial);
 			stop_progress_msg(&progress, msg);
-			fixup_pack_header_footer(output_fd, sha1,
+			sha1close(f, tail_sha1, 0);
+			hashcpy(read_sha1, pack_sha1);
+			fixup_pack_header_footer(output_fd, pack_sha1,
 						 curr_pack, nr_objects,
-						 NULL, 0);
+						 read_sha1, consumed_bytes-20);
+			if (hashcmp(read_sha1, tail_sha1) != 0)
+				die("Unexpected tail checksum for %s "
+				    "(disk corruption?)", curr_pack);
 		}
 		if (nr_deltas != nr_resolved_deltas)
 			die("pack has %d unresolved deltas",
@@ -996,13 +1003,13 @@ int main(int argc, char **argv)
 	idx_objects = xmalloc((nr_objects) * sizeof(struct pack_idx_entry *));
 	for (i = 0; i < nr_objects; i++)
 		idx_objects[i] = &objects[i].idx;
-	curr_index = write_idx_file(index_name, idx_objects, nr_objects, sha1);
+	curr_index = write_idx_file(index_name, idx_objects, nr_objects, pack_sha1);
 	free(idx_objects);

 	final(pack_name, curr_pack, index_name, curr_index,
 		keep_name, keep_msg,
-		sha1);
+		pack_sha1);
 	free(objects);
 	free(index_name_buf);
 	free(keep_name_buf);
--
1.6.0.1.212.g35f9f

^ permalink raw reply related	[flat|nested] 32+ messages in thread
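The header[] bytes that append_obj_to_pack() emits before the deflated data follow git's standard pack object-header encoding: the 3-bit type and the low 4 bits of the size go in the first byte, then each following byte carries 7 more size bits, with the high bit set on every byte that has a successor. A standalone sketch of that encoder (function name ours):

```python
def encode_obj_header(type_num, size):
    """Encode a pack object header, mirroring the loop in
    append_obj_to_pack(): first byte = type in bits 4-6 plus the low
    4 size bits; continuation bytes = 7 size bits each, MSB set while
    more bytes follow."""
    c = (type_num << 4) | (size & 15)
    size >>= 4
    out = bytearray()
    while size:
        out.append(c | 0x80)  # more size bits follow
        c = size & 0x7f
        size >>= 7
    out.append(c)
    return bytes(out)
```

Decoding simply reverses this: accumulate 4 bits from the first byte, then 7 bits per byte, stopping at the first byte whose high bit is clear.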
* [PATCH 5/5] fixup_pack_header_footer(): use nicely aligned buffer sizes 2008-08-29 20:08 ` [PATCH 4/5] index-pack: " Nicolas Pitre @ 2008-08-29 20:08 ` Nicolas Pitre 2008-08-31 7:10 ` Junio C Hamano 0 siblings, 1 reply; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 20:08 UTC (permalink / raw) To: Junio C Hamano; +Cc: git

It should be more efficient to use nicely aligned buffer sizes, either for filesystem operations or SHA1 checksums. Also, using a relatively small nominal size might allow for the data to remain in L1 cache between both SHA1_Update() calls.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---
 pack-write.c |   11 ++++++++---
 1 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/pack-write.c b/pack-write.c
index f9776c5..939ed56 100644
--- a/pack-write.c
+++ b/pack-write.c
@@ -167,7 +167,7 @@ void fixup_pack_header_footer(int pack_fd,
 			      unsigned char *partial_pack_sha1,
 			      off_t partial_pack_offset)
 {
-	static const int buf_sz = 128 * 1024;
+	int aligned_sz, buf_sz = 8 * 1024;
 	SHA_CTX old_sha1_ctx, new_sha1_ctx;
 	struct pack_header hdr;
 	char *buf;
@@ -188,10 +188,11 @@ void fixup_pack_header_footer(int pack_fd,
 	partial_pack_offset -= sizeof(hdr);

 	buf = xmalloc(buf_sz);
+	aligned_sz = buf_sz - sizeof(hdr);
 	for (;;) {
 		ssize_t m, n;
-		m = (partial_pack_sha1 && partial_pack_offset < buf_sz) ?
-			partial_pack_offset : buf_sz;
+		m = (partial_pack_sha1 && partial_pack_offset < aligned_sz) ?
+			partial_pack_offset : aligned_sz;
 		n = xread(pack_fd, buf, m);
 		if (!n)
 			break;
@@ -199,6 +200,10 @@ void fixup_pack_header_footer(int pack_fd,
 			die("Failed to checksum %s: %s", pack_name,
 			    strerror(errno));
 		SHA1_Update(&new_sha1_ctx, buf, n);
+		aligned_sz -= n;
+		if (!aligned_sz)
+			aligned_sz = buf_sz;
+
 		if (!partial_pack_sha1)
 			continue;
--
1.6.0.1.212.g35f9f

^ permalink raw reply related	[flat|nested] 32+ messages in thread
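The alignment trick is easiest to see as a read schedule: the first read is shortened by the 12-byte pack header that was already consumed, so every later read begins on an 8 KiB boundary. A small model (names ours; the partial-pack boundary, which briefly breaks alignment until aligned_sz resets, is ignored here):

```python
def read_schedule(file_size, hdr_size=12, buf_sz=8192):
    """Return (offset, length) pairs for the checksum loop's reads,
    assuming the header was already consumed out of band."""
    reads = []
    pos = hdr_size
    aligned_sz = buf_sz - hdr_size   # first read stops at the first 8 KiB boundary
    while pos < file_size:
        n = min(aligned_sz, file_size - pos)
        reads.append((pos, n))
        pos += n
        aligned_sz = buf_sz          # subsequent reads are fully aligned
    return reads
```

For a 20000-byte file this yields a short first read of 8180 bytes followed by reads starting at offsets 8192, 16384, and so on, which is friendlier to the page cache than 128 KiB reads that straddle block boundaries by 12 bytes.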
* Re: [PATCH 5/5] fixup_pack_header_footer(): use nicely aligned buffer sizes 2008-08-29 20:08 ` [PATCH 5/5] fixup_pack_header_footer(): use nicely aligned buffer sizes Nicolas Pitre @ 2008-08-31 7:10 ` Junio C Hamano 0 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2008-08-31 7:10 UTC (permalink / raw) To: Nicolas Pitre; +Cc: git

Thanks -- it was a very good read. I wish every series I received were as clearly done as this one. Queued for 'next'.

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [PATCH 1/3] improve reliability of fixup_pack_header_footer() 2008-08-29 14:30 ` Shawn O. Pearce 2008-08-29 20:07 ` [PATCH 0/5] pack header rewriting improvements Nicolas Pitre @ 2008-08-29 20:14 ` Nicolas Pitre 1 sibling, 0 replies; 32+ messages in thread From: Nicolas Pitre @ 2008-08-29 20:14 UTC (permalink / raw) To: Shawn O. Pearce; +Cc: Junio C Hamano, git

On Fri, 29 Aug 2008, Shawn O. Pearce wrote:

> Oh, yea, that makes sense. It still seems like playing with fire.
>
> I'd rather the caller pass in the proper offset than rely on it
> being the current position of the fd. Especially if the caller
> does actually have it available.
>
> If you change anything, I'd like to see this lseek(SEEK_CUR) go away.

Done. I struggled a bit since it simply didn't work initially -- see the first patch in the updated series.

> > And another thing I had in store (but for which you _again_ beat me to :-) )
> > is to realign data reads onto filesystem blocks.
>
> That _really_ made the JGit code ugly. But I think it's worth it.

See my version. I think it is reasonably clear.

> I also want to try and buffer the whole object appending we do
> during fixThinPack(), as right now we write the object header in
> one write and then compressed data bursts in the others. Moving it
> to at least write a full 4k at a time should remove about 2 write
> calls per object.

Yep. In the C git case, moving to sha1write() added that buffering for free.

Nicolas

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum 2008-08-29 2:05 ` [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum Nicolas Pitre 2008-08-29 2:07 ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Nicolas Pitre @ 2008-08-29 4:55 ` Shawn O. Pearce 1 sibling, 0 replies; 32+ messages in thread From: Shawn O. Pearce @ 2008-08-29 4:55 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, git

Nicolas Pitre <nico@cam.org> wrote:
>
> So here is 3 patches for this in C git (Shawn already did it for jgit).
>
> There are 3 spots where this needs to be plugged: in pack-objects,
> index-pack and fast-import. I did the first two.
>
> I don't think this is that pertinent in fast-import because:
>
> - we want fast-import to be fast
>
> - after fast-import is done, a full repack is typically done which
>   would revalidate everything (maybe fast-import should use index v2
>   with its per object CRC32 though)
>
> But maybe I'm wrong about that. In any case, I am leaving the
> fast-import to someone more knowledgeable about its code than I am.

fast-import is the backend to a number of incremental import tools. They stream data in from foreign VCS systems (notably Perforce) and write into a live repository. That repository may not get repacked for days/weeks. We should treat it just like index-pack. So I think we should plug it.

The cost to keep a running SHA-1 for what we have written is pretty low. I'd rather trade off a minute or so on an hour long import for data safety than have a chance for data corruption going unnoticed by the end user.

And now that index v2 is available, yea, fast-import should write its index in that format so that a later repack can safely delta reuse if it (by rare chance) decides the current packed form is the best representation. This is especially true with "git gc --auto" triggering a repack every so often due to the incremental import packs I just mentioned above.
I've just been too busy with other things to add index v2 support. -- Shawn. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 23:59 ` Linus Torvalds 2008-08-26 20:43 ` Jason McMullan @ 2008-08-26 20:55 ` J. Bruce Fields 2008-08-27 20:14 ` Junio C Hamano 2 siblings, 0 replies; 32+ messages in thread From: J. Bruce Fields @ 2008-08-26 20:55 UTC (permalink / raw) To: Linus Torvalds; +Cc: git

On Mon, Aug 25, 2008 at 04:59:26PM -0700, Linus Torvalds wrote:

> and the difference literally seems to be just a block of less than 160
> bytes. But it's certainly not a single-bit error, and it's also not a disk
> block or anything like that.
>
> Looking at the first line that differs, they are:
>
> -0003460 a0 14 8d 7d cd 12 75 81 2b c4 d9 71 27 62 ae b5
> +0003460 a0 14 8d 7d 22 00 00 00 6b 57 fe ff 55 57 fe ff
>
> so the differences start at offset 03464. The last line is:
>
> -0003700 47 f7 9b 3d b6 a5 99 5a cf 49 ef e0 3f 54 08 f4
> +0003700 ae 41 49 50 aa 40 fe ff cf 49 ef e0 3f 54 08 f4
>
> so they re-join at 03708 (octal offsets because of 'od' behavior). But in
> between there seems to be nothing in common.
>
> So the corrupt data looks like
>
>         22 00 00 00 6b 57 fe ff 55 57 fe ff
> 0003500 aa 57 fe ff b0 00 0e 00 d9 66 22 00 00 00 00 e0
> 0003520 19 9f fe ff ff ff d1 43 fe ff d1 43 fe ff 00 c0
> 0003540 bf fe f0 33 69 bf b2 33 69 bf 00 00 00 10 01 fe
> 0003560 bf fe 66 01 00 00 01 1f 37 17 2e 59 fe ff 00 00
> 0003600 00 00 c2 41 fe ff 00 00 01 1f 37 17 44 42 fe ff
> 0003620 00 01 fe ff 01 1f 37 07 d4 42 fe ff 02 01 a8 44
> 0003640 00 f8 bf fe 6b 57 fe ff 55 57 fe ff 97 57 fe ff
> 0003660 00 f8 bf fe 04 00 fe ff ab 45 fe ff 02 00 fe ff
> 0003700 ae 41 49 50 aa 40 fe ff
>
> and I don't see what the pattern is, except that there's a lot of "fe ff"
> in there.

and all aligned on 2-byte boundaries? I don't see anything suggestive there either.

> 148 bytes, no obvious source. It definitely does _not_ look like
> compressed input (it's too regular to be something zlib spits out), nor is
> it text. Nor is it valid x86 assembly, so it's not some code-segment data
> either.
>
> And as far as I can tell, that's the _only_ corruption in the whole file,
> but I didn't really double-check.
>
> Does anybody see a pattern?

Got me. Thanks for taking a look at it.

--b.

^ permalink raw reply	[flat|nested] 32+ messages in thread
* Re: "failed to read delta base object at..." 2008-08-25 23:59 ` Linus Torvalds 2008-08-26 20:43 ` Jason McMullan 2008-08-26 20:55 ` "failed to read delta base object at..." J. Bruce Fields @ 2008-08-27 20:14 ` Junio C Hamano 2 siblings, 0 replies; 32+ messages in thread From: Junio C Hamano @ 2008-08-27 20:14 UTC (permalink / raw) To: Linus Torvalds; +Cc: J. Bruce Fields, git Linus Torvalds <torvalds@linux-foundation.org> writes: > And as far as I can tell, that's the _only_ corruption in the whole file, > but I didn't really double-check. Just FYI, replacing these 3619 bytes in JBF's packfile with the good object with dd conv=notrunc bs=1 seek=3025976 count=3619 makes the fixed pack pass verify-pack, so this part seems to be the only corruption. ^ permalink raw reply [flat|nested] 32+ messages in thread
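The dd invocation above overwrites bytes in place without truncating the rest of the file. If dd is not at hand, the same repair step can be sketched in a few lines of Python (function name is ours; as always, experiment on a copy of the pack and re-run git verify-pack afterwards):

```python
def patch_in_place(path, offset, good_bytes):
    """Overwrite len(good_bytes) bytes at `offset`, leaving the rest of
    the file untouched -- the equivalent of
    dd conv=notrunc bs=1 seek=<offset> count=<len>."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(good_bytes)
```

Opening with "r+b" (read/write, no truncate) is what makes this safe: an "wb" open would truncate the pack to zero length before the first write.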
end of thread, other threads:[~2008-08-31 7:12 UTC | newest] Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page --
2008-08-25 16:46 "failed to read delta base object at..." J. Bruce Fields
2008-08-25 18:58 ` Nicolas Pitre
2008-08-25 21:18   ` J. Bruce Fields
2008-08-25 19:01 ` Linus Torvalds
2008-08-25 21:31   ` J. Bruce Fields
2008-08-25 21:37     ` Linus Torvalds
2008-08-25 22:13       ` J. Bruce Fields
2008-08-25 23:59         ` Linus Torvalds
2008-08-26 20:43           ` Jason McMullan
2008-08-26 21:01             ` Jason McMullan
2008-08-27 17:05             ` Linus Torvalds
2008-08-27 19:17               ` Nicolas Pitre
2008-08-27 19:48                 ` Linus Torvalds
2008-08-27 20:46                   ` Nicolas Pitre
2008-08-29  2:05                     ` [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum Nicolas Pitre
2008-08-29  2:07                       ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Nicolas Pitre
2008-08-29  2:07                         ` [PATCH 2/3] pack-objects: use fixup_pack_header_footer()'s validation mode Nicolas Pitre
2008-08-29  2:07                           ` [PATCH 3/3] index-pack: " Nicolas Pitre
2008-08-29  4:44                         ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Shawn O. Pearce
2008-08-29 13:08                           ` Nicolas Pitre
2008-08-29 14:30                             ` Shawn O. Pearce
2008-08-29 20:07                               ` [PATCH 0/5] pack header rewriting improvements Nicolas Pitre
2008-08-29 20:07                                 ` [PATCH 1/5] pack-objects: improve returned information from write_one() Nicolas Pitre
2008-08-29 20:07                                   ` [PATCH 2/5] improve reliability of fixup_pack_header_footer() Nicolas Pitre
2008-08-29 20:08                                     ` [PATCH 3/5] pack-objects: use fixup_pack_header_footer()'s validation mode Nicolas Pitre
2008-08-29 20:08                                       ` [PATCH 4/5] index-pack: " Nicolas Pitre
2008-08-29 20:08                                         ` [PATCH 5/5] fixup_pack_header_footer(): use nicely aligned buffer sizes Nicolas Pitre
2008-08-31  7:10                                           ` Junio C Hamano
2008-08-29 20:14                               ` [PATCH 1/3] improve reliability of fixup_pack_header_footer() Nicolas Pitre
2008-08-29  4:55                       ` [PATCH 0/3] don't let disk corruptions escape pack SHA1 checksum Shawn O. Pearce
2008-08-26 20:55           ` "failed to read delta base object at..." J. Bruce Fields
2008-08-27 20:14           ` Junio C Hamano