* propagating repo corruption across clone
@ 2013-03-24 18:31 Jeff King
  2013-03-24 19:01 ` Ævar Arnfjörð Bjarmason
  ` (3 more replies)
  0 siblings, 4 replies; 60+ messages in thread
From: Jeff King @ 2013-03-24 18:31 UTC (permalink / raw)
To: git

I saw this post-mortem on recent disk corruption seen on git.kde.org:

  http://jefferai.org/2013/03/24/too-perfect-a-mirror/

The interesting bit to me is that object corruption propagated across a
clone (and oddly, that --mirror made complaints about corruption go
away). I did a little testing and found some curious results (this ended
up long; skip to the bottom for my conclusions).

Here's a fairly straight-forward corruption recipe:

-- >8 --
obj_to_file() {
        echo ".git/objects/$(echo $1 | sed 's,..,&/,')"
}

# corrupt a single byte inside the object
corrupt_object() {
        fn=$(obj_to_file "$1") &&
        chmod +w "$fn" &&
        printf '\0' | dd of="$fn" bs=1 conv=notrunc seek=10
}

git init repo &&
cd repo &&
echo content >file &&
git add file &&
git commit -m one &&
corrupt_object $(git rev-parse HEAD:file)
-- 8< --

report git clone . fast-local
report git clone --no-local . no-local
report git -c transfer.unpackLimit=1 clone --no-local . index-pack
report git -c fetch.fsckObjects=1 clone --no-local . fsck

and here is how clone reacts in a few situations:

  $ git clone --bare . local-bare && echo WORKED
  Cloning into bare repository 'local-bare'...
  done.
  WORKED

We don't notice the problem during the transport phase, which is to be
expected; we're using the fast "just hardlink it" code path. So that's
OK.

  $ git clone . local-tree && echo WORKED
  Cloning into 'local-tree'...
  done.
  error: inflate: data stream error (invalid distance too far back)
  error: unable to unpack d95f3ad14dee633a758d2e331151e950dd13e4ed header
  WORKED

We _do_ see a problem during the checkout phase, but we don't propagate
a checkout failure to the exit code from clone. That is bad in general,
and should probably be fixed.
Though it would never find corruption of older objects in the history,
anyway, so checkout should not be relied on for robustness.

  $ git clone --no-local . non-local && echo WORKED
  Cloning into 'non-local'...
  remote: Counting objects: 3, done.
  remote: error: inflate: data stream error (invalid distance too far back)
  remote: error: unable to unpack d95f3ad14dee633a758d2e331151e950dd13e4ed header
  remote: error: inflate: data stream error (invalid distance too far back)
  remote: fatal: loose object d95f3ad14dee633a758d2e331151e950dd13e4ed (stored in ./objects/d9/5f3ad14dee633a758d2e331151e950dd13e4ed) is corrupt
  error: git upload-pack: git-pack-objects died with error.
  fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
  remote: aborting due to possible repository corruption on the remote side.
  fatal: early EOF
  fatal: index-pack failed

Here we detect the error. It's noticed by pack-objects on the remote
side as it tries to put the bogus object into a pack.

But what if we already have a pack that's been corrupted, and
pack-objects is just pushing out entries without doing any
recompression? Let's change our corrupt_object to:

  corrupt_object() {
          git repack -ad &&
          pack=`echo .git/objects/pack/*.pack` &&
          chmod +w "$pack" &&
          printf '\0' | dd of="$pack" bs=1 conv=notrunc seek=175
  }

and try again:

  $ git clone --no-local . non-local && echo WORKED
  Cloning into 'non-local'...
  remote: Counting objects: 3, done.
  remote: Total 3 (delta 0), reused 3 (delta 0)
  error: inflate: data stream error (invalid distance too far back)
  fatal: pack has bad object at offset 169: inflate returned -3
  fatal: index-pack failed

Great, we still notice the problem in unpack-objects on the receiving
end. But what if there's a more subtle corruption, where filesystem
corruption points the directory entry for one object at the inode of
another.
Like:

  corrupt_object() {
          corrupt=$(echo corrupted | git hash-object -w --stdin) &&
          mv -f $(obj_to_file $corrupt) $(obj_to_file $1)
  }

This is going to be more subtle, because the object in the packfile is
self-consistent but the object graph as a whole is broken.

  $ git clone --no-local . non-local && echo WORKED
  Cloning into 'non-local'...
  remote: Counting objects: 3, done.
  remote: Total 3 (delta 0), reused 0 (delta 0)
  Receiving objects: 100% (3/3), done.
  error: unable to find d95f3ad14dee633a758d2e331151e950dd13e4ed
  WORKED

Like the --local cases earlier, we notice the missing object during the
checkout phase, but do not correctly propagate the error. We do not
notice the sha1 mismatch on the sending side (which we could, if we
checked the sha1 as we were sending). We do not notice the broken object
graph during the receive process either. I would have expected
check_everything_connected to handle this, but we don't actually call it
during clone! If you do this:

  $ git init non-local && cd non-local && git fetch ..
  remote: Counting objects: 3, done.
  remote: Total 3 (delta 0), reused 3 (delta 0)
  Unpacking objects: 100% (3/3), done.
  fatal: missing blob object 'd95f3ad14dee633a758d2e331151e950dd13e4ed'
  error: .. did not send all necessary objects

we do notice. And one final check:

  $ git -c transfer.fsckobjects=1 clone --no-local . fsck
  Cloning into 'fsck'...
  remote: Counting objects: 3, done.
  remote: Total 3 (delta 0), reused 3 (delta 0)
  Receiving objects: 100% (3/3), done.
  error: unable to find d95f3ad14dee633a758d2e331151e950dd13e4ed
  fatal: object of unexpected type
  fatal: index-pack failed

Fscking the incoming objects does work, but of course it comes at a cost
in the normal case (for linux-2.6, I measured an increase in CPU time
with "index-pack --strict" from ~2.5 minutes to ~4 minutes).
And I think it is probably overkill for finding corruption; index-pack
already recognizes bit corruption inside an object, and
check_everything_connected can detect object graph problems much more
cheaply.

One thing I didn't check is bit corruption inside a packed object that
still correctly zlib inflates. check_everything_connected will end up
reading all of the commits and trees (to walk them), but not the blobs.
And I don't think that we explicitly re-sha1 every incoming object (only
if we detect a possible collision). So it may be that
transfer.fsckObjects would save us there (it also introduces new
problems if there are ignorable warnings in the objects you receive,
like zero-padded trees).

So I think at the very least we should:

  1. Make sure clone propagates errors from checkout to the final exit
     code.

  2. Teach clone to run check_everything_connected.

I don't have details on the KDE corruption, or why it wasn't detected
(if it was one of the cases I mentioned above, or a more subtle issue).

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread
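[The object-fscking safety valve discussed in the message above can be turned on by users today. A minimal sketch, using the real transfer.fsckObjects / fetch.fsckObjects config keys from the thread; the repository created here is a throwaway for illustration:]

```sh
# Sketch: enable object checking on transfer, as discussed above.
# transfer.fsckObjects covers fetch, clone, and push in one knob.
set -e
dir=$(mktemp -d) && cd "$dir"

git init -q src
git -C src -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m one

# One-shot: fsck incoming objects for this clone only (no config change):
git -c transfer.fsckObjects=true clone -q --no-local src dst

# Persistent: turn it on for all future transfers in this repository:
git -C dst config transfer.fsckObjects true
```

[As the message notes, this costs CPU on every transfer and can reject objects with ignorable historical quirks such as zero-padded trees.]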
* Re: propagating repo corruption across clone
  2013-03-24 18:31 propagating repo corruption across clone Jeff King
@ 2013-03-24 19:01 ` Ævar Arnfjörð Bjarmason
  2013-03-24 19:23   ` Jeff King
  2013-03-24 19:16 ` Ilari Liusvaara
  ` (2 subsequent siblings)
  3 siblings, 1 reply; 60+ messages in thread
From: Ævar Arnfjörð Bjarmason @ 2013-03-24 19:01 UTC (permalink / raw)
To: Jeff King; +Cc: git

On Sun, Mar 24, 2013 at 7:31 PM, Jeff King <peff@peff.net> wrote:
>
> I don't have details on the KDE corruption, or why it wasn't detected
> (if it was one of the cases I mentioned above, or a more subtle issue).

One thing worth mentioning is this part of the article:

  "Originally, mirrored clones were in fact not used, but non-mirrored
  clones on the anongits come with their own set of issues, and are more
  prone to getting stopped up by legitimate, authenticated force pushes,
  ref deletions, and so on – and if we set the refspec such that those
  are allowed through silently, we don’t gain much."

So the only reason they were even using --mirror was because they were
running into those problems with fetching.

So aside from the problems with --mirror I think we should have
something that updates your local refs to be exactly like they are on
the other end, i.e. deletes some, non-fast-forwards others etc.
(obviously behind several --force options and so on). But such an
option *wouldn't* accept corrupted objects.

That would give KDE and other parties a safe way to do exact repo
mirroring like this; it wouldn't protect them from someone maliciously
deleting all the refs in all the repos, but would prevent FS corruption
from propagating.

^ permalink raw reply	[flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone
  2013-03-24 19:01 ` Ævar Arnfjörð Bjarmason
@ 2013-03-24 19:23 ` Jeff King
  2013-03-25 13:43   ` Jeff Mitchell
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff King @ 2013-03-24 19:23 UTC (permalink / raw)
To: Ævar Arnfjörð Bjarmason; +Cc: git

On Sun, Mar 24, 2013 at 08:01:33PM +0100, Ævar Arnfjörð Bjarmason wrote:

> On Sun, Mar 24, 2013 at 7:31 PM, Jeff King <peff@peff.net> wrote:
> >
> > I don't have details on the KDE corruption, or why it wasn't detected
> > (if it was one of the cases I mentioned above, or a more subtle issue).
>
> One thing worth mentioning is this part of the article:
>
> "Originally, mirrored clones were in fact not used, but non-mirrored
> clones on the anongits come with their own set of issues, and are more
> prone to getting stopped up by legitimate, authenticated force pushes,
> ref deletions, and so on – and if we set the refspec such that those
> are allowed through silently, we don’t gain much."
>
> So the only reason they were even using --mirror was because they were
> running into those problems with fetching.

I think the --mirror thing is a red herring. It should not be changing
the transport used, and that is the part of git that is expected to
catch such corruption.

But I haven't seen exactly what the corruption is, nor exactly what
commands they used to clone. I've invited the blog author to give more
details in this thread.

> So aside from the problems with --mirror I think we should have
> something that updates your local refs to be exactly like they are on
> the other end, i.e. deletes some, non-fast-forwards others etc.
> (obviously behind several --force options and so on). But such an
> option *wouldn't* accept corrupted objects.

That _should_ be how "git fetch --prune +refs/*:refs/*" behaves (and
that refspec is set up when you use "--mirror"; we should probably have
it turn on --prune, too, but I do not think you can do so via a config
option currently).
-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone
  2013-03-24 19:23 ` Jeff King
@ 2013-03-25 13:43 ` Jeff Mitchell
  2013-03-25 14:56   ` Jeff King
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff Mitchell @ 2013-03-25 13:43 UTC (permalink / raw)
To: Jeff King; +Cc: Ævar Arnfjörð, git

On Sun, Mar 24, 2013 at 3:23 PM, Jeff King <peff@peff.net> wrote:
> On Sun, Mar 24, 2013 at 08:01:33PM +0100, Ævar Arnfjörð Bjarmason wrote:
>
>> On Sun, Mar 24, 2013 at 7:31 PM, Jeff King <peff@peff.net> wrote:
>> >
>> > I don't have details on the KDE corruption, or why it wasn't detected
>> > (if it was one of the cases I mentioned above, or a more subtle issue).
>>
>> One thing worth mentioning is this part of the article:
>>
>> "Originally, mirrored clones were in fact not used, but non-mirrored
>> clones on the anongits come with their own set of issues, and are more
>> prone to getting stopped up by legitimate, authenticated force pushes,
>> ref deletions, and so on – and if we set the refspec such that those
>> are allowed through silently, we don’t gain much."
>>
>> So the only reason they were even using --mirror was because they were
>> running into those problems with fetching.

With a normal fetch, that is. We actually *wanted* things like force
updates and ref deletions to propagate, because we have not just
Gitolite's checks but our own checks on the servers, and wanted that to
be considered the authenticated source. Besides just daily use and
preventing cruft, we wanted to ensure that such actions propagated so
that if a branch was removed because it contained personal information,
accidental commits, or a security issue (for instance), that branch was
removed on the anongits too, in a timely fashion.

> I think the --mirror thing is a red herring. It should not be changing
> the transport used, and that is the part of git that is expected to
> catch such corruption.
>
> But I haven't seen exactly what the corruption is, nor exactly what
> commands they used to clone.
> I've invited the blog author to give more
> details in this thread.

The syncing was performed via a clone with git clone --mirror (and a
git:// URL) and updates with git remote update.

So I should mention that my experiments after the fact were using local
paths, but with --no-hardlinks. If you're saying that the transport is
where corruption is supposed to be caught, then it's possible that we
shouldn't see corruption propagate on an initial mirror clone across
git://, and that something else was responsible for the trouble we saw
with the repositories that got cloned after-the-fact. But then I'd argue
that this is non-obvious. In particular, when using --no-hardlinks, I
wouldn't expect the behavior to be different with a straight path and
with file://.

Something else: apparently one of my statements prompted joeyh to think
about potential issues with backing up live git repos
(http://joeyh.name/blog/entry/difficulties_in_backing_up_live_git_repositories/).
Looking at that post made me realize that, when we were doing our
initial thinking about the system three years ago, we made an
assumption that taking a .tar.gz of a repo as it's in the process of
being written to, garbage collected, or repacked could be problematic.
This isn't a totally baseless assumption, as I once had a git repository
that suffered corruption when a sudden power outage hit while I was
updating it. (It could totally have been the filesystem, of course,
although it was a journaled file system.)

So, we decided to use Git's built-in capabilities of consistency
checking to our advantage (with, as it turns out, a flaw in our
implementation). But the question remains: are we wrong in thinking
that rsyncing or tar.gz'ing live repositories in the middle of being
pushed to/gc'd/repacked could result in a bogus backup?

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone
  2013-03-25 13:43 ` Jeff Mitchell
@ 2013-03-25 14:56 ` Jeff King
  2013-03-25 15:31   ` Duy Nguyen
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff King @ 2013-03-25 14:56 UTC (permalink / raw)
To: Jeff Mitchell; +Cc: Ævar Arnfjörð, git

On Mon, Mar 25, 2013 at 09:43:23AM -0400, Jeff Mitchell wrote:

> > But I haven't seen exactly what the corruption is, nor exactly what
> > commands they used to clone. I've invited the blog author to give more
> > details in this thread.
>
> The syncing was performed via a clone with git clone --mirror (and a
> git:// URL) and updates with git remote update.

OK. That should be resilient to corruption, then[1].

> So I should mention that my experiments after the fact were using
> local paths, but with --no-hardlinks.

Yeah, we will do a direct copy in that case, and there is nothing to
prevent corruption propagating.

> If you're saying that the transport is where corruption is supposed to
> be caught, then it's possible that we shouldn't see corruption
> propagate on an initial mirror clone across git://, and that something
> else was responsible for the trouble we saw with the repositories that
> got cloned after-the-fact.

Right. Either it was something else, or there is a bug in git's
protections (but I haven't been able to reproduce anything likely).

> But then I'd argue that this is non-obvious. In particular, when using
> --no-hardlinks, I wouldn't expect that behavior to be different with a
> straight path and with file://.

There are basically three levels of transport that can be used on a
local machine:

  1. Hard-linking (very fast, no redundancy).

  2. Byte-for-byte copy (medium speed, makes a separate copy of the
     data, but does not check the integrity of the original).

  3. Regular git transport, creating a pack (slowest, but should
     include redundancy checks).

Using --no-hardlinks turns off (1), but leaves (2) as an option.
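[The three levels above map onto clone flags roughly as follows; a sketch with a throwaway repository, using real git-clone options:]

```sh
# Sketch of the three local-transport levels described above.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q src
git -C src -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m one

git clone -q src level1-hardlink               # (1) hardlinks objects
git clone -q --no-hardlinks src level2-copy    # (2) byte-for-byte copy, unchecked
git clone -q --no-local src level3-pack        # (3) full pack transport, with checks
```

[Only (3) runs the data through the normal transport machinery, which is why corruption can ride along silently with (1) and (2).]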
I think the documentation in "git clone" could use some improvement in
that area.

> Something else: apparently one of my statements prompted joeyh to
> think about potential issues with backing up live git repos
> (http://joeyh.name/blog/entry/difficulties_in_backing_up_live_git_repositories/).
> Looking at that post made me realize that, when we were doing our
> initial thinking about the system three years ago, we made an
> assumption that taking a .tar.gz of a repo as it's in the process of
> being written to, garbage collected, or repacked could be problematic.
> This isn't a totally baseless assumption, as I once had a git
> repository that suffered corruption when a sudden power outage hit
> while I was updating it. (It could totally have been the filesystem,
> of course, although it was a journaled file system.)

Yes, if you take a snapshot of a repository with rsync or tar, it may be
in an inconsistent state. Using the git protocol should always be
robust, but if you want to do it with other tools, you need to follow a
particular order:

  1. copy the refs (refs/ and packed-refs) first

  2. copy everything else (including objects/)

That covers the case where somebody is pushing an update simultaneously
(you _may_ get extra objects in step 2 that they have not yet
referenced, but you will never end up with a case where you are
referencing objects that you did not yet transfer).

If it's possible that the repository might be repacked during your
transfer, I think the issue is a bit trickier, as there's a moment where
the new packfile is renamed into place, and then the old ones are
deleted. Depending on the timing and how your readdir() implementation
behaves with respect to new and deleted entries, it might be possible to
miss both the new one appearing and the old ones disappearing. This is
quite a tight race to catch, I suspect, but if you were to rsync
objects/pack twice in a row, that would be sufficient.
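[The refs-then-objects ordering above can be sketched like so. cp stands in for rsync/tar purely for illustration, and all the path names are invented:]

```sh
# Sketch: snapshot a live bare repo, refs first, then objects.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q work
git -C work -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m one
git clone -q --bare work src.git

# 1. refs first (packed-refs may not exist in every repo)
mkdir -p backup.git
cp -a src.git/HEAD backup.git/
cp -a src.git/refs backup.git/
if [ -e src.git/packed-refs ]; then cp -a src.git/packed-refs backup.git/; fi

# 2. then objects and the rest
cp -a src.git/objects backup.git/
cp -a src.git/config backup.git/

# Any object reachable from the copied refs is guaranteed to be present:
git --git-dir=backup.git fsck --full
```

[The point of the ordering is that extra, unreferenced objects in step 2 are harmless, while the reverse order could copy refs that point at objects you never transferred.]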
For pruning, I think you could run into the opposite situation: you grab
the refs, somebody updates them with a history rewind (or branch
deletion), then somebody prunes and objects go away. Again, the timing
on this race is quite tight and it's unlikely in practice. I'm not sure
of a simple way to eliminate it completely.

Yet another option is to simply rsync the whole thing and then "git
fsck" the result. If it's not 100% good, just re-run the rsync. This is
simple and should be robust, but is more CPU intensive (you'll end up
re-checking all of the data on each update).

> So, we decided to use Git's built-in capabilities of consistency
> checking to our advantage (with, as it turns out, a flaw in our
> implementation). But the question remains: are we wrong about thinking
> that rsyncing or tar.gz live repositories in the middle of being
> pushed to/gc'd/repacked could result in a bogus backup?

No, I think you are right. If you do the refs-then-objects ordering,
that saves you from most of it, but I do think there are still some
races that exist during repacking or pruning.

-Peff

[1] I mentioned that clone-over-git:// is resilient to corruption. I
    think that is true for bit corruption, but my tests did show that we
    are not as careful about checking graph connectivity during clone as
    we are with fetch. The circumstances in which that would matter are
    quite unlikely, though.

^ permalink raw reply	[flat|nested] 60+ messages in thread
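[The copy-then-fsck fallback described in the message above might look like this; a sketch, with cp standing in for rsync (a real mirror script would use rsync -a --delete) and all paths invented:]

```sh
# Sketch: "copy everything, fsck the result, retry until clean".
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q work
git -C work -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m one
git clone -q --bare work src.git

tries=0
until rm -rf backup.git && cp -a src.git backup.git &&
      git --git-dir=backup.git fsck --full
do
  tries=$((tries + 1))
  [ "$tries" -lt 5 ] || { echo "backup keeps failing fsck" >&2; exit 1; }
  sleep 1   # source may have been mid-repack; take a fresh copy
done
```

[As noted above, this trades CPU for simplicity: every pass re-checks all of the data, but a clean fsck means the snapshot is self-consistent.]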
* Re: propagating repo corruption across clone
  2013-03-25 14:56 ` Jeff King
@ 2013-03-25 15:31 ` Duy Nguyen
  2013-03-25 15:56   ` Jeff King
  0 siblings, 1 reply; 60+ messages in thread
From: Duy Nguyen @ 2013-03-25 15:31 UTC (permalink / raw)
To: Jeff King; +Cc: Jeff Mitchell, Ævar Arnfjörð, git

On Mon, Mar 25, 2013 at 9:56 PM, Jeff King <peff@peff.net> wrote:
> There are basically three levels of transport that can be used on a
> local machine:
>
>   1. Hard-linking (very fast, no redundancy).
>
>   2. Byte-for-byte copy (medium speed, makes a separate copy of the
>      data, but does not check the integrity of the original).
>
>   3. Regular git transport, creating a pack (slowest, but should
>      include redundancy checks).
>
> Using --no-hardlinks turns off (1), but leaves (2) as an option. I
> think the documentation in "git clone" could use some improvement in
> that area.

Not only git-clone. How git-fetch and git-push verify the new pack
should also be documented. I don't think many people outside the
contributor circle know what is done (and maybe how) when data is
received from outside.
-- 
Duy

^ permalink raw reply	[flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone
  2013-03-25 15:31 ` Duy Nguyen
@ 2013-03-25 15:56 ` Jeff King
  2013-03-25 16:32   ` Jeff Mitchell
  2013-03-26  1:06   ` Duy Nguyen
  0 siblings, 2 replies; 60+ messages in thread
From: Jeff King @ 2013-03-25 15:56 UTC (permalink / raw)
To: Duy Nguyen; +Cc: Jeff Mitchell, Ævar Arnfjörð, git

On Mon, Mar 25, 2013 at 10:31:04PM +0700, Nguyen Thai Ngoc Duy wrote:

> On Mon, Mar 25, 2013 at 9:56 PM, Jeff King <peff@peff.net> wrote:
> > There are basically three levels of transport that can be used on a
> > local machine:
> >
> >   1. Hard-linking (very fast, no redundancy).
> >
> >   2. Byte-for-byte copy (medium speed, makes a separate copy of the
> >      data, but does not check the integrity of the original).
> >
> >   3. Regular git transport, creating a pack (slowest, but should
> >      include redundancy checks).
> >
> > Using --no-hardlinks turns off (1), but leaves (2) as an option. I
> > think the documentation in "git clone" could use some improvement in
> > that area.
>
> Not only git-clone. How git-fetch and git-push verify the new pack
> should also be documented. I don't think many people outside the
> contributor circle know what is done (and maybe how) when data is
> received from outside.

I think it's less of a documentation issue there, though, because they
_only_ do (3). There is no option to do anything else, so there is
nothing to warn the user about in terms of tradeoffs.

I agree that in general git's handling of corruption could be
documented somewhere, but I'm not sure where.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone
  2013-03-25 15:56 ` Jeff King
@ 2013-03-25 16:32 ` Jeff Mitchell
  2013-03-25 20:07   ` Jeff King
  2013-03-26  1:06 ` Duy Nguyen
  1 sibling, 1 reply; 60+ messages in thread
From: Jeff Mitchell @ 2013-03-25 16:32 UTC (permalink / raw)
To: Jeff King; +Cc: Duy Nguyen, Ævar Arnfjörð, git

On Mon, Mar 25, 2013 at 11:56 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Mar 25, 2013 at 10:31:04PM +0700, Nguyen Thai Ngoc Duy wrote:
>
>> On Mon, Mar 25, 2013 at 9:56 PM, Jeff King <peff@peff.net> wrote:
>> > There are basically three levels of transport that can be used on a
>> > local machine:
>> >
>> >   1. Hard-linking (very fast, no redundancy).
>> >
>> >   2. Byte-for-byte copy (medium speed, makes a separate copy of the
>> >      data, but does not check the integrity of the original).
>> >
>> >   3. Regular git transport, creating a pack (slowest, but should
>> >      include redundancy checks).
>> >
>> > Using --no-hardlinks turns off (1), but leaves (2) as an option. I
>> > think the documentation in "git clone" could use some improvement in
>> > that area.
>>
>> Not only git-clone. How git-fetch and git-push verify the new pack
>> should also be documented. I don't think many people outside the
>> contributor circle know what is done (and maybe how) when data is
>> received from outside.
>
> I think it's less of a documentation issue there, though, because they
> _only_ do (3). There is no option to do anything else, so there is
> nothing to warn the user about in terms of tradeoffs.
>
> I agree that in general git's handling of corruption could be
> documented somewhere, but I'm not sure where.

Hi there,

First of all, thanks for the analysis, it's much appreciated. It's good
to know that we weren't totally off-base in thinking that a naive copy
may be out of sync, as small as the chances are (certainly we wouldn't
have known the right ordering).
I think what was conflating the issue in my testing is that --mirror
implies --bare. A normal clone reads the objects while creating the
working tree, so --mirror won't show the error a normal clone will --
it's not a transport question, it's just a matter of the normal clone
doing more and so having more data run through checks.

However, there are still problems. For blob corruptions, even in this
--no-hardlinks, non --mirror case where an error was found, the exit
code from the clone was 0. I can see this tripping up all sorts of
automated scripts or repository GUIs that ignore the output and only
check the error code, which is not an unreasonable thing to do.

For commit corruptions, the --no-hardlinks, non --mirror case refused
to create the new repository and exited with an error code of 128. The
--no-hardlinks, --mirror case spewed errors to the console, yet
*still* created the new clone *and* returned an error code of zero.

It seems that when there is an "error" as opposed to a "fatal" it
doesn't affect the status code on a clone; I'd argue that it ought to.
If Git knows that the source repository has problems, it ought to be
reflected in the status code so that scripts performing clones have a
normal way to detect this and alert a user/sysadmin/whoever. Even if a
particular cloning method doesn't perform all sanity checks, if it
finds something in the sanity checks it *does* perform, this should be
trumpeted, loudly, regardless of transport mechanism and regardless of
whether a user is watching the process or a script is.

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 60+ messages in thread
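[A mirroring script of the kind described above would typically rely on clone's exit status; a sketch of that pattern, with a throwaway repository standing in for the real source and destination:]

```sh
# Sketch: treat any nonzero clone status as a failed mirror run.
set -e
dir=$(mktemp -d) && cd "$dir"
git init -q src
git -C src -c user.name=t -c user.email=t@example.com \
    commit -q --allow-empty -m one

if git clone -q --mirror --no-local src dst.git; then
  echo "mirror updated"
else
  echo "clone reported failure; keeping previous mirror" >&2
  exit 1
fi
```

[This is exactly the pattern that the exit-code bug above defeats: if clone prints errors but still exits 0, the script has no way to notice.]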
* Re: propagating repo corruption across clone
  2013-03-25 16:32 ` Jeff Mitchell
@ 2013-03-25 20:07 ` Jeff King
  2013-03-26 13:43   ` Jeff Mitchell
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff King @ 2013-03-25 20:07 UTC (permalink / raw)
To: Jeff Mitchell; +Cc: Duy Nguyen, Ævar Arnfjörð, git

On Mon, Mar 25, 2013 at 12:32:50PM -0400, Jeff Mitchell wrote:

> I think what was conflating the issue in my testing is that with
> --mirror it implies --bare, so there would be checking of the objects
> when the working tree was being created, hence --mirror won't show the
> error a normal clone will -- it's not a transport question, it's just
> a matter of the normal clone doing more and so having more data run
> through checks.

Exactly.

> However, there are still problems. For blob corruptions, even in this
> --no-hardlinks, non --mirror case where an error was found, the exit
> code from the clone was 0. I can see this tripping up all sorts of
> automated scripts or repository GUIs that ignore the output and only
> check the error code, which is not an unreasonable thing to do.

Yes, this is a bug. I'll post a series in a minute which fixes it.

> For commit corruptions, the --no-hardlinks, non --mirror case refused
> to create the new repository and exited with an error code of 128. The
> --no-hardlinks, --mirror case spewed errors to the console, yet
> *still* created the new clone *and* returned an error code of zero.

I wasn't able to reproduce this; can you post a succinct test case?

> It seems that when there is an "error" as opposed to a "fatal" it
> doesn't affect the status code on a clone; I'd argue that it ought to.

Agreed completely. The current behavior is buggy.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone
  2013-03-25 20:07 ` Jeff King
@ 2013-03-26 13:43 ` Jeff Mitchell
  2013-03-26 16:55   ` Jeff King
  0 siblings, 1 reply; 60+ messages in thread
From: Jeff Mitchell @ 2013-03-26 13:43 UTC (permalink / raw)
To: Jeff King; +Cc: Duy Nguyen, Ævar Arnfjörð, git

On Mon, Mar 25, 2013 at 4:07 PM, Jeff King <peff@peff.net> wrote:
> On Mon, Mar 25, 2013 at 12:32:50PM -0400, Jeff Mitchell wrote:
>> For commit corruptions, the --no-hardlinks, non --mirror case refused
>> to create the new repository and exited with an error code of 128. The
>> --no-hardlinks, --mirror case spewed errors to the console, yet
>> *still* created the new clone *and* returned an error code of zero.
>
> I wasn't able to reproduce this; can you post a succinct test case?

This actually seems hard to reproduce. I found this during testing with
an existing repository on-disk, but when I tried creating a new
repository with some commit objects, and modifying one of the commit
objects the same way I modified an object in the previous repository, I
was unable to reproduce it.

I do have the original repository though, so I'll tar.gz it up so that
you can have exactly the same content as I do. It's about 40MB and you
can grab it here:
https://www.dropbox.com/s/e8dhedmpd1a1axs/tomahawk-corrupt.tar.gz
(MD5 sum: cde8a43233db5d649932407891f8366b).

Once you extract that, you should be able to run a clone using paths
(not file://) with --no-hardlinks --mirror and replicate the behavior I
saw. FYI, I'm on Git 1.8.2.

Thanks,
Jeff

^ permalink raw reply	[flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone
  2013-03-26 13:43 ` Jeff Mitchell
@ 2013-03-26 16:55 ` Jeff King
  2013-03-26 21:59   ` Philip Oakley
  2013-03-26 23:20   ` Rich Fromm
  0 siblings, 2 replies; 60+ messages in thread
From: Jeff King @ 2013-03-26 16:55 UTC (permalink / raw)
To: Jeff Mitchell; +Cc: Duy Nguyen, Ævar Arnfjörð, git

On Tue, Mar 26, 2013 at 09:43:01AM -0400, Jeff Mitchell wrote:

> On Mon, Mar 25, 2013 at 4:07 PM, Jeff King <peff@peff.net> wrote:
> > On Mon, Mar 25, 2013 at 12:32:50PM -0400, Jeff Mitchell wrote:
> >> For commit corruptions, the --no-hardlinks, non --mirror case refused
> >> to create the new repository and exited with an error code of 128. The
> >> --no-hardlinks, --mirror case spewed errors to the console, yet
> >> *still* created the new clone *and* returned an error code of zero.
> >
> > I wasn't able to reproduce this; can you post a succinct test case?
>
> [...link to tar.gz...]
> Once you extract that, you should be able to run a clone using paths
> (not file://) with --no-hardlinks --mirror and replicate the behavior
> I saw. FYI, I'm on Git 1.8.2.

Thanks for providing an example.

The difference is the same "--mirror implies --bare" issue; the
non-bare case dies during the checkout (even before my patches, as the
corruption is not in a blob, but rather in the HEAD commit object
itself). You can replace --mirror with --bare and see the same
behavior.

The troubling part is that we see errors in the bare case, but do not
die. Those errors all come from upload-pack, the "sending" side of a
clone/fetch. Even though we do not transfer the objects via the git
protocol, we still invoke upload-pack to get the ref list (and then
copy the objects themselves out-of-band).

What happens is that upload-pack sees the errors while trying to see if
the object is a tag that can be peeled (the server advertises both tags
and the objects they point to).
It does not distinguish between "errors did not let me peel this
object" and "this object is not a tag, and therefore there is nothing
to peel".

We could change that, but I'm not sure whether it is a good idea. I
think the intent is that upload-pack's ref advertisement would remain
resilient to corruption in the repository (e.g., even if that commit is
corrupt, you can still fetch the rest of the data). We should not worry
about advertising broken objects, because we will encounter the same
error when we actually do try to send the objects. Dying at the
advertisement phase would be premature, since we do not yet know what
the client will request.

The problem, of course, is that the --local optimization _skips_ the
part where we actually ask upload-pack for data, and instead blindly
copies it. So this is the same issue as usual, which is that the local
transport is not thorough enough to catch corruption. It seems like a
failing in this case, because upload-pack does notice the problem, but
that is only luck; if the corruption were in a non-tip object, it would
not notice it at all. So trying to die on errors in the ref
advertisement would just be a band-aid. Fundamentally the problem is
that the --local transport is not safe from propagating corruption, and
should not be used if that's a requirement.

-Peff

^ permalink raw reply	[flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-26 16:55 ` Jeff King @ 2013-03-26 21:59 ` Philip Oakley 2013-03-26 22:03 ` Jeff King 2013-03-26 23:20 ` Rich Fromm 1 sibling, 1 reply; 60+ messages in thread From: Philip Oakley @ 2013-03-26 21:59 UTC (permalink / raw) To: Jeff King, Jeff Mitchell Cc: Duy Nguyen, Ævar Arnfjörð, Git List From: "Jeff King" <peff@peff.net> Sent: Tuesday, March 26, 2013 4:55 PM > On Tue, Mar 26, 2013 at 09:43:01AM -0400, Jeff Mitchell wrote: > >> On Mon, Mar 25, 2013 at 4:07 PM, Jeff King <peff@peff.net> wrote: >> > On Mon, Mar 25, 2013 at 12:32:50PM -0400, Jeff Mitchell wrote: >> >> For commit corruptions, the --no-hardlinks, non --mirror case >> >> refused >> >> to create the new repository and exited with an error code of 128. >> >> The >> >> --no-hardlinks, --mirror case spewed errors to the console, yet >> >> *still* created the new clone *and* returned an error code of >> >> zero. >> > >> > I wasn't able to reproduce this; can you post a succinct test case? >> >> [...link to tar.gz...] >> Once you extract that, you should be able to run a clone using paths >> (not file://) with --no-hardlinks --mirror and replicate the behavior >> I saw. FYI, I'm on Git 1.8.2. > > Thanks for providing an example. > > The difference is the same "--mirror implies --bare" issue; the > non-bare > case dies during the checkout (even before my patches, as the > corruption > is not in a blob, but rather in the HEAD commit object itself). You > can > replace --mirror with --bare and see the same behavior. > > The troubling part is that we see errors in the bare case, but do not > die. Those errors all come from upload-pack, the "sending" side of a > clone/fetch. Even though we do not transfer the objects via the git > protocol, we still invoke upload-pack to get the ref list (and then > copy > the objects themselves out-of-band). 
> > What happens is that upload-pack sees the errors while trying to see > if > the object is a tag that can be peeled (the server advertises both > tags > and the objects they point to). It does not distinguish between > "errors > did not let me peel this object" and "this object is not a tag, and > therefore there is nothing to peel". > > We could change that, but I'm not sure whether it is a good idea. I > think the intent is that upload-pack's ref advertisement would remain > resilient to corruption in the repository (e.g., even if that commit > is > corrupt, you can still fetch the rest of the data). We should not > worry > about advertising broken objects, because we will encounter the same > error when we actually do try to send the objects. Dying at the > advertisement phase would be premature, since we do not yet know what > the client will request. > > The problem, of course, is that the --local optimization _skips_ the > part where we actually ask upload-pack for data, and instead blindly > copies it. So this is the same issue as usual, which is that the local > transport is not thorough enough to catch corruption. It seems like a > failing in this case, because upload-pack does notice the problem, but > that is only luck; if the corruption were in a non-tip object, it > would > not notice it at all. So trying to die on errors in the ref > advertisement would just be a band-aid. Fundamentally the problem is > that the --local transport is not safe from propagating corruption, > and > should not be used if that's a requirement. > > -Peff > -- How does `git bundle create file.bundle --all` perform after the changes, both for the 'transport' checking and for reliability during updates? Is it an option for creating an archivable file that can be used for a later `clone`? I wasn't sure whether the bundle capability had been considered. Philip ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-26 21:59 ` Philip Oakley @ 2013-03-26 22:03 ` Jeff King 0 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-26 22:03 UTC (permalink / raw) To: Philip Oakley Cc: Jeff Mitchell, Duy Nguyen, Ævar Arnfjörð, Git List On Tue, Mar 26, 2013 at 09:59:42PM -0000, Philip Oakley wrote: > How does `git bundle create file.bundle --all` perform after the changes, > both for the 'transport' checking and for reliability during updates? Bundles are treated at a fairly low level the same as a remote that provides us a particular set of refs and a packfile. So we should get the same protections via index-pack, and still run check_everything_connected on it, just as we would with a fetch over the git protocol. I didn't test it, though. -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
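The bundle round-trip Peff describes is easy to sanity-check by hand. A rough sketch (all paths here are scratch directories, and a reasonably recent git is assumed):

```shell
dir=$(mktemp -d) && cd "$dir"

# identity for the throwaway commit
export GIT_AUTHOR_NAME=A GIT_AUTHOR_EMAIL=a@example.com
export GIT_COMMITTER_NAME=A GIT_COMMITTER_EMAIL=a@example.com

git init -q repo
git -C repo commit -q --allow-empty -m one

git -C repo bundle create all.bundle --all HEAD   # every ref, plus HEAD
git -C repo bundle verify all.bundle              # header + prerequisite check

# cloning from the bundle goes through the usual unpack/index-pack path,
# just like a fetch over the wire
git clone -q repo/all.bundle from-bundle
git -C from-bundle log --oneline -1
```

`git bundle verify` only checks the bundle header and that its prerequisites exist; the object-level checks happen when the bundle's pack is actually unpacked by the clone.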
* Re: propagating repo corruption across clone 2013-03-26 16:55 ` Jeff King 2013-03-26 21:59 ` Philip Oakley @ 2013-03-26 23:20 ` Rich Fromm 2013-03-27 1:25 ` Jonathan Nieder 2013-03-27 3:47 ` Junio C Hamano 1 sibling, 2 replies; 60+ messages in thread From: Rich Fromm @ 2013-03-26 23:20 UTC (permalink / raw) To: git Jeff King wrote > Fundamentally the problem is > that the --local transport is not safe from propagating corruption, and > should not be used if that's a requirement. I've read Jeff Mitchell's blog post, his update, relevant parts of the git-clone(1) man page, and a decent chunk of this thread, and I'm still not clear on one thing. Is the danger of `git clone --mirror` propagating corruption only true when using the --local option? Specifically, in my case, I'm using `git clone --mirror`, but I'm *not* using --local, nor am I using --no-hardlinks. The host executing the clone command is different than the host on which the remote repository lives, and I am using ssh as a transport protocol. If there is corruption, can I or can I not expect the clone operation to fail and return a non-zero exit value? If I can not expect this, is the workaround to run `git fsck` on the resulting clone? -- View this message in context: http://git.661346.n2.nabble.com/propagating-repo-corruption-across-clone-tp7580504p7580771.html Sent from the git mailing list archive at Nabble.com. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-26 23:20 ` Rich Fromm @ 2013-03-27 1:25 ` Jonathan Nieder 2013-03-27 18:23 ` Rich Fromm 2013-03-27 3:47 ` Junio C Hamano 1 sibling, 1 reply; 60+ messages in thread From: Jonathan Nieder @ 2013-03-27 1:25 UTC (permalink / raw) To: Rich Fromm; +Cc: git, Jeff King Hi, Rich Fromm wrote: > The host executing the clone > command is different than the the host on which the remote repository lives, > and I am using ssh as a transport protocol. If there is corruption, can I > or can I not expect the clone operation to fail and return a non-zero exit > value? If I can not expect this, is the workaround to run `git fsck` on the > resulting clone? Is the "[transfer] fsckObjects" configuration on the host executing the clone set to true? ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-27 1:25 ` Jonathan Nieder @ 2013-03-27 18:23 ` Rich Fromm 2013-03-27 19:49 ` Jeff King 0 siblings, 1 reply; 60+ messages in thread From: Rich Fromm @ 2013-03-27 18:23 UTC (permalink / raw) To: git Jonathan Nieder-2 wrote > Is the "[transfer] fsckObjects" configuration on the host executing the > clone set to true? I hadn't been setting it at all, and according to git-config(1) it defaults to false, so the answer is no. It looks like setting it might be a good idea. But I'm still somewhat confused about what is and is not checked under what conditions. Consider the three statements: # 1 git clone --mirror myuser@myhost:my_repo # 2 git clone --mirror --config transfer.fsckObjects=true myuser@myhost:my_repo # 3 git clone --mirror myuser@myhost:my_repo && cd my_repo.git && git-fsck Are 2 and 3 equivalent? Or is there an increasing level of checking that occurs from 1 to 2, and from 2 to 3? My guess is the latter, but perhaps this could be clearer in the man pages. git-config(1) says that transfer.fsckObjects essentially (if fetch... and receive... are not explicitly set) "git-fetch-pack will check all fetched objects" and "git-receive-pack will check all received objects." While git-fsck(1) says "git-fsck tests SHA1 and general object sanity, and it does full tracking of the resulting reachability and everything else." The latter sounds like a stronger statement to me. But if that's true, perhaps the relevant section(s) of git-config(1) should explicitly note that this is not equivalent to a full git-fsck? -- View this message in context: http://git.661346.n2.nabble.com/propagating-repo-corruption-across-clone-tp7580504p7580839.html Sent from the git mailing list archive at Nabble.com. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-27 18:23 ` Rich Fromm @ 2013-03-27 19:49 ` Jeff King 2013-03-27 20:04 ` Jeff King 0 siblings, 1 reply; 60+ messages in thread From: Jeff King @ 2013-03-27 19:49 UTC (permalink / raw) To: Rich Fromm; +Cc: git On Wed, Mar 27, 2013 at 11:23:15AM -0700, Rich Fromm wrote: > But I'm still somewhat confused about what is and is not checked under what > conditions. Consider the three statements: > > # 1 > git clone --mirror myuser@myhost:my_repo > > # 2 > git clone --mirror --config transfer.fsckObjects=true myuser@myhost:my_repo > > # 3 > git clone --mirror myuser@myhost:my_repo && cd my_repo.git && git-fsck > > Are 2 and 3 equivalent? Or is there an increasing level of checking that > occurs from 1 to 2, and from 2 to 3? My guess is the latter, but perhaps > this could be clearer in the man pages. 2 and 3 are not exactly equivalent, in that they are implemented slightly differently, but I do not know offhand of any case that would pass 2 but not 3. We do check reachability with transfer.fsckObjects. -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-27 19:49 ` Jeff King @ 2013-03-27 20:04 ` Jeff King 0 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-27 20:04 UTC (permalink / raw) To: Rich Fromm; +Cc: git On Wed, Mar 27, 2013 at 03:49:38PM -0400, Jeff King wrote: > On Wed, Mar 27, 2013 at 11:23:15AM -0700, Rich Fromm wrote: > > > But I'm still somewhat confused about what is and is not checked under what > > conditions. Consider the three statements: > > > > # 1 > > git clone --mirror myuser@myhost:my_repo > > > > # 2 > > git clone --mirror --config transfer.fsckObjects=true myuser@myhost:my_repo > > > > # 3 > > git clone --mirror myuser@myhost:my_repo && cd my_repo.git && git-fsck > > > > Are 2 and 3 equivalent? Or is there an increasing level of checking that > > occurs from 1 to 2, and from 2 to 3? My guess is the latter, but perhaps > > this could be clearer in the man pages. > > 2 and 3 are not exactly equivalent, in that they are implemented > slightly differently, but I do not know offhand of any case that would > pass 2 but not 3. We do check reachability with transfer.fsckObjects. Oh, and in the case of #1, I think we would already find corruption, in that index-pack will expand and check the sha1 of each object we receive. The transfer.fsckObjects check adds some semantic checks as well (e.g., making sure author identities are well-formed). Clone will not currently detect missing objects and reachability without transfer.fsckObjects set, but that is IMHO a bug; fetch will notice it, and clone should behave the same way. -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
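The distinction Peff draws between integrity checks and semantic checks can be demonstrated directly: an object can be bit-for-bit intact (valid zlib, sha1 matching its contents) yet still malformed inside. The rough sketch below uses `hash-object --literally` (available in moderately recent git) to craft a commit whose author line has no email; all paths and names are scratch values:

```shell
dir=$(mktemp -d) && cd "$dir"
git init -q src

# hand-craft a blob, a tree, and a commit whose author line lacks an
# <email>; --literally skips hash-object's usual sanity check so the
# bogus commit actually gets written
blob=$(echo hi | git -C src hash-object -w --stdin)
tree=$(printf '100644 blob %s\tf\n' "$blob" | git -C src mktree)
bad=$(printf 'tree %s\nauthor Bad Author 1234567890 +0000\ncommitter Bad Author 1234567890 +0000\n\nno email above\n' "$tree" |
      git -C src hash-object -t commit -w --stdin --literally)
git -C src update-ref "$(git -C src symbolic-ref HEAD)" "$bad"

# level 1 checking: sha1 integrity only -- the bogus commit sails through
git clone -q --bare --no-local src plain && echo "plain: WORKED"

# level 2: transfer.fsckObjects adds the semantic checks on the receiver
git -c transfer.fsckObjects=true clone -q --bare --no-local src strict \
        2>/dev/null || echo "strict: FAILED (bad author caught)"
```

The plain transport only re-hashes what it receives, and the bogus commit hashes to exactly the name it is stored under, so it passes; the fsck pass parses the commit headers and rejects the missing email.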
* Re: propagating repo corruption across clone 2013-03-26 23:20 ` Rich Fromm 2013-03-27 1:25 ` Jonathan Nieder @ 2013-03-27 3:47 ` Junio C Hamano 2013-03-27 6:19 ` Sitaram Chamarty ` (2 more replies) 1 sibling, 3 replies; 60+ messages in thread From: Junio C Hamano @ 2013-03-27 3:47 UTC (permalink / raw) To: Rich Fromm; +Cc: git Rich Fromm <richard_fromm@yahoo.com> writes: > Jeff King wrote >> Fundamentally the problem is >> that the --local transport is not safe from propagating corruption, and >> should not be used if that's a requirement. > > I've read Jeff Mitchell's blog post, his update, relevant parts of the > git-clone(1) man page, and a decent chunk of this thread, and I'm still not > clear on one thing. Is the danger of `git clone --mirror` propagating > corruption only true when using the --local option? If you use --local, that is equivalent to "cp -R". Your corruption in the source will faithfully be byte-for-byte copied to the destination. If you do not (and in your case you have two different machines), unless you are using the long-deprecated rsync transport (which again is the same as "cp -R"), the transport layer will notice object corruption. See Jeff's analysis earlier in the thread. If you are lucky (or unlucky, depending on how you look at it), the corruption you have in your object store may affect objects that are needed to check out the version at the tip of the history, and "git checkout" that happens as the last step of cloning may loudly complain, but that just means you can immediately notice the breakage in that case. You may be unlucky and the corruption may not affect objects that are needed to check out the tip. The initial checkout will succeed as if nothing is wrong, but the corruption in your object store is still there nevertheless. "git log -p --all" or "git fsck" will certainly be unhappy. The difference between --mirror and no --mirror is a red herring. 
You may want to ask Jeff Mitchell to remove the mention of it; it only adds to the confusion without helping users. If you made a byte-for-byte copy of a corrupt repository, it wouldn't make any difference whether the first "checkout" notices it. To be paranoid, you may want to set transfer.fsckObjects to true, perhaps in your ~/.gitconfig. ^ permalink raw reply [flat|nested] 60+ messages in thread
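For reference, the paranoid setting Junio suggests is a one-liner. The sketch below uses a throwaway HOME so the example cannot clobber a real ~/.gitconfig:

```shell
# use a scratch HOME so this demo doesn't touch your real ~/.gitconfig
export HOME=$(mktemp -d)

git config --global transfer.fsckObjects true
git config --global --get transfer.fsckObjects
```

transfer.fsckObjects is the umbrella knob; fetch.fsckObjects and receive.fsckObjects override it per direction if you only want the check on one side.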
* Re: propagating repo corruption across clone 2013-03-27 3:47 ` Junio C Hamano @ 2013-03-27 6:19 ` Sitaram Chamarty 2013-03-27 15:03 ` Junio C Hamano 2013-03-27 18:51 ` Rich Fromm 2013-03-28 13:48 ` Jeff Mitchell 2 siblings, 1 reply; 60+ messages in thread From: Sitaram Chamarty @ 2013-03-27 6:19 UTC (permalink / raw) To: Junio C Hamano; +Cc: Rich Fromm, git On Wed, Mar 27, 2013 at 9:17 AM, Junio C Hamano <gitster@pobox.com> wrote: > To be paranoid, you may want to set transfer.fsckObjects to true, > perhaps in your ~/.gitconfig. do we have any numbers on the overhead of this? Even a "guesstimate" will do... ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-27 6:19 ` Sitaram Chamarty @ 2013-03-27 15:03 ` Junio C Hamano 2013-03-27 15:47 ` Sitaram Chamarty 0 siblings, 1 reply; 60+ messages in thread From: Junio C Hamano @ 2013-03-27 15:03 UTC (permalink / raw) To: Sitaram Chamarty; +Cc: Rich Fromm, git Sitaram Chamarty <sitaramc@gmail.com> writes: > On Wed, Mar 27, 2013 at 9:17 AM, Junio C Hamano <gitster@pobox.com> wrote: > >> To be paranoid, you may want to set transfer.fsckObjects to true, >> perhaps in your ~/.gitconfig. > > do we have any numbers on the overhead of this? > > Even a "guesstimate" will do... On a reasonably slow machine: $ cd /project/git/git.git && git repack -a -d $ ls -hl .git/objects/pack/*.pack -r--r--r-- 1 junio src 44M Mar 26 13:18 .git/objects/pack/pack-c40635e5ee2b7094eb0e2c416e921a2b129bd8d2.pack $ cd .. && git --bare init junk && cd junk $ time git index-pack --strict --stdin <../git.git/.git/objects/pack/*.pack real 0m13.873s user 0m21.345s sys 0m2.248s That's about 3.2 MB/s? Compare that with the speed your other side feeds you (or your line speed could be the limiting factor) and decide how much you value your data. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-27 15:03 ` Junio C Hamano @ 2013-03-27 15:47 ` Sitaram Chamarty 0 siblings, 0 replies; 60+ messages in thread From: Sitaram Chamarty @ 2013-03-27 15:47 UTC (permalink / raw) To: Junio C Hamano; +Cc: Rich Fromm, git On Wed, Mar 27, 2013 at 8:33 PM, Junio C Hamano <gitster@pobox.com> wrote: > Sitaram Chamarty <sitaramc@gmail.com> writes: > >> On Wed, Mar 27, 2013 at 9:17 AM, Junio C Hamano <gitster@pobox.com> wrote: >> >>> To be paranoid, you may want to set transfer.fsckObjects to true, >>> perhaps in your ~/.gitconfig. >> >> do we have any numbers on the overhead of this? >> >> Even a "guesstimate" will do... > > On a reasonably slow machine: > > $ cd /project/git/git.git && git repack -a -d > $ ls -hl .git/objects/pack/*.pack > -r--r--r-- 1 junio src 44M Mar 26 13:18 .git/objects/pack/pack-c40635e5ee2b7094eb0e2c416e921a2b129bd8d2.pack > > $ cd .. && git --bare init junk && cd junk > $ time git index-pack --strict --stdin <../git.git/.git/objects/pack/*.pack > real 0m13.873s > user 0m21.345s > sys 0m2.248s > > That's about 3.2 Mbps? > > Compare that with the speed your other side feeds you (or your line > speed could be the limiting factor) and decide how much you value > your data. Thanks. I was also interested in overhead on the server just as a %-age. I have no idea why but when I did some tests a long time ago I got upwards of 40% or so, but now when I try these tests for git.git cd <some empty dir> git init --bare # git config transfer.fsckobjects true git fetch file:///full/path/to/git.git refs/*:refs/* then, the difference in elapsed time 18s -> 22s, so about 22%, and CPU time is 31 -> 37, so about 20%. I didn't measure disk access increases, but I guess 20% is not too bad. Is it likely to be linear in the size of the repo, by and large? ^ permalink raw reply [flat|nested] 60+ messages in thread
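Junio's measurement is easy to reproduce. A rough sketch follows; the seed repository below is tiny, so its timings are only meaningful once you point the pack at something big such as a clone of git.git (all names here are scratch values):

```shell
dir=$(mktemp -d) && cd "$dir"
export GIT_AUTHOR_NAME=A GIT_AUTHOR_EMAIL=a@example.com
export GIT_COMMITTER_NAME=A GIT_COMMITTER_EMAIL=a@example.com

# seed repo -- swap this for a real repository to get meaningful numbers
git init -q src
seq 1 1000 >src/data
git -C src add data && git -C src commit -q -m seed
git -C src repack -adq
pack=$(echo src/.git/objects/pack/*.pack)

# feed the identical pack through index-pack with and without --strict;
# the delta between the two times is the fsck overhead
git init -q --bare plain
git init -q --bare strict
( cd plain  && time git index-pack --stdin          <"../$pack" )
( cd strict && time git index-pack --strict --stdin <"../$pack" )
```

Because the input pack is identical in both runs, any difference in wall-clock time is attributable to the --strict (fsck) pass rather than I/O or delta resolution.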
* Re: propagating repo corruption across clone 2013-03-27 3:47 ` Junio C Hamano 2013-03-27 6:19 ` Sitaram Chamarty @ 2013-03-27 18:51 ` Rich Fromm 2013-03-27 19:13 ` Junio C Hamano 2013-03-28 13:52 ` Jeff Mitchell 2013-03-28 13:48 ` Jeff Mitchell 2 siblings, 2 replies; 60+ messages in thread From: Rich Fromm @ 2013-03-27 18:51 UTC (permalink / raw) To: git Junio C Hamano wrote > If you use --local, that is equivalent to "cp -R". Your corruption > in the source will faithfully be byte-for-byte copied to the > destination. If you do not > ... > transport layer will notice > object corruption. > ... > The difference between --mirror and no --mirror is a red herring. > You may want to ask Jeff Mitchell to remove the mention of it; it > only adds to the confusion without helping users. Just to clarify, I don't know Jeff Mitchell personally, and I'm not affiliated with the KDE project. I happened to have recently implemented a backup strategy for a different codebase that relies on `git clone --mirror` to take the actual snapshots of the live repos, and I read about Jeff's experiences, and that's why I started following this discussion. Apologies if my questions are considered slightly off topic -- I'm not positive if this is supposed to be a list for developers, and not users. Nevertheless, I will try to contact Jeff and point him at this. My initial reading of his blog posts definitely gave me the impression that this was a --mirror vs. not issue, but it really sounds like his main problem was using --local. However, I think there may be room for some additional clarity in the docs. The --local option in git-clone(1) says "When the repository to clone from is on a local machine, this flag bypasses the normal "git aware" transport mechanism". But there's no mention of the consequences of this transport bypass. 
There's also no mention of this in the "GIT URLS" section that discusses transport protocols, and I also don't see anything noting it in either of these sections of the git book: http://git-scm.com/book/en/Git-on-the-Server-The-Protocols http://git-scm.com/book/en/Git-Internals-Transfer-Protocols -- View this message in context: http://git.661346.n2.nabble.com/propagating-repo-corruption-across-clone-tp7580504p7580845.html Sent from the git mailing list archive at Nabble.com. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-27 18:51 ` Rich Fromm @ 2013-03-27 19:13 ` Junio C Hamano 2013-03-28 13:52 ` Jeff Mitchell 1 sibling, 0 replies; 60+ messages in thread From: Junio C Hamano @ 2013-03-27 19:13 UTC (permalink / raw) To: Rich Fromm; +Cc: git Rich Fromm <richard_fromm@yahoo.com> writes: > Apologies if my questions are considered slightly off topic -- I'm not > positive if this is supposed to be a list for developers, and not users. The list is both for users and developers. > However, I think there may be room for some additional clarity in the docs. > The --local option in git-config(1) says "When the repository to clone from > is on a local machine, this flag bypasses the normal "git aware" transport > mechanism". But there's no mention of the consequences of this transport > bypass. Yeah, I would not mind a patch to update the documentation for "clone --local" and rsync transport to say something about byte-for-byte copying of broken repository. Thanks. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-27 18:51 ` Rich Fromm 2013-03-27 19:13 ` Junio C Hamano @ 2013-03-28 13:52 ` Jeff Mitchell 1 sibling, 0 replies; 60+ messages in thread From: Jeff Mitchell @ 2013-03-28 13:52 UTC (permalink / raw) To: Rich Fromm; +Cc: git On Wed, Mar 27, 2013 at 2:51 PM, Rich Fromm <richard_fromm@yahoo.com> wrote: > Nevertheless, I will try to contact Jeff and point him at this. My initial > reading of his blog posts definitely gave me the impression that this was a > --mirror vs. not issue, but it really sounds like his main problem was using > --local. Actually, I wasn't using --local, I just wasn't using --no-local, as I mixed up that and --no-hardlinks (which lies somewhere between, but is still not the same as using file://). It's entirely possible that you read the posts before I updated them. Also, keep in mind, if you're evaluating what you're doing, that the system we have was not set up to be a backup system. It was set up to be a mirror system. With proper sanity checking, it would have acted in a decent capacity as a backup system, but that wasn't why it was set up, nor how it was set up. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-27 3:47 ` Junio C Hamano 2013-03-27 6:19 ` Sitaram Chamarty 2013-03-27 18:51 ` Rich Fromm @ 2013-03-28 13:48 ` Jeff Mitchell 2 siblings, 0 replies; 60+ messages in thread From: Jeff Mitchell @ 2013-03-28 13:48 UTC (permalink / raw) To: Junio C Hamano; +Cc: Rich Fromm, git On Tue, Mar 26, 2013 at 11:47 PM, Junio C Hamano <gitster@pobox.com> wrote: > The difference between --mirror and no --mirror is a red herring. > You may want to ask Jeff Mitchell to remove the mention of it; it > only adds to the confusion without helping users. If you made > byte-for-byte copy of corrupt repository, it wouldn't make any > difference if the first "checkout" notices it. Hi, Several days ago I had actually already updated the post to indicate that my testing methodology was incorrect as a result of mixing up --no-hardlinks and --no-local, and pointed to this thread. I will say that we did see corrupted repos on the downstream mirrors. I don't have an explanation for it, but have not been able to reproduce it either. My only potential guess (untested) is that perhaps when the corruption was detected the clone aborted but left the objects already transferred locally. Again, untested -- I mention it only because it's my only potential explanation :-) > To be paranoid, you may want to set transfer.fsckObjects to true, > perhaps in your ~/.gitconfig. Interesting; I'd known about receive.fsckObjects but not transfer/fetch. Thanks for the pointer. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-25 15:56 ` Jeff King 2013-03-25 16:32 ` Jeff Mitchell @ 2013-03-26 1:06 ` Duy Nguyen 1 sibling, 0 replies; 60+ messages in thread From: Duy Nguyen @ 2013-03-26 1:06 UTC (permalink / raw) To: Jeff King; +Cc: Jeff Mitchell, Ævar Arnfjörð, git On Mon, Mar 25, 2013 at 10:56 PM, Jeff King <peff@peff.net> wrote: > On Mon, Mar 25, 2013 at 10:31:04PM +0700, Nguyen Thai Ngoc Duy wrote: > >> On Mon, Mar 25, 2013 at 9:56 PM, Jeff King <peff@peff.net> wrote: >> > There are basically three levels of transport that can be used on a >> > local machine: >> > >> > 1. Hard-linking (very fast, no redundancy). >> > >> > 2. Byte-for-byte copy (medium speed, makes a separate copy of the >> > data, but does not check the integrity of the original). >> > >> > 3. Regular git transport, creating a pack (slowest, but should include >> > redundancy checks). >> > >> > Using --no-hardlinks turns off (1), but leaves (2) as an option. I >> > think the documentation in "git clone" could use some improvement in >> > that area. >> >> Not only git-clone. How git-fetch and git-push verify the new pack >> should also be documented. I don't think many people outside the >> contributor circle know what is done (and maybe how) when data is >> received from outside. > > I think it's less of a documentation issue there, though, because they > _only_ do (3). There is no option to do anything else, so there is > nothing to warn the user about in terms of tradeoffs. > > I agree that in general git's handling of corruption could be documented > somewhere, but I'm not sure where. I think either a section in git-fsck.txt or git.txt. Probably the former as people who read it are probably more concerned about corruption. -- Duy ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-24 18:31 propagating repo corruption across clone Jeff King 2013-03-24 19:01 ` Ævar Arnfjörð Bjarmason @ 2013-03-24 19:16 ` Ilari Liusvaara 2013-03-25 20:01 ` Junio C Hamano 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King 3 siblings, 0 replies; 60+ messages in thread From: Ilari Liusvaara @ 2013-03-24 19:16 UTC (permalink / raw) To: Jeff King; +Cc: git On Sun, Mar 24, 2013 at 02:31:33PM -0400, Jeff King wrote: > > Fscking the incoming objects does work, but of course it comes at a cost > in the normal case (for linux-2.6, I measured an increase in CPU time > with "index-pack --strict" from ~2.5 minutes to ~4 minutes). And I think > it is probably overkill for finding corruption; index-pack already > recognizes bit corruption inside an object, and > check_everything_connected can detect object graph problems much more > cheaply. AFAIK, standard checks index-pack has to do + checking that the object graph has no broken links (and every ref points to something valid) will catch everything except: - SHA-1 collisions between corrupt objects and clean ones. - Corrupted refs (that still point to something valid). - "Born-corrupted" objects. > One thing I didn't check is bit corruption inside a packed object that > still correctly zlib inflates. check_everything_connected will end up > reading all of the commits and trees (to walk them), but not the blobs. Checking that everything is connected will (modulo SHA-1 collisions) save you there, at least if packv3 is used as transport stream. -Ilari ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-24 18:31 propagating repo corruption across clone Jeff King 2013-03-24 19:01 ` Ævar Arnfjörð Bjarmason 2013-03-24 19:16 ` Ilari Liusvaara @ 2013-03-25 20:01 ` Junio C Hamano 2013-03-25 20:05 ` Jeff King 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King 3 siblings, 1 reply; 60+ messages in thread From: Junio C Hamano @ 2013-03-25 20:01 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King <peff@peff.net> writes: > We _do_ see a problem during the checkout phase, but we don't propagate > a checkout failure to the exit code from clone. That is bad in general, > and should probably be fixed. Though it would never find corruption of > older objects in the history, anyway, so checkout should not be relied > on for robustness. It is obvious that we should exit with non-zero status when we see a failure from the checkout, but do we want to nuke the resulting repository as in the case of normal transport failure? A checkout failure might be due to being under quota for object store but running out of quota upon populating the working tree, in which case we probably do not want to. > We do not notice the sha1 mis-match on the sending side (which we could, > if we checked the sha1 as we were sending). We do not notice the broken > object graph during the receive process either. I would have expected > check_everything_connected to handle this, but we don't actually call it > during clone! If you do this: > > $ git init non-local && cd non-local && git fetch .. > remote: Counting objects: 3, done. > remote: Total 3 (delta 0), reused 3 (delta 0) > Unpacking objects: 100% (3/3), done. > fatal: missing blob object 'd95f3ad14dee633a758d2e331151e950dd13e4ed' > error: .. did not send all necessary objects > > we do notice. Yes, it is OK to add connectedness check to "git clone". > And one final check: > > $ git -c transfer.fsckobjects=1 clone --no-local . fsck > Cloning into 'fsck'... 
> remote: Counting objects: 3, done. > remote: Total 3 (delta 0), reused 3 (delta 0) > Receiving objects: 100% (3/3), done. > error: unable to find d95f3ad14dee633a758d2e331151e950dd13e4ed > fatal: object of unexpected type > fatal: index-pack failed > > Fscking the incoming objects does work, but of course it comes at a cost > in the normal case (for linux-2.6, I measured an increase in CPU time > with "index-pack --strict" from ~2.5 minutes to ~4 minutes). And I think > it is probably overkill for finding corruption; index-pack already > recognizes bit corruption inside an object, and > check_everything_connected can detect object graph problems much more > cheaply. > One thing I didn't check is bit corruption inside a packed object that > still correctly zlib inflates. check_everything_connected will end up > reading all of the commits and trees (to walk them), but not the blobs. Correct. > So I think at the very least we should: > > 1. Make sure clone propagates errors from checkout to the final exit > code. > > 2. Teach clone to run check_everything_connected. I agree with both but with a slight reservation on the former one (see above). Thanks. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: propagating repo corruption across clone 2013-03-25 20:01 ` Junio C Hamano @ 2013-03-25 20:05 ` Jeff King 0 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Mon, Mar 25, 2013 at 01:01:59PM -0700, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > We _do_ see a problem during the checkout phase, but we don't propagate > > a checkout failure to the exit code from clone. That is bad in general, > > and should probably be fixed. Though it would never find corruption of > > older objects in the history, anyway, so checkout should not be relied > > on for robustness. > > It is obvious that we should exit with non-zero status when we see a > failure from the checkout, but do we want to nuke the resulting > repository as in the case of normal transport failure? A checkout > failure might be due to being under quota for object store but > running out of quota upon populating the working tree, in which case > we probably do not want to. I'm just running through my final tests on a large-ish patch series which deals with this (among other issues). I had the same thought, though we do already die on a variety of checkout errors. I left it as a die() for now, but I think we should potentially address it with a further patch. > > $ git init non-local && cd non-local && git fetch .. > > remote: Counting objects: 3, done. > > remote: Total 3 (delta 0), reused 3 (delta 0) > > Unpacking objects: 100% (3/3), done. > > fatal: missing blob object 'd95f3ad14dee633a758d2e331151e950dd13e4ed' > > error: .. did not send all necessary objects > > > > we do notice. > > Yes, it is OK to add connectedness check to "git clone". That's in my series, too. Unfortunately, in the local clone case, it slows down the clone considerably (since we otherwise would not have to traverse the objects at all). -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
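The "did not send all necessary objects" safety net discussed here can be seen in action using the inode-swap corruption from the start of the thread: give one object's file another object's perfectly valid contents, so neither zlib inflation nor the sender notices, and only the receiver's connectivity check can catch the hole. A rough sketch (scratch paths throughout):

```shell
dir=$(mktemp -d) && cd "$dir"
export GIT_AUTHOR_NAME=A GIT_AUTHOR_EMAIL=a@example.com
export GIT_COMMITTER_NAME=A GIT_COMMITTER_EMAIL=a@example.com

git init -q src
echo one >src/a && echo two >src/b
git -C src add a b && git -C src commit -q -m one

# point blob a's object file at blob b's contents: still valid zlib,
# but the contents no longer hash to the file's name
obj() { echo "src/.git/objects/$(echo "$1" | sed 's,^..,&/,')"; }
blob_a=$(git -C src rev-parse HEAD:a)
chmod +w "$(obj "$blob_a")"
cp "$(obj "$(git -C src rev-parse HEAD:b)")" "$(obj "$blob_a")"

# the sender streams the mislabeled object without re-hashing it; the
# receiver stores it under its real name, leaving a's blob missing, and
# the connectivity check after the transfer spots the hole
git init -q dst
git -C dst fetch ../src 2>err && echo "fetch WORKED (bad)" ||
        grep -o "did not send all necessary objects" err
```

This is the same failure the quoted `git fetch ..` transcript shows, reproduced without needing a filesystem to misbehave first.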
* [PATCH 0/9] corrupt object potpourri 2013-03-24 18:31 propagating repo corruption across clone Jeff King ` (2 preceding siblings ...) 2013-03-25 20:01 ` Junio C Hamano @ 2013-03-25 20:14 ` Jeff King 2013-03-25 20:16 ` [PATCH 1/9] stream_blob_to_fd: detect errors reading from stream Jeff King ` (8 more replies) 3 siblings, 9 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:14 UTC (permalink / raw) To: git; +Cc: Junio C Hamano I started these patches with the intent of improving clone's behavior on corrupt objects, but my testing uncovered some other nastiness, including two infinite loops in the streaming code! Yikes. I think 1-7 are good. We might want to tweak the die() behavior of patch 8, but I think it should come on top. Patch 9 has some pretty ugly performance implications. At the end of the series, all of the introduced tests pass except for one, which shows that "git clone" may silently write out a bogus working tree entry. I haven't tracked that one down yet. [1/9]: stream_blob_to_fd: detect errors reading from stream [2/9]: check_sha1_signature: check return value from read_istream [3/9]: read_istream_filtered: propagate read error from upstream [4/9]: avoid infinite loop in read_istream_loose [5/9]: add test for streaming corrupt blobs [6/9]: streaming_write_entry: propagate streaming errors [7/9]: add tests for cloning corrupted repositories [8/9]: clone: die on errors from unpack_trees [9/9]: clone: run check_everything_connected -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH 1/9] stream_blob_to_fd: detect errors reading from stream 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King @ 2013-03-25 20:16 ` Jeff King 2013-03-26 21:27 ` Junio C Hamano 2013-03-25 20:17 ` [PATCH 2/9] check_sha1_signature: check return value from read_istream Jeff King ` (7 subsequent siblings) 8 siblings, 1 reply; 60+ messages in thread From: Jeff King @ 2013-03-25 20:16 UTC (permalink / raw) To: git; +Cc: Junio C Hamano We call read_istream, but never check its return value for errors. This can lead to us looping infinitely, as we just keep trying to write "-1" bytes (and we do not notice the error, as we simply check that write_in_full reports the same number of bytes we fed it, which of course is also -1). Signed-off-by: Jeff King <peff@peff.net> --- No test yet, as my method for triggering this causes _another_ infinite loop. So the test comes after the fixes, to avoid infinite loops when bisecting the history later. :) streaming.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/streaming.c b/streaming.c index 4d978e5..f4126a7 100644 --- a/streaming.c +++ b/streaming.c @@ -514,6 +514,8 @@ int stream_blob_to_fd(int fd, unsigned const char *sha1, struct stream_filter *f ssize_t wrote, holeto; ssize_t readlen = read_istream(st, buf, sizeof(buf)); + if (readlen < 0) + goto close_and_exit; if (!readlen) break; if (can_seek && sizeof(buf) == readlen) { -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH 1/9] stream_blob_to_fd: detect errors reading from stream 2013-03-25 20:16 ` [PATCH 1/9] stream_blob_to_fd: detect errors reading from stream Jeff King @ 2013-03-26 21:27 ` Junio C Hamano 0 siblings, 0 replies; 60+ messages in thread From: Junio C Hamano @ 2013-03-26 21:27 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King <peff@peff.net> writes: > We call read_istream, but never check its return value for > errors. This can lead to us looping infinitely, as we just > keep trying to write "-1" bytes (and we do not notice the > error, as we simply check that write_in_full reports the > same number of bytes we fed it, which of course is also -1). Looks sane. Thanks. > > Signed-off-by: Jeff King <peff@peff.net> > --- > No test yet, as my method for triggering this causes _another_ infinite > loop. So the test comes after the fixes, to avoid infinite loops when > bisecting the history later. :) > > streaming.c | 2 ++ > 1 file changed, 2 insertions(+) > > diff --git a/streaming.c b/streaming.c > index 4d978e5..f4126a7 100644 > --- a/streaming.c > +++ b/streaming.c > @@ -514,6 +514,8 @@ int stream_blob_to_fd(int fd, unsigned const char *sha1, struct stream_filter *f > ssize_t wrote, holeto; > ssize_t readlen = read_istream(st, buf, sizeof(buf)); > > + if (readlen < 0) > + goto close_and_exit; > if (!readlen) > break; > if (can_seek && sizeof(buf) == readlen) { ^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH 2/9] check_sha1_signature: check return value from read_istream 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King 2013-03-25 20:16 ` [PATCH 1/9] stream_blob_to_fd: detect errors reading from stream Jeff King @ 2013-03-25 20:17 ` Jeff King 2013-03-25 20:18 ` [PATCH 3/9] read_istream_filtered: propagate read error from upstream Jeff King ` (6 subsequent siblings) 8 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:17 UTC (permalink / raw) To: git; +Cc: Junio C Hamano It's possible for read_istream to return an error, in which case we just end up in an infinite loop (aside from EOF, we do not even look at the result, but just feed it straight into our running hash). Signed-off-by: Jeff King <peff@peff.net> --- I didn't actually trigger this code path in any of my tests, but I audited all of the callers of read_istream after the last patch, and noticed this one (the rest looked fine to me). sha1_file.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/sha1_file.c b/sha1_file.c index 16967d3..0b99f33 100644 --- a/sha1_file.c +++ b/sha1_file.c @@ -1266,6 +1266,10 @@ int check_sha1_signature(const unsigned char *sha1, void *map, char buf[1024 * 16]; ssize_t readlen = read_istream(st, buf, sizeof(buf)); + if (readlen < 0) { + close_istream(st); + return -1; + } if (!readlen) break; git_SHA1_Update(&c, buf, readlen); -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH 3/9] read_istream_filtered: propagate read error from upstream 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King 2013-03-25 20:16 ` [PATCH 1/9] stream_blob_to_fd: detect errors reading from stream Jeff King 2013-03-25 20:17 ` [PATCH 2/9] check_sha1_signature: check return value from read_istream Jeff King @ 2013-03-25 20:18 ` Jeff King 2013-03-25 20:21 ` [PATCH 4/9] avoid infinite loop in read_istream_loose Jeff King ` (5 subsequent siblings) 8 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:18 UTC (permalink / raw) To: git; +Cc: Junio C Hamano The filter istream pulls data from an "upstream" stream, running it through a filter function. However, we did not properly notice when the upstream filter yielded an error, and just returned what we had read. Instead, we should propagate the error. Signed-off-by: Jeff King <peff@peff.net> --- I don't know if we should be preserving fs->i_end from getting a negative value. I would think the internal state of the istream after an error is undefined. streaming.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/streaming.c b/streaming.c index f4126a7..f4ab12b 100644 --- a/streaming.c +++ b/streaming.c @@ -237,7 +237,7 @@ static read_method_decl(filtered) if (!fs->input_finished) { fs->i_end = read_istream(fs->upstream, fs->ibuf, FILTER_BUFFER); if (fs->i_end < 0) - break; + return -1; if (fs->i_end) continue; } -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH 4/9] avoid infinite loop in read_istream_loose 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King ` (2 preceding siblings ...) 2013-03-25 20:18 ` [PATCH 3/9] read_istream_filtered: propagate read error from upstream Jeff King @ 2013-03-25 20:21 ` Jeff King 2013-03-25 20:21 ` [PATCH 5/9] add test for streaming corrupt blobs Jeff King ` (4 subsequent siblings) 8 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:21 UTC (permalink / raw) To: git; +Cc: Junio C Hamano The read_istream_loose function loops on inflating a chunk of data from an mmap'd loose object. We end the loop when we run out of space in our output buffer, or if we see a zlib error. We need to treat Z_BUF_ERROR specially, though, as it is not fatal; it is just zlib's way of telling us that we need to either feed it more input or give it more output space. It is perfectly normal for us to hit this when we are at the end of our buffer. However, we may also get Z_BUF_ERROR because we have run out of input. In a well-formed object, this should not happen, because we have fed the whole mmap'd contents to zlib. But if the object is truncated or corrupt, we will loop forever, never giving zlib any more data, but continuing to ask it to inflate. We can fix this by considering it an error when zlib returns Z_BUF_ERROR but we still have output space left (which means it must want more input, which we know is a truncation error). It would not be sufficient to just check whether zlib had consumed all the input at the start of the loop, as it might still want to generate output from what is in its internal state. Signed-off-by: Jeff King <peff@peff.net> --- The read_istream_pack_non_delta function does not suffer from the same issue, because it continually feeds more data via use_pack(). Although it may run into problems if it reads to the very end of a pack. I also didn't audit the other zlib code paths for similar problems; we may want to do that. 
streaming.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/streaming.c b/streaming.c index f4ab12b..cabcd9d 100644 --- a/streaming.c +++ b/streaming.c @@ -309,7 +309,7 @@ static read_method_decl(loose) st->z_state = z_done; break; } - if (status != Z_OK && status != Z_BUF_ERROR) { + if (status != Z_OK && (status != Z_BUF_ERROR || total_read < sz)) { git_inflate_end(&st->z); st->z_state = z_error; return -1; -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* [PATCH 5/9] add test for streaming corrupt blobs 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King ` (3 preceding siblings ...) 2013-03-25 20:21 ` [PATCH 4/9] avoid infinite loop in read_istream_loose Jeff King @ 2013-03-25 20:21 ` Jeff King 2013-03-25 21:10 ` Jonathan Nieder 2013-03-27 20:27 ` Jeff King 2013-03-25 20:22 ` [PATCH 6/9] streaming_write_entry: propagate streaming errors Jeff King ` (3 subsequent siblings) 8 siblings, 2 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:21 UTC (permalink / raw) To: git; +Cc: Junio C Hamano We do not have many tests for handling corrupt objects. This new test at least checks that we detect a byte error in a corrupt blob object while streaming it out with cat-file. Signed-off-by: Jeff King <peff@peff.net> --- t/t1060-object-corruption.sh | 34 ++++++++++++++++++++++++++++++++++ 1 file changed, 34 insertions(+) create mode 100755 t/t1060-object-corruption.sh diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh new file mode 100755 index 0000000..d36994a --- /dev/null +++ b/t/t1060-object-corruption.sh @@ -0,0 +1,34 @@ +#!/bin/sh + +test_description='see how we handle various forms of corruption' +. ./test-lib.sh + +# convert "1234abcd" to ".git/objects/12/34abcd" +obj_to_file() { + echo "$(git rev-parse --git-dir)/objects/$(git rev-parse "$1" | sed 's,..,&/,')" +} + +# Convert byte at offset "$2" of object "$1" into '\0' +corrupt_byte() { + obj_file=$(obj_to_file "$1") && + chmod +w "$obj_file" && + printf '\0' | dd of="$obj_file" bs=1 seek="$2" +} + +test_expect_success 'setup corrupt repo' ' + git init bit-error && + ( + cd bit-error && + test_commit content && + corrupt_byte HEAD:content.t 10 + ) +' + +test_expect_success 'streaming a corrupt blob fails' ' + ( + cd bit-error && + test_must_fail git cat-file blob HEAD:content.t + ) +' + +test_done -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH 5/9] add test for streaming corrupt blobs 2013-03-25 20:21 ` [PATCH 5/9] add test for streaming corrupt blobs Jeff King @ 2013-03-25 21:10 ` Jonathan Nieder 2013-03-25 21:26 ` Jeff King 2013-03-27 20:27 ` Jeff King 1 sibling, 1 reply; 60+ messages in thread From: Jonathan Nieder @ 2013-03-25 21:10 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano Jeff King wrote: > We do not have many tests for handling corrupt objects. This > new test at least checks that we detect a byte error in a > corrupt blob object while streaming it out with cat-file. Thanks. [...] > +# convert "1234abcd" to ".git/objects/12/34abcd" > +obj_to_file() { > + echo "$(git rev-parse --git-dir)/objects/$(git rev-parse "$1" | sed 's,..,&/,')" > +} Maybe this would be clearer in multiple lines? commit=$(git rev-parse --verify "$1") && git_dir=$(git rev-parse --git-dir) && tail=${commit#??} && echo "$git_dir/objects/${commit%$tail}/$tail" > + > +# Convert byte at offset "$2" of object "$1" into '\0' > +corrupt_byte() { > + obj_file=$(obj_to_file "$1") && > + chmod +w "$obj_file" && > + printf '\0' | dd of="$obj_file" bs=1 seek="$2" Some other tests such as t4205 also rely on "printf" being binary-safe. Phew. > +} > + > +test_expect_success 'setup corrupt repo' ' > + git init bit-error && > + ( > + cd bit-error && > + test_commit content && > + corrupt_byte HEAD:content.t 10 > + ) > +' > + > +test_expect_success 'streaming a corrupt blob fails' ' "fails gracefully", maybe, to be more precise. With or without the two changes suggested above, Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 5/9] add test for streaming corrupt blobs 2013-03-25 21:10 ` Jonathan Nieder @ 2013-03-25 21:26 ` Jeff King 0 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 21:26 UTC (permalink / raw) To: Jonathan Nieder; +Cc: git, Junio C Hamano On Mon, Mar 25, 2013 at 02:10:38PM -0700, Jonathan Nieder wrote: > > +# convert "1234abcd" to ".git/objects/12/34abcd" > > +obj_to_file() { > > + echo "$(git rev-parse --git-dir)/objects/$(git rev-parse "$1" | sed 's,..,&/,')" > > +} > > Maybe this would be clearer in multiple lines? > > commit=$(git rev-parse --verify "$1") && > git_dir=$(git rev-parse --git-dir) && > tail=${commit#??} && > echo "$git_dir/objects/${commit%$tail}/$tail" Yeah, it started as: echo "$1" | sed 's,..,&/,' and kind of got out of hand as it grew features. I'd be fine with your version (though $commit is not right, as it is any object, and in fact the test uses blobs). > > + > > +# Convert byte at offset "$2" of object "$1" into '\0' > > +corrupt_byte() { > > + obj_file=$(obj_to_file "$1") && > > + chmod +w "$obj_file" && > > + printf '\0' | dd of="$obj_file" bs=1 seek="$2" > > Some other tests such as t4205 also rely on "printf" being > binary-safe. Phew. Yeah, I think it should be fine, though the choice of character does not actually matter, as long as it is different from what is currently at that position (for the sake of simplicity, I just determined experimentally that the given object is corrupted with the offset and character I chose). -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
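Jonathan's parameter-expansion variant above can be sketched in isolation; the object id below is the blob from earlier in the thread, used purely as example input:

```shell
# Build the loose-object path fragment "d9/5f3ad..." from an object id
# using only POSIX parameter expansion (no sed subprocess).
sha1=d95f3ad14dee633a758d2e331151e950dd13e4ed
tail=${sha1#??}                     # id minus its first two characters
path="objects/${sha1%$tail}/$tail"
echo "$path"                        # objects/d9/5f3ad...dd13e4ed
```

The sed pipeline in the posted test computes the same string; the expansion form just trades the extra process for two shell substitutions.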
* Re: [PATCH 5/9] add test for streaming corrupt blobs 2013-03-25 20:21 ` [PATCH 5/9] add test for streaming corrupt blobs Jeff King 2013-03-25 21:10 ` Jonathan Nieder @ 2013-03-27 20:27 ` Jeff King 2013-03-27 20:35 ` Junio C Hamano 1 sibling, 1 reply; 60+ messages in thread From: Jeff King @ 2013-03-27 20:27 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Mon, Mar 25, 2013 at 04:21:34PM -0400, Jeff King wrote: > +# Convert byte at offset "$2" of object "$1" into '\0' > +corrupt_byte() { > + obj_file=$(obj_to_file "$1") && > + chmod +w "$obj_file" && > + printf '\0' | dd of="$obj_file" bs=1 seek="$2" > +} Hmm, this last line should probably be: diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index a84deb1..3f87051 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -12,7 +12,7 @@ corrupt_byte() { corrupt_byte() { obj_file=$(obj_to_file "$1") && chmod +w "$obj_file" && - printf '\0' | dd of="$obj_file" bs=1 seek="$2" + printf '\0' | dd of="$obj_file" bs=1 seek="$2" conv=notrunc } test_expect_success 'setup corrupt repo' ' The intent was to change a single byte, not truncate the file (though on the plus side, that truncation is what found the other bugs). -Peff ^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH 5/9] add test for streaming corrupt blobs 2013-03-27 20:27 ` Jeff King @ 2013-03-27 20:35 ` Junio C Hamano 0 siblings, 0 replies; 60+ messages in thread From: Junio C Hamano @ 2013-03-27 20:35 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King <peff@peff.net> writes: > On Mon, Mar 25, 2013 at 04:21:34PM -0400, Jeff King wrote: > >> +# Convert byte at offset "$2" of object "$1" into '\0' >> +corrupt_byte() { >> + obj_file=$(obj_to_file "$1") && >> + chmod +w "$obj_file" && >> + printf '\0' | dd of="$obj_file" bs=1 seek="$2" >> +} > > Hmm, this last line should probably be: > > diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh > index a84deb1..3f87051 100755 > --- a/t/t1060-object-corruption.sh > +++ b/t/t1060-object-corruption.sh > @@ -12,7 +12,7 @@ corrupt_byte() { > corrupt_byte() { > obj_file=$(obj_to_file "$1") && > chmod +w "$obj_file" && > - printf '\0' | dd of="$obj_file" bs=1 seek="$2" > + printf '\0' | dd of="$obj_file" bs=1 seek="$2" conv=notrunc > } > > test_expect_success 'setup corrupt repo' ' > > The intent was to change a single byte, not truncate the file (though on > the plus side, that truncation is what found the other bugs). ;-). Thanks, I missed that. ^ permalink raw reply [flat|nested] 60+ messages in thread
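The truncation that conv=notrunc prevents is easy to see outside of git; this throwaway sketch needs only coreutils (the file name f is arbitrary):

```shell
# dd with conv=notrunc patches one byte in place; without it, dd
# truncates the output file to the seek offset before writing.
printf 'hello world' >f                        # 11 bytes
printf 'X' | dd of=f bs=1 seek=4 conv=notrunc 2>/dev/null
after_patch=$(wc -c <f)                        # still 11 bytes
printf 'X' | dd of=f bs=1 seek=4 2>/dev/null
after_trunc=$(wc -c <f)                        # now 5 bytes: "hellX"
```

Which is exactly why the corrupted test object in the original patch came out shorter than intended, and why that truncation happened to trip the streaming bugs.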
* [PATCH 6/9] streaming_write_entry: propagate streaming errors 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King ` (4 preceding siblings ...) 2013-03-25 20:21 ` [PATCH 5/9] add test for streaming corrupt blobs Jeff King @ 2013-03-25 20:22 ` Jeff King 2013-03-25 21:35 ` Eric Sunshine 2013-03-25 21:39 ` Jonathan Nieder 2013-03-25 20:22 ` [PATCH 7/9] add tests for cloning corrupted repositories Jeff King ` (2 subsequent siblings) 8 siblings, 2 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:22 UTC (permalink / raw) To: git; +Cc: Junio C Hamano When we are streaming an index blob to disk, we store the error from stream_blob_to_fd in the "result" variable, and then immediately overwrite that with the return value of "close". That means we catch errors on close (e.g., problems committing the file to disk), but miss anything which happened before then. Signed-off-by: Jeff King <peff@peff.net> --- entry.c | 6 ++++-- t/t1060-object-corruption.sh | 25 +++++++++++++++++++++++++ 2 files changed, 29 insertions(+), 2 deletions(-) diff --git a/entry.c b/entry.c index 17a6bcc..002b2f2 100644 --- a/entry.c +++ b/entry.c @@ -126,8 +126,10 @@ static int streaming_write_entry(struct cache_entry *ce, char *path, fd = open_output_fd(path, ce, to_tempfile); if (0 <= fd) { result = stream_blob_to_fd(fd, ce->sha1, filter, 1); - *fstat_done = fstat_output(fd, state, statbuf); - result = close(fd); + if (!result) { + *fstat_done = fstat_output(fd, state, statbuf); + result = close(fd); + } } if (result && 0 <= fd) unlink(path); diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index d36994a..0792132 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -24,6 +24,15 @@ test_expect_success 'setup corrupt repo' ' ) ' +test_expect_success 'setup repo with missing object' ' + git init missing && + ( + cd missing && + test_commit content && + rm -f "$(obj_to_file HEAD:content.t)" + ) +' + test_expect_success 
'streaming a corrupt blob fails' ' ( cd bit-error && @@ -31,4 +40,20 @@ test_expect_success 'streaming a corrupt blob fails' ' ) ' +test_expect_success 'read-tree -u detects bit-errors in blobs' ' + ( + cd bit-error && + rm content.t && + test_must_fail git read-tree --reset -u FETCH_HEAD + ) +' + +test_expect_success 'read-tree -u detects missing objects' ' + ( + cd missing && + rm content.t && + test_must_fail git read-tree --reset -u FETCH_HEAD + ) +' + test_done -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH 6/9] streaming_write_entry: propagate streaming errors 2013-03-25 20:22 ` [PATCH 6/9] streaming_write_entry: propagate streaming errors Jeff King @ 2013-03-25 21:35 ` Eric Sunshine 2013-03-25 21:37 ` Jeff King 2013-03-25 21:39 ` Jonathan Nieder 1 sibling, 1 reply; 60+ messages in thread From: Eric Sunshine @ 2013-03-25 21:35 UTC (permalink / raw) To: Jeff King; +Cc: Git List, Junio C Hamano On Mon, Mar 25, 2013 at 4:22 PM, Jeff King <peff@peff.net> wrote: > diff --git a/entry.c b/entry.c > index 17a6bcc..002b2f2 100644 > --- a/entry.c > +++ b/entry.c > @@ -126,8 +126,10 @@ static int streaming_write_entry(struct cache_entry *ce, char *path, > fd = open_output_fd(path, ce, to_tempfile); > if (0 <= fd) { > result = stream_blob_to_fd(fd, ce->sha1, filter, 1); > - *fstat_done = fstat_output(fd, state, statbuf); > - result = close(fd); > + if (!result) { > + *fstat_done = fstat_output(fd, state, statbuf); > + result = close(fd); > + } Is this intentionally leaking the opened 'fd' when stream_blob_to_fd() returns an error? > } > if (result && 0 <= fd) > unlink(path); Won't the unlink() now fail on Windows since 'fd' is still open? -- ES ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 6/9] streaming_write_entry: propagate streaming errors 2013-03-25 21:35 ` Eric Sunshine @ 2013-03-25 21:37 ` Jeff King 0 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 21:37 UTC (permalink / raw) To: Eric Sunshine; +Cc: Git List, Junio C Hamano On Mon, Mar 25, 2013 at 05:35:51PM -0400, Eric Sunshine wrote: > On Mon, Mar 25, 2013 at 4:22 PM, Jeff King <peff@peff.net> wrote: > > diff --git a/entry.c b/entry.c > > index 17a6bcc..002b2f2 100644 > > --- a/entry.c > > +++ b/entry.c > > @@ -126,8 +126,10 @@ static int streaming_write_entry(struct cache_entry *ce, char *path, > > fd = open_output_fd(path, ce, to_tempfile); > > if (0 <= fd) { > > result = stream_blob_to_fd(fd, ce->sha1, filter, 1); > > - *fstat_done = fstat_output(fd, state, statbuf); > > - result = close(fd); > > + if (!result) { > > + *fstat_done = fstat_output(fd, state, statbuf); > > + result = close(fd); > > + } > > Is this intentionally leaking the opened 'fd' when stream_blob_to_fd() > returns an error? Good catch. I was so focused on making sure we still called unlink that I forgot about the cleanup side-effect of close. I'll re-roll it. -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 6/9] streaming_write_entry: propagate streaming errors 2013-03-25 20:22 ` [PATCH 6/9] streaming_write_entry: propagate streaming errors Jeff King 2013-03-25 21:35 ` Eric Sunshine @ 2013-03-25 21:39 ` Jonathan Nieder 2013-03-25 21:49 ` [PATCH v2 " Jeff King 1 sibling, 1 reply; 60+ messages in thread From: Jonathan Nieder @ 2013-03-25 21:39 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano Jeff King wrote: > When we are streaming an index blob to disk, we store the > error from stream_blob_to_fd in the "result" variable, and > then immediately overwrite that with the return value of > "close". Good catch. [...] > --- a/entry.c > +++ b/entry.c > @@ -126,8 +126,10 @@ static int streaming_write_entry(struct cache_entry *ce, char *path, > fd = open_output_fd(path, ce, to_tempfile); > if (0 <= fd) { > result = stream_blob_to_fd(fd, ce->sha1, filter, 1); > - *fstat_done = fstat_output(fd, state, statbuf); > - result = close(fd); > + if (!result) { > + *fstat_done = fstat_output(fd, state, statbuf); > + result = close(fd); > + } Should this do something like { int fd, result = 0; fd = open_output_fd(path, ce, to_tempfile); if (fd < 0) return -1; result = stream_blob_to_fd(fd, ce->sha1, filter, 1); if (result) goto close_fd; *fstat_done = fstat_output(fd, state, statbuf); close_fd: result |= close(fd); unlink_path: if (result) unlink(path); return result; } to avoid leaking the file descriptor? > @@ -31,4 +40,20 @@ test_expect_success 'streaming a corrupt blob fails' ' > ) > ' > > +test_expect_success 'read-tree -u detects bit-errors in blobs' ' > + ( > + cd bit-error && > + rm content.t && > + test_must_fail git read-tree --reset -u FETCH_HEAD > + ) Makes sense. Might make sense to use "rm -f" instead of "rm" to avoid failures if content.t is removed already. > +' > + > +test_expect_success 'read-tree -u detects missing objects' ' > + ( > + cd missing && > + rm content.t && Especially here. 
Thanks, Jonathan ^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH v2 6/9] streaming_write_entry: propagate streaming errors 2013-03-25 21:39 ` Jonathan Nieder @ 2013-03-25 21:49 ` Jeff King 2013-03-25 23:29 ` Jonathan Nieder 2013-03-26 21:38 ` Junio C Hamano 0 siblings, 2 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 21:49 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Eric Sunshine, git, Junio C Hamano On Mon, Mar 25, 2013 at 02:39:34PM -0700, Jonathan Nieder wrote: > > --- a/entry.c > > +++ b/entry.c > > @@ -126,8 +126,10 @@ static int streaming_write_entry(struct cache_entry *ce, char *path, > > fd = open_output_fd(path, ce, to_tempfile); > > if (0 <= fd) { > > result = stream_blob_to_fd(fd, ce->sha1, filter, 1); > > - *fstat_done = fstat_output(fd, state, statbuf); > > - result = close(fd); > > + if (!result) { > > + *fstat_done = fstat_output(fd, state, statbuf); > > + result = close(fd); > > + } > > Should this do something like > [...] > to avoid leaking the file descriptor? Yes, Eric Sunshine noticed this, too. Re-rolled patch is below, which I think is even a little cleaner. > > +test_expect_success 'read-tree -u detects bit-errors in blobs' ' > > + ( > > + cd bit-error && > > + rm content.t && > > + test_must_fail git read-tree --reset -u FETCH_HEAD > > + ) > > Makes sense. Might make sense to use "rm -f" instead of "rm" to avoid > failures if content.t is removed already. Yeah, good point. My original test looked like: git init bit-error && git fetch .. && corrupt ... test_must_fail ... but I ended up refactoring it to re-use the corrupted directories, and added the "rm" after the fact. The use of FETCH_HEAD is also bogus (read-tree is failing, but because we are giving it a bogus ref, not because of the corruption, so we are not actually testing anything anymore, even though it still passes). Both fixed in my re-roll. 
-- >8 -- Subject: [PATCH] streaming_write_entry: propagate streaming errors When we are streaming an index blob to disk, we store the error from stream_blob_to_fd in the "result" variable, and then immediately overwrite that with the return value of "close". That means we catch errors on close (e.g., problems committing the file to disk), but miss anything which happened before then. We can fix this by using bitwise-OR to accumulate errors in our result variable. While we're here, we can also simplify the error handling with an early return, which makes it easier to see under which circumstances we need to clean up. Signed-off-by: Jeff King <peff@peff.net> --- entry.c | 16 +++++++++------- t/t1060-object-corruption.sh | 25 +++++++++++++++++++++++++ 2 files changed, 34 insertions(+), 7 deletions(-) diff --git a/entry.c b/entry.c index 17a6bcc..a20bcbc 100644 --- a/entry.c +++ b/entry.c @@ -120,16 +120,18 @@ static int streaming_write_entry(struct cache_entry *ce, char *path, const struct checkout *state, int to_tempfile, int *fstat_done, struct stat *statbuf) { - int result = -1; + int result = 0; int fd; fd = open_output_fd(path, ce, to_tempfile); - if (0 <= fd) { - result = stream_blob_to_fd(fd, ce->sha1, filter, 1); - *fstat_done = fstat_output(fd, state, statbuf); - result = close(fd); - } - if (result && 0 <= fd) + if (fd < 0) + return -1; + + result |= stream_blob_to_fd(fd, ce->sha1, filter, 1); + *fstat_done = fstat_output(fd, state, statbuf); + result |= close(fd); + + if (result) unlink(path); return result; } diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index d36994a..2945395 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -24,6 +24,15 @@ test_expect_success 'setup corrupt repo' ' ) ' +test_expect_success 'setup repo with missing object' ' + git init missing && + ( + cd missing && + test_commit content && + rm -f "$(obj_to_file HEAD:content.t)" + ) +' + test_expect_success 'streaming a corrupt 
blob fails' ' ( cd bit-error && @@ -31,4 +40,20 @@ test_expect_success 'streaming a corrupt blob fails' ' ) ' +test_expect_success 'read-tree -u detects bit-errors in blobs' ' + ( + cd bit-error && + rm -f content.t && + test_must_fail git read-tree --reset -u HEAD + ) +' + +test_expect_success 'read-tree -u detects missing objects' ' + ( + cd missing && + rm -f content.t && + test_must_fail git read-tree --reset -u HEAD + ) +' + test_done -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH v2 6/9] streaming_write_entry: propagate streaming errors 2013-03-25 21:49 ` [PATCH v2 " Jeff King @ 2013-03-25 23:29 ` Jonathan Nieder 2013-03-26 21:38 ` Junio C Hamano 1 sibling, 0 replies; 60+ messages in thread From: Jonathan Nieder @ 2013-03-25 23:29 UTC (permalink / raw) To: Jeff King; +Cc: Eric Sunshine, git, Junio C Hamano Jeff King wrote: > Both fixed in my re-roll. Thanks! This and the rest of the patches up to and including patch 8 look good to me. I haven't decided what to think about patch 9 yet, but I suspect it would be good, too. In the long term I suspect "git clone --worktree-only <repo>" (or some other standard interface for git-new-workdir functionality) is a better way to provide a convenient lightweight same-machine clone anyway. Jonathan ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH v2 6/9] streaming_write_entry: propagate streaming errors 2013-03-25 21:49 ` [PATCH v2 " Jeff King 2013-03-25 23:29 ` Jonathan Nieder @ 2013-03-26 21:38 ` Junio C Hamano 1 sibling, 0 replies; 60+ messages in thread From: Junio C Hamano @ 2013-03-26 21:38 UTC (permalink / raw) To: Jeff King; +Cc: Jonathan Nieder, Eric Sunshine, git Jeff King <peff@peff.net> writes: > Subject: [PATCH] streaming_write_entry: propagate streaming errors > > When we are streaming an index blob to disk, we store the > error from stream_blob_to_fd in the "result" variable, and > then immediately overwrite that with the return value of > "close". That means we catch errors on close (e.g., problems > committing the file to disk), but miss anything which > happened before then. > > We can fix this by using bitwise-OR to accumulate errors in > our result variable. > > While we're here, we can also simplify the error handling > with an early return, which makes it easier to see under > which circumstances we need to clean up. > > Signed-off-by: Jeff King <peff@peff.net> Very sensible. Thanks. ^ permalink raw reply [flat|nested] 60+ messages in thread
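The shape of the fixed function, early return when nothing was opened, OR-accumulated result, unconditional cleanup on error, translates to shell as well; write_entry and its argument here are hypothetical stand-ins, with the stream step forced to fail for the demonstration:

```shell
# Sketch of the fixed control flow: accumulate every failure in
# "result" and remove the partial file whenever anything went wrong.
write_entry() {
    out=$1 result=0
    : >"$out" || return 1      # "open": bail early, nothing to clean up
    false || result=1          # "stream": forced failure for the demo
    # (a real fstat/close step would OR its status in the same way)
    [ "$result" -ne 0 ] && rm -f "$out"   # never leave a partial file
    return "$result"
}
```

Calling write_entry on a scratch path returns nonzero and leaves no file behind, matching the unlink-on-error guarantee the C version preserves.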
* [PATCH 7/9] add tests for cloning corrupted repositories 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King ` (5 preceding siblings ...) 2013-03-25 20:22 ` [PATCH 6/9] streaming_write_entry: propagate streaming errors Jeff King @ 2013-03-25 20:22 ` Jeff King 2013-03-25 20:23 ` [PATCH 8/9] clone: die on errors from unpack_trees Jeff King 2013-03-25 20:26 ` [PATCH 9/9] clone: run check_everything_connected Jeff King 8 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:22 UTC (permalink / raw) To: git; +Cc: Junio C Hamano We try not to let corruption pass unnoticed over fetches and clones. For the most part, this works, but there are some broken corner cases, including: 1. We do not detect missing objects over git-aware transports. This is a little hard to test, because the sending side will actually complain about the missing object. To fool it, we corrupt a repository such that we have a "misnamed" object: it claims to be sha1 X, but is really Y. This lets the sender blindly transmit it, but it is the receiver's responsibility to verify that what it got is sane (and it does not). 2. We do not detect missing or misnamed blobs during the checkout phase of clone. 
Signed-off-by: Jeff King <peff@peff.net> --- t/t1060-object-corruption.sh | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+) diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index 0792132..eb285c0 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -33,6 +33,19 @@ test_expect_success 'setup repo with missing object' ' ) ' +test_expect_success 'setup repo with misnamed object' ' + git init misnamed && + ( + cd misnamed && + test_commit content && + good=$(obj_to_file HEAD:content.t) && + blob=$(echo corrupt | git hash-object -w --stdin) && + bad=$(obj_to_file $blob) && + rm -f "$good" && + mv "$bad" "$good" + ) +' + test_expect_success 'streaming a corrupt blob fails' ' ( cd bit-error && @@ -56,4 +69,32 @@ test_expect_success 'read-tree -u detects missing objects' ' ) ' +# We use --bare to make sure that the transport detects it, not the checkout +# phase. +test_expect_success 'clone --no-local --bare detects corruption' ' + test_must_fail git clone --no-local --bare bit-error corrupt-transport +' + +test_expect_success 'clone --no-local --bare detects missing object' ' + test_must_fail git clone --no-local --bare missing missing-transport +' + +test_expect_failure 'clone --no-local --bare detects misnamed object' ' + test_must_fail git clone --no-local --bare misnamed misnamed-transport +' + +# We do not expect --local to detect corruption at the transport layer, +# so we are really checking the checkout() code path. 
+test_expect_success 'clone --local detects corruption' ' + test_must_fail git clone --local bit-error corrupt-checkout +' + +test_expect_failure 'clone --local detects missing objects' ' + test_must_fail git clone --local missing missing-checkout +' + +test_expect_failure 'clone --local detects misnamed objects' ' + test_must_fail git clone --local misnamed misnamed-checkout +' + test_done -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
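The "misnamed object" trick in the setup above works because a git object's name is just the SHA-1 of its contents (header plus body): a receiver that recomputes the hash catches the mismatch, while one that trusts the filename does not. A sketch of that verification in Python, using the loose-object header format and the blob id reported in the corruption messages earlier in the thread:

```python
import hashlib

def git_blob_id(data: bytes) -> str:
    """A blob's object name is the SHA-1 of 'blob <size>\\0' + contents."""
    return hashlib.sha1(b"blob %d\x00" % len(data) + data).hexdigest()

# The id seen in the corruption errors above: the blob written by
# "echo content >file".
good = git_blob_id(b"content\n")
assert good == "d95f3ad14dee633a758d2e331151e950dd13e4ed"

# A tiny object store keyed by claimed id.
store = {good: b"content\n"}

def verify(oid: str) -> bool:
    """Recompute the id from the bytes; what a careful receiver must do."""
    return git_blob_id(store[oid]) == oid

assert verify(good)

# Simulate the misnamed-object corruption: the entry for `good` now holds
# the bytes of a different object (it claims to be sha1 X, but is really Y).
store[good] = b"corrupt\n"
assert not verify(good)
```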
* [PATCH 8/9] clone: die on errors from unpack_trees 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King ` (6 preceding siblings ...) 2013-03-25 20:22 ` [PATCH 7/9] add tests for cloning corrupted repositories Jeff King @ 2013-03-25 20:23 ` Jeff King 2013-03-26 21:40 ` Junio C Hamano 2013-03-25 20:26 ` [PATCH 9/9] clone: run check_everything_connected Jeff King 8 siblings, 1 reply; 60+ messages in thread From: Jeff King @ 2013-03-25 20:23 UTC (permalink / raw) To: git; +Cc: Junio C Hamano When clone is populating the working tree, it ignores the return status from unpack_trees; this means we may report a successful clone, even when the checkout fails. When checkout fails, we may want to leave the $GIT_DIR in place, as it might be possible to recover the data through further use of "git checkout" (e.g., if the checkout failed due to a transient error, disk full, etc). However, we already die on a number of other checkout-related errors, so this patch follows that pattern. In addition to marking a now-passing test, we need to adjust t5710, which blindly assumed it could make bogus clones of very deep alternates hierarchies. By using "--bare", we can avoid it actually touching any objects. Signed-off-by: Jeff King <peff@peff.net> --- I think the "leave the data behind" fix may be to just set "junk_pid = 0" a little sooner in cmd_clone (i.e., before checkout()). Then we would still die, but at least leave the fetched objects intact. 
builtin/clone.c | 3 ++- t/t1060-object-corruption.sh | 2 +- t/t5710-info-alternate.sh | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index e0aaf13..7d48ef3 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -579,7 +579,8 @@ static int checkout(void) tree = parse_tree_indirect(sha1); parse_tree(tree); init_tree_desc(&t, tree->buffer, tree->size); - unpack_trees(1, &t, &opts); + if (unpack_trees(1, &t, &opts) < 0) + die(_("unable to checkout working tree")); if (write_cache(fd, active_cache, active_nr) || commit_locked_index(lock_file)) diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index eb285c0..05ba4e7 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -89,7 +89,7 @@ test_expect_success 'clone --local detects corruption' ' test_must_fail git clone --local bit-error corrupt-checkout ' -test_expect_failure 'clone --local detects missing objects' ' +test_expect_success 'clone --local detects missing objects' ' test_must_fail git clone --local missing missing-checkout ' diff --git a/t/t5710-info-alternate.sh b/t/t5710-info-alternate.sh index aa04529..5a6e49d 100755 --- a/t/t5710-info-alternate.sh +++ b/t/t5710-info-alternate.sh @@ -58,7 +58,7 @@ git clone -l -s F G && git clone -l -s D E && git clone -l -s E F && git clone -l -s F G && -git clone -l -s G H' +git clone --bare -l -s G H' test_expect_success 'invalidity of deepest repository' \ 'cd H && { -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH 8/9] clone: die on errors from unpack_trees 2013-03-25 20:23 ` [PATCH 8/9] clone: die on errors from unpack_trees Jeff King @ 2013-03-26 21:40 ` Junio C Hamano 2013-03-26 22:22 ` [PATCH 10/9] clone: leave repo in place after checkout errors Jeff King 0 siblings, 1 reply; 60+ messages in thread From: Junio C Hamano @ 2013-03-26 21:40 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King <peff@peff.net> writes: > When clone is populating the working tree, it ignores the > return status from unpack_trees; this means we may report a > successful clone, even when the checkout fails. > > When checkout fails, we may want to leave the $GIT_DIR in > place, as it might be possible to recover the data through > further use of "git checkout" (e.g., if the checkout failed > due to a transient error, disk full, etc). However, we > already die on a number of other checkout-related errors, so > this patch follows that pattern. > > In addition to marking a now-passing test, we need to adjust > t5710, which blindly assumed it could make bogus clones of > very deep alternates hierarchies. By using "--bare", we can > avoid it actually touching any objects. > > Signed-off-by: Jeff King <peff@peff.net> > --- Thanks. > I think the "leave the data behind" fix may be to just set "junk_pid = > 0" a little sooner in cmd_clone (i.e., before checkout()). Then we > would still die, but at least leave the fetched objects intact. Yeah, perhaps, but I agree that is a much lower priority change. ^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH 10/9] clone: leave repo in place after checkout errors 2013-03-26 21:40 ` Junio C Hamano @ 2013-03-26 22:22 ` Jeff King 2013-03-26 22:32 ` Jonathan Nieder 0 siblings, 1 reply; 60+ messages in thread From: Jeff King @ 2013-03-26 22:22 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Tue, Mar 26, 2013 at 02:40:57PM -0700, Junio C Hamano wrote: > > I think the "leave the data behind" fix may be to just set "junk_pid = > > 0" a little sooner in cmd_clone (i.e., before checkout()). Then we > > would still die, but at least leave the fetched objects intact. > > Yeah, perhaps, but I agree that is a much lower priority change. As it turns out, the checkout() error path sometimes _already_ leaves the repository intact, but it's due to a bug. And it ends up deleting something random instead. :) I agree it's not a high priority, but I think it makes sense while we're in the area. And while it's very unlikely that the deletion would be disastrous (see below), it makes me nervous. Patch is below. -- >8 -- Subject: [PATCH] clone: leave repo in place after checkout errors If we manage to clone a remote repository but run into an error in the checkout, it is probably sane to leave the repo directory in place. That lets the user examine the situation without spending time to re-clone from the remote (which may be a lengthy process). Rather than try to convert each die() from the checkout code path into an error(), we simply set a flag that tells the "remove_junk" atexit function to print a helpful message and leave the repo in place. Note that the test added in this patch actually passes without the code change. The reason is that the cleanup code is buggy; we chdir into the working tree for the checkout, but still may use relative paths to remove the directories (which means if you cloned into "foo", we would accidentally remove "foo" from the working tree!). 
There's no point in fixing it now, since this patch means we will never try to remove anything after the chdir, anyway. Signed-off-by: Jeff King <peff@peff.net> --- I think the accidental deletion could also escape the repository if you did something like: git clone $remote ../../foo which would delete ../../foo/../../foo, or ../../../foo, which is not related to what you just cloned. But I didn't test, and we don't have to care anymore after this patch. builtin/clone.c | 33 ++++++++++++++++++++++++++++++++- t/t1060-object-corruption.sh | 4 ++++ 2 files changed, 36 insertions(+), 1 deletion(-) diff --git a/builtin/clone.c b/builtin/clone.c index eceaa74..e145dfc 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -377,10 +377,40 @@ static void remove_junk(void) static const char *junk_work_tree; static const char *junk_git_dir; static pid_t junk_pid; +enum { + JUNK_LEAVE_NONE, + JUNK_LEAVE_REPO, + JUNK_LEAVE_ALL +} junk_mode = JUNK_LEAVE_NONE; + +static const char junk_leave_repo_msg[] = +N_("The remote repository was cloned successfully, but there was\n" + "an error checking out the HEAD branch. The repository has been left in\n" + "place but the working tree may be in an inconsistent state. 
You can\n" + "can inspect the contents with:\n" + "\n" + " git status\n" + "\n" + "and retry the checkout with\n" + "\n" + " git checkout -f HEAD\n" + "\n"); static void remove_junk(void) { struct strbuf sb = STRBUF_INIT; + + switch (junk_mode) { + case JUNK_LEAVE_REPO: + warning("%s", _(junk_leave_repo_msg)); + /* fall-through */ + case JUNK_LEAVE_ALL: + return; + default: + /* proceed to removal */ + break; + } + if (getpid() != junk_pid) return; if (junk_git_dir) { @@ -925,12 +955,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix) transport_unlock_pack(transport); transport_disconnect(transport); + junk_mode = JUNK_LEAVE_REPO; err = checkout(); strbuf_release(&reflog_msg); strbuf_release(&branch_top); strbuf_release(&key); strbuf_release(&value); - junk_pid = 0; + junk_mode = JUNK_LEAVE_ALL; return err; } diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index a405b70..a84deb1 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -89,6 +89,10 @@ test_expect_success 'clone --local detects corruption' ' test_must_fail git clone --local bit-error corrupt-checkout ' +test_expect_success 'error detected during checkout leaves repo intact' ' + test_path_is_dir corrupt-checkout/.git +' + test_expect_success 'clone --local detects missing objects' ' test_must_fail git clone --local missing missing-checkout ' -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
* Re: [PATCH 10/9] clone: leave repo in place after checkout errors 2013-03-26 22:22 ` [PATCH 10/9] clone: leave repo in place after checkout errors Jeff King @ 2013-03-26 22:32 ` Jonathan Nieder 2013-03-27 1:03 ` Jeff King 0 siblings, 1 reply; 60+ messages in thread From: Jonathan Nieder @ 2013-03-26 22:32 UTC (permalink / raw) To: Jeff King; +Cc: Junio C Hamano, git Jeff King wrote: > --- a/builtin/clone.c > +++ b/builtin/clone.c > @@ -377,10 +377,40 @@ static void remove_junk(void) > static const char *junk_work_tree; > static const char *junk_git_dir; > static pid_t junk_pid; > +enum { > + JUNK_LEAVE_NONE, > + JUNK_LEAVE_REPO, > + JUNK_LEAVE_ALL > +} junk_mode = JUNK_LEAVE_NONE; Neat. > + > +static const char junk_leave_repo_msg[] = > +N_("The remote repository was cloned successfully, but there was\n" > + "an error checking out the HEAD branch. The repository has been left in\n" > + "place but the working tree may be in an inconsistent state. You\n" > + "can inspect the contents with:\n" > + "\n" > + " git status\n" > + "\n" > + "and retry the checkout with\n" > + "\n" > + " git checkout -f HEAD\n" > + "\n"); Can this be made more precise? I don't know what it means for the working tree to be in an inconsistent state: do you mean that some files might be partially checked out or not have been checked out at all yet? error: Clone succeeded, but checkout failed. hint: You can inspect what was checked out with "git status". hint: To retry the checkout, run "git checkout -f HEAD". Aside from that, this looks very nice. Thanks, Jonathan ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 10/9] clone: leave repo in place after checkout errors 2013-03-26 22:32 ` Jonathan Nieder @ 2013-03-27 1:03 ` Jeff King 0 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-27 1:03 UTC (permalink / raw) To: Jonathan Nieder; +Cc: Junio C Hamano, git On Tue, Mar 26, 2013 at 03:32:59PM -0700, Jonathan Nieder wrote: > > +static const char junk_leave_repo_msg[] = > > +N_("The remote repository was cloned successfully, but there was\n" > > + "an error checking out the HEAD branch. The repository has been left in\n" > > + "place but the working tree may be in an inconsistent state. You\n" > > + "can inspect the contents with:\n" > > + "\n" > > + " git status\n" > > + "\n" > > + "and retry the checkout with\n" > > + "\n" > > + " git checkout -f HEAD\n" > > + "\n"); > > Can this be made more precise? I don't know what it means for the > working tree to be in an inconsistent state: do you mean that some files > might be partially checked out or not have been checked out at all yet? It means that we died during the checkout procedure, and we don't have any idea what was left. Maybe something, maybe nothing. Maybe an index, maybe not. > error: Clone succeeded, but checkout failed. > hint: You can inspect what was checked out with "git status". > hint: To retry the checkout, run "git checkout -f HEAD". That is certainly more succinct, if not more precise. I'd be fine with it. -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
* [PATCH 9/9] clone: run check_everything_connected 2013-03-25 20:14 ` [PATCH 0/9] corrupt object potpourri Jeff King ` (7 preceding siblings ...) 2013-03-25 20:23 ` [PATCH 8/9] clone: die on errors from unpack_trees Jeff King @ 2013-03-25 20:26 ` Jeff King 2013-03-26 0:53 ` Duy Nguyen ` (2 more replies) 8 siblings, 3 replies; 60+ messages in thread From: Jeff King @ 2013-03-25 20:26 UTC (permalink / raw) To: git; +Cc: Junio C Hamano When we fetch from a remote, we do a revision walk to make sure that what we received is connected to our existing history. We do not do the same check for clone, which should be able to check that we received an intact history graph. The upside of this patch is that it will make clone more resilient against propagating repository corruption. The downside is that we will now traverse "rev-list --objects --all" down to the roots, which may take some time (it is especially noticeable for a "--local --bare" clone). Note that we need to adjust t5710, which tries to make such a bogus clone. Rather than checking after the fact that our clone is bogus, we can simplify it to just make sure "git clone" reports failure. Signed-off-by: Jeff King <peff@peff.net> --- The slowdown is really quite terrible if you try "git clone --bare linux-2.6.git". Even with this, the local-clone case already misses blob corruption. So it probably makes sense to restrict it to just the non-local clone case, which already has to do more work. Even still, it adds a non-trivial amount of work (linux-2.6 takes something like a minute to check). I don't like the idea of declaring "git clone" non-safe unless you turn on transfer.fsckObjects, though. It should have the same safety as "git fetch". 
builtin/clone.c | 26 ++++++++++++++++++++++++++ t/t1060-object-corruption.sh | 2 +- t/t5710-info-alternate.sh | 8 +------- 3 files changed, 28 insertions(+), 8 deletions(-) diff --git a/builtin/clone.c b/builtin/clone.c index 7d48ef3..eceaa74 100644 --- a/builtin/clone.c +++ b/builtin/clone.c @@ -23,6 +23,7 @@ #include "branch.h" #include "remote.h" #include "run-command.h" +#include "connected.h" /* * Overall FIXMEs: @@ -485,12 +486,37 @@ static void update_remote_refs(const struct ref *refs, } } +static int iterate_ref_map(void *cb_data, unsigned char sha1[20]) +{ + struct ref **rm = cb_data; + struct ref *ref = *rm; + + /* + * Skip anything missing a peer_ref, which we are not + * actually going to write a ref for. + */ + while (ref && !ref->peer_ref) + ref = ref->next; + /* Returning -1 notes "end of list" to the caller. */ + if (!ref) + return -1; + + hashcpy(sha1, ref->old_sha1); + *rm = ref->next; + return 0; +} + static void update_remote_refs(const struct ref *refs, const struct ref *mapped_refs, const struct ref *remote_head_points_at, const char *branch_top, const char *msg) { + const struct ref *rm = mapped_refs; + + if (check_everything_connected(iterate_ref_map, 0, &rm)) + die(_("remote did not send all necessary objects")); + if (refs) { write_remote_refs(mapped_refs); if (option_single_branch) diff --git a/t/t1060-object-corruption.sh b/t/t1060-object-corruption.sh index 05ba4e7..fd314ef 100755 --- a/t/t1060-object-corruption.sh +++ b/t/t1060-object-corruption.sh @@ -79,7 +79,7 @@ test_expect_success 'clone --no-local --bare detects missing object' ' test_must_fail git clone --no-local --bare missing missing-transport ' -test_expect_failure 'clone --no-local --bare detects misnamed object' ' +test_expect_success 'clone --no-local --bare detects misnamed object' ' test_must_fail git clone --no-local --bare misnamed misnamed-transport ' diff --git a/t/t5710-info-alternate.sh b/t/t5710-info-alternate.sh index 5a6e49d..8956c21 100755 --- 
a/t/t5710-info-alternate.sh +++ b/t/t5710-info-alternate.sh @@ -58,13 +58,7 @@ git clone -l -s F G && git clone -l -s D E && git clone -l -s E F && git clone -l -s F G && -git clone --bare -l -s G H' - -test_expect_success 'invalidity of deepest repository' \ -'cd H && { - test_valid_repo - test $? -ne 0 -}' +test_must_fail git clone --bare -l -s G H' cd "$base_dir" -- 1.8.2.13.g0f18d3c ^ permalink raw reply related [flat|nested] 60+ messages in thread
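The iterate_ref_map() callback in the patch above is essentially a filtered walk over the ref list: skip entries with no peer_ref (refs we will not write locally), hand back each remaining tip's sha1, and signal end-of-list. In Python that collapses to a generator (a sketch; the Ref type and the abbreviated ids are stand-ins, not git's data structures):

```python
from typing import Iterator, List, NamedTuple, Optional

class Ref(NamedTuple):
    name: str
    old_sha1: str
    peer_ref: Optional[str]   # local name this remote ref maps to, if any

def iterate_ref_map(refs: List[Ref]) -> Iterator[str]:
    """Yield the sha1 of every ref that will actually be written locally,
    mirroring the skip-if-no-peer_ref loop in the builtin/clone.c patch."""
    for ref in refs:
        if ref.peer_ref is None:
            continue              # not writing a ref for this one: skip it
        yield ref.old_sha1        # the C version copies this via hashcpy()

refs = [
    Ref("refs/heads/master", "d95f3ad", "refs/remotes/origin/master"),
    Ref("HEAD", "d95f3ad", None),               # no peer_ref: skipped
    Ref("refs/heads/next", "abc1234", "refs/remotes/origin/next"),
]
tips = list(iterate_ref_map(refs))
assert tips == ["d95f3ad", "abc1234"]
```

Exhausting the generator plays the role of the C callback returning -1 for "end of list".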
* Re: [PATCH 9/9] clone: run check_everything_connected 2013-03-25 20:26 ` [PATCH 9/9] clone: run check_everything_connected Jeff King @ 2013-03-26 0:53 ` Duy Nguyen 2013-03-26 22:24 ` Jeff King 2013-03-26 21:50 ` Junio C Hamano 2013-03-28 0:40 ` Duy Nguyen 2 siblings, 1 reply; 60+ messages in thread From: Duy Nguyen @ 2013-03-26 0:53 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano On Tue, Mar 26, 2013 at 3:26 AM, Jeff King <peff@peff.net> wrote: > static void update_remote_refs(const struct ref *refs, > const struct ref *mapped_refs, > const struct ref *remote_head_points_at, > const char *branch_top, > const char *msg) > { > + const struct ref *rm = mapped_refs; > + > + if (check_everything_connected(iterate_ref_map, 0, &rm)) > + die(_("remote did not send all necessary objects")); > + > if (refs) { > write_remote_refs(mapped_refs); > if (option_single_branch) Maybe move this after checkout, so that I can switch terminal and start working while it's verifying? And maybe an option not to check_everything_connected, instead print a big fat warning telling the user to fsck later? -- Duy ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 9/9] clone: run check_everything_connected 2013-03-26 0:53 ` Duy Nguyen @ 2013-03-26 22:24 ` Jeff King 0 siblings, 0 replies; 60+ messages in thread From: Jeff King @ 2013-03-26 22:24 UTC (permalink / raw) To: Duy Nguyen; +Cc: git, Junio C Hamano On Tue, Mar 26, 2013 at 07:53:42AM +0700, Nguyen Thai Ngoc Duy wrote: > On Tue, Mar 26, 2013 at 3:26 AM, Jeff King <peff@peff.net> wrote: > > static void update_remote_refs(const struct ref *refs, > > const struct ref *mapped_refs, > > const struct ref *remote_head_points_at, > > const char *branch_top, > > const char *msg) > > { > > + const struct ref *rm = mapped_refs; > > + > > + if (check_everything_connected(iterate_ref_map, 0, &rm)) > > + die(_("remote did not send all necessary objects")); > > + > > if (refs) { > > write_remote_refs(mapped_refs); > > if (option_single_branch) > > Maybe move this after checkout, so that I can switch terminal and > start working while it's verifying? And maybe an option not to > check_everything_connected, instead print a big fat warning telling > the user to fsck later? I tried to follow the fetch process of not installing the refs until we had verified that the objects were reasonable. It probably doesn't matter that much for clone, since you would not have simultaneous users expecting the repository to be in a reasonable state until after clone completes, though. We also would have to tweak check_everything_connected, which does something like "--not --all" to avoid rechecking objects we already have. But that is not too hard to do. -Peff ^ permalink raw reply [flat|nested] 60+ messages in thread
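The "--not --all" behavior mentioned above is a reachability walk from the newly fetched tips that prunes at anything already verified, flagging ids that are referenced but never received. A toy version of that walk over an in-memory object graph (the object ids and graph are hypothetical, not git's implementation):

```python
def check_connected(tips, have, objects):
    """objects maps id -> list of referenced ids; `have` is the set of ids
    already verified (the '--not --all' boundary). Returns ids that are
    referenced somewhere but missing entirely."""
    missing, seen, stack = set(), set(have), list(tips)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue              # pruned: already known to be connected
        seen.add(oid)
        if oid not in objects:
            missing.add(oid)      # referenced but never received
            continue
        stack.extend(objects[oid])
    return missing

# New commit C2 points at tree T2, which references blob B1 we never got.
objects = {"C2": ["T2"], "T2": ["B1"]}
assert check_connected(["C2"], have={"C1"}, objects=objects) == {"B1"}
assert check_connected(["C2"], have={"C1", "B1"}, objects=objects) == set()
```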
* Re: [PATCH 9/9] clone: run check_everything_connected 2013-03-25 20:26 ` [PATCH 9/9] clone: run check_everything_connected Jeff King 2013-03-26 0:53 ` Duy Nguyen @ 2013-03-26 21:50 ` Junio C Hamano 2013-03-28 0:40 ` Duy Nguyen 2 siblings, 0 replies; 60+ messages in thread From: Junio C Hamano @ 2013-03-26 21:50 UTC (permalink / raw) To: Jeff King; +Cc: git Jeff King <peff@peff.net> writes: > The slowdown is really quite terrible if you try "git clone --bare > linux-2.6.git". Even with this, the local-clone case already misses blob > corruption. So it probably makes sense to restrict it to just the > non-local clone case, which already has to do more work. Probably. We may want to enable fsck even for local clones in the longer term and also have this check. Those who know their filesystem is trustworthy can do the filesystem-level copy with "cp -R" themselves after all. > Even still, it adds a non-trivial amount of work (linux-2.6 takes > something like a minute to check). I don't like the idea of declaring > "git clone" non-safe unless you turn on transfer.fsckObjects, though. It > should have the same safety as "git fetch". True. ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 9/9] clone: run check_everything_connected 2013-03-25 20:26 ` [PATCH 9/9] clone: run check_everything_connected Jeff King 2013-03-26 0:53 ` Duy Nguyen 2013-03-26 21:50 ` Junio C Hamano @ 2013-03-28 0:40 ` Duy Nguyen 2013-03-31 7:57 ` Duy Nguyen 2 siblings, 1 reply; 60+ messages in thread From: Duy Nguyen @ 2013-03-28 0:40 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano On Tue, Mar 26, 2013 at 3:26 AM, Jeff King <peff@peff.net> wrote: > The slowdown is really quite terrible if you try "git clone --bare > linux-2.6.git". Even with this, the local-clone case already misses blob > corruption. So it probably makes sense to restrict it to just the > non-local clone case, which already has to do more work. > > Even still, it adds a non-trivial amount of work (linux-2.6 takes > something like a minute to check). I don't like the idea of declaring > "git clone" non-safe unless you turn on transfer.fsckObjects, though. It > should have the same safety as "git fetch". Maybe we could do it in index-pack to save some (wall) time. I haven't tried but I think it might work. The problem is to make sure the pack contains objects for all sha1 references in the pack. By that description, we don't need to do standard DAG traversal. We could extract sha-1 references in index-pack as we uncompress objects and put all "want" sha-1 in a hash table. At the end of index-pack, we check if any sha-1 in the hash table still points to non-existing object. This way, at least we don't need to uncompress all objects again in rev-list. We could parse+hash in both phases in index-pack. The first phase (parse_pack_objects) is usually I/O bound, we could hide some cost there. The second phase is multithreaded, all the better. -- Duy ^ permalink raw reply [flat|nested] 60+ messages in thread
* Re: [PATCH 9/9] clone: run check_everything_connected 2013-03-28 0:40 ` Duy Nguyen @ 2013-03-31 7:57 ` Duy Nguyen 0 siblings, 0 replies; 60+ messages in thread From: Duy Nguyen @ 2013-03-31 7:57 UTC (permalink / raw) To: Jeff King; +Cc: git, Junio C Hamano On Thu, Mar 28, 2013 at 07:40:51AM +0700, Duy Nguyen wrote: > Maybe we could do it in index-pack to save some (wall) time. I haven't > tried but I think it might work. The problem is to make sure the pack > contains objects for all sha1 references in the pack. By that > description, we don't need to do standard DAG traversal. We could > extract sha-1 references in index-pack as we uncompress objects and > put all "want" sha-1 in a hash table. At the end of index-pack, we > check if any sha-1 in the hash table still points to non-existing > object. > > This way, at least we don't need to uncompress all objects again in > rev-list. We could parse+hash in both phases in index-pack. The first > phase (parse_pack_objects) is usually I/O bound, we could hide some > cost there. The second phase is multithreaded, all the better. It looks like what I describe above is exactly what index-pack --strict does. Except that it holds the lock longer and has more abstraction layers to slow things down. 
On linux-2.6 with 3 threads: $ rev-list --all --objects --quiet (aka check_everything_connected) 34.26user 0.22system 0:34.56elapsed 99%CPU (0avgtext+0avgdata 2550528maxresident)k 0inputs+0outputs (0major+208569minor)pagefaults 0swaps $ index-pack --stdin 214.57user 8.38system 1:31.82elapsed 242%CPU (0avgtext+0avgdata 1357328maxresident)k 8inputs+1421016outputs (0major+1222537minor)pagefaults 0swaps $ index-pack --stdin --strict 297.36user 13.77system 2:11.82elapsed 236%CPU (0avgtext+0avgdata 1875040maxresident)k 0inputs+1421016outputs (0major+1308718minor)pagefaults 0swaps $ index-pack --stdin --connectivity 231.09user 7.42system 1:37.39elapsed 244%CPU (0avgtext+0avgdata 2080816maxresident)k 0inputs+1421016outputs (0major+540069minor)pagefaults 0swaps The last one does not hold locks by duplicating object hash table per thread. As you can see the consumed memory is much higher than --stdin. In return it only adds up 1/3 of rev-list time. Maybe you should check which one is cheaper for clone case, check_everything_connected() or index-pack --strict. -- Duy ^ permalink raw reply [flat|nested] 60+ messages in thread
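Duy's single-pass idea amounts to: while index-pack uncompresses objects, record every sha1 each object mentions into a "wanted" table, then at the end check that each wanted id exists in the pack or locally, with no revision walk at all. A sketch (function and data names are hypothetical):

```python
def index_pack_connectivity(pack, local):
    """pack maps id -> referenced ids for each object in the incoming pack;
    local is the set of ids already in the repository. Returns any
    referenced-but-missing ids, found in one pass over the pack."""
    wanted = set()
    for refs in pack.values():
        wanted.update(refs)          # gathered as objects are uncompressed
    present = set(pack) | local
    return wanted - present          # anything unresolved is missing

# Commit C2 references parent C1 (already local) and tree T2, which in
# turn references blob B1.
pack = {"C2": ["C1", "T2"], "T2": ["B1"]}
assert index_pack_connectivity(pack, local={"C1"}) == {"B1"}
assert index_pack_connectivity(pack, local={"C1", "B1"}) == set()
```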