git.vger.kernel.org archive mirror
From: Jeff Mitchell <jeffrey.mitchell@gmail.com>
To: Jeff King <peff@peff.net>
Cc: "Duy Nguyen" <pclouds@gmail.com>,
	"Ævar Arnfjörð" <avarab@gmail.com>,
	git@vger.kernel.org
Subject: Re: propagating repo corruption across clone
Date: Mon, 25 Mar 2013 12:32:50 -0400
Message-ID: <CAOx6V3a6vGJvJ4HEmAXdTRKKCzRJS23OYd_em1b3aQLzPNEtQA@mail.gmail.com>
In-Reply-To: <20130325155600.GA18216@sigill.intra.peff.net>

On Mon, Mar 25, 2013 at 11:56 AM, Jeff King <peff@peff.net> wrote:
> On Mon, Mar 25, 2013 at 10:31:04PM +0700, Nguyen Thai Ngoc Duy wrote:
>
>> On Mon, Mar 25, 2013 at 9:56 PM, Jeff King <peff@peff.net> wrote:
>> > There are basically three levels of transport that can be used on a
>> > local machine:
>> >
>> >   1. Hard-linking (very fast, no redundancy).
>> >
>> >   2. Byte-for-byte copy (medium speed, makes a separate copy of the
>> >      data, but does not check the integrity of the original).
>> >
>> >   3. Regular git transport, creating a pack (slowest, but should include
>> >      redundancy checks).
>> >
>> > Using --no-hardlinks turns off (1), but leaves (2) as an option.  I
>> > think the documentation in "git clone" could use some improvement in
>> > that area.
>>
>> Not only git-clone. How git-fetch and git-push verify the new pack
>> should also be documented. I don't think many people outside the
>> contributor circle know what is done (and maybe how) when data is
>> received from outside.
>
> I think it's less of a documentation issue there, though, because they
> _only_ do (3). There is no option to do anything else, so there is
> nothing to warn the user about in terms of tradeoffs.
>
> I agree that in general git's handling of corruption could be documented
> somewhere, but I'm not sure where.

Hi there,

First of all, thanks for the analysis; it's much appreciated. It's
good to know that we weren't totally off-base in thinking that a naive
copy may be out of sync, small as the chance is (certainly we
wouldn't have known the right ordering).

I think what was confusing the issue in my testing is that --mirror
implies --bare, so no working tree is created and the object checks
that run during checkout never happen. Hence --mirror won't show the
error a normal clone will -- it's not a transport question, it's just
a matter of the normal clone doing more and so having more data run
through checks.
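
To make the combinations concrete, here is a rough sketch of the clone
variants under discussion, written as a small Python helper that only
builds the command line. The function name and the "hardlink" / "copy" /
"pack" labels are made up for illustration; the flags themselves
(--local, --no-hardlinks, --no-local, --mirror) are the ones documented
in git-clone(1). Treat the mapping to the three transport levels as my
reading of the list quoted above, not as a statement of what git does
internally.

```python
def clone_argv(src, dst, transport="hardlink", mirror=False):
    """Build (but do not run) a `git clone` command line.

    transport labels are hypothetical names for the three levels:
      "hardlink" -> local clone, objects hard-linked where possible
      "copy"     -> local clone, objects copied byte-for-byte
      "pack"     -> regular git transport even for a local path
    """
    flags = {
        "hardlink": ["--local"],
        "copy": ["--no-hardlinks"],
        "pack": ["--no-local"],
    }
    if transport not in flags:
        raise ValueError("unknown transport label: %r" % (transport,))

    argv = ["git", "clone"] + flags[transport]
    if mirror:
        # --mirror implies --bare, so no working tree is checked out.
        argv.append("--mirror")
    return argv + [src, dst]
```

For example, clone_argv("/srv/a.git", "/tmp/b", "copy", mirror=True)
corresponds to the --no-hardlinks, --mirror case (the paths are
placeholders).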

However, there are still problems. For blob corruptions, even in the
--no-hardlinks, non-mirror case where an error was found, the exit
code from the clone was 0. I can see this tripping up all sorts of
automated scripts or repository GUIs that ignore the output and only
check the exit code, which is not an unreasonable thing to do.

For commit corruptions, the --no-hardlinks, non --mirror case refused
to create the new repository and exited with an error code of 128. The
--no-hardlinks, --mirror case spewed errors to the console, yet
*still* created the new clone *and* returned an error code of zero.

It seems that when there is an "error" as opposed to a "fatal" it
doesn't affect the status code on a clone; I'd argue that it ought to.
If Git knows that the source repository has problems, it ought to be
reflected in the status code so that scripts performing clones have a
normal way to detect this and alert a user/sysadmin/whoever. Even if a
particular cloning method doesn't perform all sanity checks, if it
finds something in the sanity checks it *does* perform, this should be
trumpeted, loudly, regardless of transport mechanism and regardless of
whether a user is watching the process or a script is.
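
Until the exit status can be relied on, a script could work around this
by also scanning stderr. The following is only a sketch of that
workaround: the function names are hypothetical, and it assumes that
git prefixes its diagnostics with "error:" or "fatal:" and that forcing
LC_ALL=C keeps those prefixes untranslated. It is a heuristic, not a
substitute for running git fsck on the result.

```python
import os
import subprocess


def stderr_problems(stderr_text):
    """Return the stderr lines that look like git diagnostics."""
    return [
        line
        for line in stderr_text.splitlines()
        if line.startswith(("error:", "fatal:"))
    ]


def clone_is_trustworthy(returncode, stderr_text):
    """True only if the exit status is 0 AND no diagnostics were printed."""
    return returncode == 0 and not stderr_problems(stderr_text)


def checked_clone(argv):
    """Run a clone command line; return (ok, list_of_problem_lines)."""
    env = dict(os.environ, LC_ALL="C")  # assumption: keeps prefixes in English
    proc = subprocess.run(argv, capture_output=True, text=True, env=env)
    problems = stderr_problems(proc.stderr)
    return clone_is_trustworthy(proc.returncode, proc.stderr), problems
```

A caller would treat ok == False as a failed clone even when git itself
exited with 0, and log the returned lines for the sysadmin.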

Thanks,
Jeff
