Git Mailing List Archive on lore.kernel.org
From: Linus Torvalds <torvalds@linux-foundation.org>
To: Bill Lear <rael@zopyra.com>
Cc: Junio C Hamano <junkio@cox.net>, git@vger.kernel.org
Subject: Re: Error converting from 1.4.4.1 to 1.5.0?
Date: Wed, 14 Feb 2007 10:42:15 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0702141033400.3604@woody.linux-foundation.org> (raw)
In-Reply-To: <Pine.LNX.4.64.0702140958440.3604@woody.linux-foundation.org>



On Wed, 14 Feb 2007, Linus Torvalds wrote:
> 
> And if you can make the git history available to outsiders, I'd love to 
> see the corrupt tar-file (it doesn't have to be *public*, if you just can 
> trust me and perhaps a few other people with the data).

Side note: one reason why this is nice - even if you don't care about the 
corruption and can fix it other ways - is that the last time we had the 
one-bit corruption is also the reason why we now have the "-r" option to 
git-unpack-objects.

In other words, real-life corruption is not just a really nasty event, 
it's also a good way for *us* to verify that our recovery tools do as good 
a job as they possibly can. Maybe there are other things like that "-r" 
option where we could possibly do even better.

The git data structures are designed to be extremely robust, but there's 
nothing they can do about "corruption after the fact". The same way that a 
logging filesystem doesn't help if the disk itself starts getting read 
errors, the git data structures aren't going to guarantee that you can't 
lose data if you have actual disk or memory corruption going on. 

The things git can do are:

 - detection. The SHA1's should basically guarantee that you will never 
   ever have an _undetectable_ corruption anywhere (something that is 
   really really easy to end up with in just about any other SCM)

 - make replication easy (so that once you've detected corruption, you 
   have mirrors you can trust).

 - and finally: in the absence of replication, we can do our damnedest to 
   try to figure out what the data was. But in many ways, the fact that we 
   are really really good at compressing data (people do love their small 
   repositories) also means that we have basically no redundancy anywhere, 
   because redundancy is what compression gets rid of (both delta- and 
   zlib compression do it - it's very fundamentally what any compression 
   is based on)

but it's always interesting to have real-life corruption cases to verify.
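
To make the detection and no-redundancy points concrete, here is a minimal
Python sketch. It is an illustration only, not code from git: the helper
names (git_blob_id, flip_bit, survives_bit_flip) and the sample text are
made up for this example. The one thing taken from git is the blob naming
rule, SHA-1 over "blob <size>\0" followed by the content. The sketch flips
a single bit in the content to show the object name no longer matches, and
then flips each bit of a zlib stream in turn to count how many damaged
copies still inflate to the original bytes.

```python
import hashlib
import zlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 name of a git blob: sha1("blob <size>\\0" + content)."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted (hypothetical helper)."""
    damaged = bytearray(data)
    damaged[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(damaged)


def survives_bit_flip(compressed: bytes, bit_index: int, original: bytes) -> bool:
    """True only if the damaged zlib stream still inflates to the original bytes."""
    try:
        return zlib.decompress(flip_bit(compressed, bit_index)) == original
    except zlib.error:
        return False


# Sample content, chosen arbitrarily for the illustration.
sample = b"The git data structures are designed to be extremely robust.\n"
object_id = git_blob_id(sample)

# Detection: one flipped bit in the content yields a different object name,
# so re-hashing the stored data exposes the corruption.
damaged_id = git_blob_id(flip_bit(sample, 0))
print("name still matches after one-bit flip:", damaged_id == object_id)

# No redundancy: try every single-bit flip in the compressed stream and
# count how many damaged copies still give back the original content.
compressed = zlib.compress(sample)
total_flips = len(compressed) * 8
intact_flips = sum(
    survives_bit_flip(compressed, i, sample) for i in range(total_flips)
)
print(f"{intact_flips} of {total_flips} single-bit flips left the data intact")
```

The count on the last line should come out very small: zlib keeps a
checksum, so a damaged stream gets reported, but there is nothing spare in
it to rebuild the lost bits from. That is the same trade-off as above,
detection is cheap and recovery needs a second copy.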

			Linus
