All of lore.kernel.org
 help / color / mirror / Atom feed
* git fsck not identifying corrupted packs
@ 2009-10-19  7:56 Sergio Callegari
  2009-10-19  9:11 ` Johannes Sixt
  0 siblings, 1 reply; 21+ messages in thread
From: Sergio Callegari @ 2009-10-19  7:56 UTC (permalink / raw)
  To: git

Hi,

I have a pack that contains a corrupted object.
It is an old corrupted repo that I have conserved.

As expected, git gc cries out loud about it.
It indicates an inflate error (data stream error with incorrect data check),
and then the impossibility to read an object from a certain offset in the pack.

However, git fsck does not complain at all about the repo.
I guess that for speed reasons, git fsck does not try to inflate the objects.

Is there a means to have fsck to a truly full check on the sanity of a repo?

This both on git 1.6.5.1 and 1.6.4.2.

Thanks

Sergio Callegari

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19  7:56 git fsck not identifying corrupted packs Sergio Callegari
@ 2009-10-19  9:11 ` Johannes Sixt
  2009-10-19 10:04   ` Johannes Schindelin
                     ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Johannes Sixt @ 2009-10-19  9:11 UTC (permalink / raw)
  To: Sergio Callegari; +Cc: git

Sergio Callegari schrieb:
> Is there a means to have fsck to a truly full check on the sanity of a repo?

git fsck --full

RTFM, please.

-- Hannes

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19  9:11 ` Johannes Sixt
@ 2009-10-19 10:04   ` Johannes Schindelin
  2009-10-19 19:03     ` Junio C Hamano
  2009-10-19 10:56   ` Sergio Callegari
  2009-10-19 18:36   ` Gabor Gombas
  2 siblings, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2009-10-19 10:04 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Sergio Callegari, git

Hi,

On Mon, 19 Oct 2009, Johannes Sixt wrote:

> Sergio Callegari schrieb:
> > Is there a means to have fsck to a truly full check on the sanity of a 
> > repo?
> 
> git fsck --full
> 
> RTFM, please.

Now, now.

If you were to test a new filesystem, say, wonderfulfs, and wanted to 
check its integrity, would you not just run "fsck-wonderfulfs" if that 
exists, rather than reading the fantamagastic manual?  Would you not 
expect that it Does The Right Thing?  Would you not expect that it 
follows the Law Of Minimal Surprise?

So FWIW I can see where Sergio is coming from.

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19  9:11 ` Johannes Sixt
  2009-10-19 10:04   ` Johannes Schindelin
@ 2009-10-19 10:56   ` Sergio Callegari
  2009-10-19 19:07     ` Wesley J. Landaker
  2009-10-19 18:36   ` Gabor Gombas
  2 siblings, 1 reply; 21+ messages in thread
From: Sergio Callegari @ 2009-10-19 10:56 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: git

Johannes Sixt wrote:
> Sergio Callegari schrieb:
>   
>> Is there a means to have fsck to a truly full check on the sanity of a repo?
>>     
>
> git fsck --full
>
> RTFM, please.
>
>   
Right... sorry for the noise, I mismatched --strict for --full in a script.

BTW, the short help for fsck at --full only says "consider objects in 
alternate repositories".

My apologize.

Sergio

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19  9:11 ` Johannes Sixt
  2009-10-19 10:04   ` Johannes Schindelin
  2009-10-19 10:56   ` Sergio Callegari
@ 2009-10-19 18:36   ` Gabor Gombas
  2 siblings, 0 replies; 21+ messages in thread
From: Gabor Gombas @ 2009-10-19 18:36 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Sergio Callegari, git

On Mon, Oct 19, 2009 at 11:11:33AM +0200, Johannes Sixt wrote:

> > Is there a means to have fsck to a truly full check on the sanity of a repo?
> 
> git fsck --full
> 
> RTFM, please.

That still does not catch everything. About a week ago I wanted to check
out a branch in a local repo and I got an error that it was corrupt. But
"git fsck --full" only complained about some dangling objects, it did
not notice the corruption. I used git version 1.6.4.3 (Debian unstable
at that time). It would be nice to have a
"git fsck --i-really-want-to-check-everything".

Gabor

-- 
     ---------------------------------------------------------
     MTA SZTAKI Computer and Automation Research Institute
                Hungarian Academy of Sciences
     ---------------------------------------------------------

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19 10:04   ` Johannes Schindelin
@ 2009-10-19 19:03     ` Junio C Hamano
  2009-10-19 19:27       ` Wesley J. Landaker
  2009-10-20  6:26       ` Matthieu Moy
  0 siblings, 2 replies; 21+ messages in thread
From: Junio C Hamano @ 2009-10-19 19:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Johannes Sixt, Sergio Callegari, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> On Mon, 19 Oct 2009, Johannes Sixt wrote:
>
>> Sergio Callegari schrieb:
>> > Is there a means to have fsck to a truly full check on the sanity of a 
>> > repo?
>> 
>> git fsck --full
>> 
>> RTFM, please.
>
> Now, now.
>
> If you were to test a new filesystem, say, wonderfulfs, and wanted to 
> check its integrity, would you not just run "fsck-wonderfulfs" if that 
> exists, rather than reading the fantamagastic manual?  Would you not 
> expect that it Does The Right Thing?  Would you not expect that it 
> follows the Law Of Minimal Surprise?
>
> So FWIW I can see where Sergio is coming from.

Linus and other git developers from the early days trained their fingers
to type the command, every once in a while even without thinking, to check
the consistency of the repository back when the lower core part of the git
was still being developed.  Developers who wanted to make sure that git
correctly dealt with packfiles could deliberately trigger their creation
and checked them after they were created carefully, but loose objects are
the ones that are written by various commands from random codepaths.  It
made some technical sense to have a mode that checked only loose objects
from the debugging point of view for that reason.

    Side note.  I think the help description of --full option is wrong (or
    at least stale).  We always look at alternate object store these days
    since e15ef66 (fsck: check loose objects from alternate object stores
    by default, 2009-01-30).  It probably should read "check packed
    objects fully" or something.

The above paragraph is merely a historical background, and in this case
the "history" refers to early-to-mid 2005.  Even for git developers there
no longer is any reason to type "git fsck" in fear of some newly created
objects might be corrupt due to recent change to git these days.

The reason we did not make "--full" the default is probably we trust our
filesystems a bit too much.  At least, we trusted filesystems more than we
trusted the lower core part of git that was under development ;-)

Once a packfile is created and we always use it read-only, there didn't
seem to be much point in suspecting that the underlying filesystems or
disks may corrupt them in such a way that is not caught by the SHA-1
checksum over the entire packfile and per object checksum.  That trust in
the filesystems might have been a good tradeoff between fsck performance
and reliability on platforms git was initially developed on and for, but
it might not be true anymore as we run on more platforms these days.

It probably makes sense to ship 1.7.0 with a version of "fsck" in which
"--full" is the default; it would still accept "--full" but it would be a
no-op.  This would be a backward incompatible change, but the difference
is primarily about performance ("it takes a lot longer than before!"), and
not correctness, so we probably can live with it.  As I already said,
there is not much reason to run "fsck" every five minutes anymore to begin
with (unless your filesystem is so unreliable that it might eat one file
every five minutes, that is).

It probably is also a good idea to add a "--loose" option that does what
"fsck" currently does without "--full".  It is a good name because (1) to
people who do not know the internal of git, it means "check only loosely",
which would discourage them from running "fack" with that option to begin
with, and (2) to others, it exactly tells what the option makes the
command check.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19 10:56   ` Sergio Callegari
@ 2009-10-19 19:07     ` Wesley J. Landaker
  2009-10-20  6:24       ` Matthieu Moy
  0 siblings, 1 reply; 21+ messages in thread
From: Wesley J. Landaker @ 2009-10-19 19:07 UTC (permalink / raw)
  To: git; +Cc: Sergio Callegari, Johannes Sixt, Junio C Hamano

On Monday 19 October 2009 04:56:07 Sergio Callegari wrote:
> Johannes Sixt wrote:
> > Sergio Callegari schrieb:
> >> Is there a means to have fsck to a truly full check on the sanity of a
> >> repo?
> >
> > git fsck --full
> >
> > RTFM, please.
>
> Right... sorry for the noise, I mismatched --strict for --full in a
> script.
>
> BTW, the short help for fsck at --full only says "consider objects in
> alternate repositories".

Until I read this thread, I didn't realize you needed --full to check
objects in packs.

Since just every git repository I ever use has 99%+ of it's objects in
packs, this means every time I've run "git fsck" it's essentially been
a no-op and I didn't know it. I imagine this is a common confusion.

Also, having --full mean both "check alternate object pools", and "check
objects in packs" seems to be rolling up two orthogonal issues.

But anyway, here is a patch that at least fixes the short option help to
match the manual and the current behavior:

--- 8< ---
From 8fc3cd68d496bf00faad4f0a7b6ae4fee9437e68 Mon Sep 17 00:00:00 2001
From: Wesley J. Landaker <wjl@icecavern.net>
Date: Mon, 19 Oct 2009 12:48:07 -0600
Subject: [PATCH] Update git fsck --full short description to mention packs

The '--full' option to git fsck does two things:

  1) Check objects in packs
  2) Check alternate objects

This is documented in the git fsck manual; this patch reflects that in
the short git fsck option help message as well.

Signed-off-by: Wesley J. Landaker <wjl@icecavern.net>
---
 builtin-fsck.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/builtin-fsck.c b/builtin-fsck.c
index c58b0e3..63212ea 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -576,7 +576,7 @@ static struct option fsck_opts[] = {
 	OPT_BOOLEAN(0, "root", &show_root, "report root nodes"),
 	OPT_BOOLEAN(0, "cache", &keep_cache_objects, "make index objects head nodes"),
 	OPT_BOOLEAN(0, "reflogs", &include_reflogs, "make reflogs head nodes (default)"),
-	OPT_BOOLEAN(0, "full", &check_full, "also consider alternate objects"),
+	OPT_BOOLEAN(0, "full", &check_full, "also consider packs and alternate objects"),
 	OPT_BOOLEAN(0, "strict", &check_strict, "enable more strict checking"),
 	OPT_BOOLEAN(0, "lost-found", &write_lost_and_found,
 				"write dangling objects in .git/lost-found"),
-- 
1.6.5

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19 19:03     ` Junio C Hamano
@ 2009-10-19 19:27       ` Wesley J. Landaker
  2009-10-20 15:41         ` Robin Rosenberg
  2009-10-20  6:26       ` Matthieu Moy
  1 sibling, 1 reply; 21+ messages in thread
From: Wesley J. Landaker @ 2009-10-19 19:27 UTC (permalink / raw)
  To: git

(Not CCing everyone, since this is mostly curiosa in the "using git as it 
was never intended" section):

On Monday 19 October 2009 13:03:42 Junio C Hamano wrote:
> Once a packfile is created and we always use it read-only, there didn't
> seem to be much point in suspecting that the underlying filesystems or
> disks may corrupt them in such a way that is not caught by the SHA-1
> checksum over the entire packfile and per object checksum.  That trust in
> the filesystems might have been a good tradeoff between fsck performance
> and reliability on platforms git was initially developed on and for, but
> it might not be true anymore as we run on more platforms these days.

Filesystems are mostly reliable, but only until your crazy users do strange 
and terrible things. I have a real, non-toy environment where I use this 
stack as a [horrible] workaround for some issues beyond my control:

git -> ext4 -> lvm -> dmcrypt -> loop -> sshfs -> cygwin sshd -> SMB share

Amazingly, this works pretty reliably with many gigabytes of data in a git 
repository, even with the occasional crash because of flakiness with the 
"sshfs -> cygwin sshd" piece of the puzzle. But a good "git fsck" sure 
doesn't hurt in this environment! =)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19 19:07     ` Wesley J. Landaker
@ 2009-10-20  6:24       ` Matthieu Moy
  0 siblings, 0 replies; 21+ messages in thread
From: Matthieu Moy @ 2009-10-20  6:24 UTC (permalink / raw)
  To: Wesley J. Landaker; +Cc: git, Sergio Callegari, Johannes Sixt, Junio C Hamano

"Wesley J. Landaker" <wjl@icecavern.net> writes:

> Until I read this thread, I didn't realize you needed --full to check
> objects in packs.

Same here.

> -	OPT_BOOLEAN(0, "full", &check_full, "also consider alternate objects"),
> +	OPT_BOOLEAN(0, "full", &check_full, "also consider packs and alternate objects"),

IMHO, something like this should be applied to "maint", this is kind
of a serious bug indeed. Just check the "alternate" thing, according
to Junio:

Junio C Hamano <gitster@pobox.com> writes:

>     Side note.  I think the help description of --full option is wrong (or
>     at least stale).  We always look at alternate object store these days
>     since e15ef66 (fsck: check loose objects from alternate object stores
>     by default, 2009-01-30).  It probably should read "check packed
>     objects fully" or something.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19 19:03     ` Junio C Hamano
  2009-10-19 19:27       ` Wesley J. Landaker
@ 2009-10-20  6:26       ` Matthieu Moy
  2009-10-20  6:45         ` Junio C Hamano
  1 sibling, 1 reply; 21+ messages in thread
From: Matthieu Moy @ 2009-10-20  6:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Johannes Schindelin, Johannes Sixt, Sergio Callegari, git

Junio C Hamano <gitster@pobox.com> writes:

> Linus and other git developers from the early days [...]

Thanks for the historical background.

> It probably makes sense to ship 1.7.0 with a version of "fsck" in which
> "--full" is the default; it would still accept "--full" but it would be a
> no-op.

+1

> It probably is also a good idea to add a "--loose" option that does what
> "fsck" currently does without "--full".  It is a good name

+1 too.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-20  6:26       ` Matthieu Moy
@ 2009-10-20  6:45         ` Junio C Hamano
  2009-10-20  9:25           ` Alex Riesen
  2009-10-20 18:39           ` git fsck not identifying corrupted packs Nicolas Pitre
  0 siblings, 2 replies; 21+ messages in thread
From: Junio C Hamano @ 2009-10-20  6:45 UTC (permalink / raw)
  To: Matthieu Moy; +Cc: Johannes Schindelin, Johannes Sixt, Sergio Callegari, git

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

> Junio C Hamano <gitster@pobox.com> writes:
>
>> Linus and other git developers from the early days [...]
>
> Thanks for the historical background.
>
>> It probably makes sense to ship 1.7.0 with a version of "fsck" in which
>> "--full" is the default; it would still accept "--full" but it would be a
>> no-op.
>
> +1
>
>> It probably is also a good idea to add a "--loose" option that does what
>> "fsck" currently does without "--full".  It is a good name
>
> +1 too.

Actually, I changed my mind.  I do not think this so big that we need to
wait for a major version bump.  Why not shoot for 1.6.6?

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-20  6:45         ` Junio C Hamano
@ 2009-10-20  9:25           ` Alex Riesen
  2009-10-20 10:22             ` Johannes Schindelin
  2009-10-20 18:39           ` git fsck not identifying corrupted packs Nicolas Pitre
  1 sibling, 1 reply; 21+ messages in thread
From: Alex Riesen @ 2009-10-20  9:25 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matthieu Moy, Johannes Schindelin, Johannes Sixt, Sergio Callegari, git

On Tue, Oct 20, 2009 at 08:45, Junio C Hamano <gitster@pobox.com> wrote:
> Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:
>>> It probably is also a good idea to add a "--loose" option that does what
>>> "fsck" currently does without "--full".  It is a good name
>>
>> +1 too.
>
> Actually, I changed my mind.  I do not think this so big that we need to
> wait for a major version bump.  Why not shoot for 1.6.6?

--no-full works

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-20  9:25           ` Alex Riesen
@ 2009-10-20 10:22             ` Johannes Schindelin
  2009-10-20 11:56               ` Matthieu Moy
  0 siblings, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2009-10-20 10:22 UTC (permalink / raw)
  To: Alex Riesen
  Cc: Junio C Hamano, Matthieu Moy, Johannes Sixt, Sergio Callegari, git

[-- Attachment #1: Type: TEXT/PLAIN, Size: 698 bytes --]

Hi,

On Tue, 20 Oct 2009, Alex Riesen wrote:

> On Tue, Oct 20, 2009 at 08:45, Junio C Hamano <gitster@pobox.com> wrote:
> > Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:
> >>> It probably is also a good idea to add a "--loose" option that does what
> >>> "fsck" currently does without "--full".  It is a good name
> >>
> >> +1 too.
> >
> > Actually, I changed my mind.  I do not think this so big that we need to
> > wait for a major version bump.  Why not shoot for 1.6.6?
> 
> --no-full works

It works.  Technically.  For human users, though, --loose-objects-only 
(with a shortcut "--loose") would be better.

Disclaimer: this email was written by a bot and is valid without signature

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-20 10:22             ` Johannes Schindelin
@ 2009-10-20 11:56               ` Matthieu Moy
  2009-10-20 18:46                 ` [RFC/PATCH] fsck: default to "git fsck --full" Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Matthieu Moy @ 2009-10-20 11:56 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Alex Riesen, Junio C Hamano, Johannes Sixt, Sergio Callegari, git

Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> Hi,
>
> On Tue, 20 Oct 2009, Alex Riesen wrote:
>
>> On Tue, Oct 20, 2009 at 08:45, Junio C Hamano <gitster@pobox.com> wrote:
>> > Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:
>> >>> It probably is also a good idea to add a "--loose" option that does what
>> >>> "fsck" currently does without "--full".  It is a good name
>> 
>> --no-full works
>
> It works.  Technically.  For human users, though, --loose-objects-only 
> (with a shortcut "--loose") would be better.

OTOH, the advantage of "--no-full" is that it's compatible with
existing Git versions. If I learn Git 1.6.6 with --no-full, and use it
in a script, then my stript works also with older Gits.

But anyway, I think very few people are actually interested in "git
--no-full" (or call it whatever you like), so I don't think this is
very important.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-19 19:27       ` Wesley J. Landaker
@ 2009-10-20 15:41         ` Robin Rosenberg
  2009-10-20 16:20           ` Wesley J. Landaker
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Rosenberg @ 2009-10-20 15:41 UTC (permalink / raw)
  To: Wesley J. Landaker; +Cc: git

måndag 19 oktober 2009 21:27:48 skrev  Wesley J. Landaker:
> (Not CCing everyone, since this is mostly curiosa in the "using git as it
> was never intended" section):
>
> On Monday 19 October 2009 13:03:42 Junio C Hamano wrote:
> > Once a packfile is created and we always use it read-only, there didn't
> > seem to be much point in suspecting that the underlying filesystems or
> > disks may corrupt them in such a way that is not caught by the SHA-1
> > checksum over the entire packfile and per object checksum.  That trust in
> > the filesystems might have been a good tradeoff between fsck performance
> > and reliability on platforms git was initially developed on and for, but
> > it might not be true anymore as we run on more platforms these days.
>
> Filesystems are mostly reliable, but only until your crazy users do strange
> and terrible things. I have a real, non-toy environment where I use this
> stack as a [horrible] workaround for some issues beyond my control:
>
> git -> ext4 -> lvm -> dmcrypt -> loop -> sshfs -> cygwin sshd -> SMB share

The obvious follow up question here is: Why?

-- robin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-20 15:41         ` Robin Rosenberg
@ 2009-10-20 16:20           ` Wesley J. Landaker
  0 siblings, 0 replies; 21+ messages in thread
From: Wesley J. Landaker @ 2009-10-20 16:20 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: git

On Tuesday 20 October 2009 09:41:50 Robin Rosenberg wrote:
> måndag 19 oktober 2009 21:27:48 skrev  Wesley J. Landaker:
> > (Not CCing everyone, since this is mostly curiosa in the "using git as
> > it was never intended" section):
[...]
> > Filesystems are mostly reliable, but only until your crazy users do
> > strange and terrible things. I have a real, non-toy environment where I
> > use this stack as a [horrible] workaround for some issues beyond my
> > control:
> >
> > git -> ext4 -> lvm -> dmcrypt -> loop -> sshfs -> cygwin sshd -> SMB
> > share

My main point was to illustrate that having "git fsck" do a REALLY GOOD 
CHECK is still desirable, as we still haven't reached the days of file-
system utopia where nothing ever gets corrupted (even with a smaller, 
simpler stack).

The actual application where I use this stack is because of odd requirements 
and circumstances like data must be physically stored on a particular 
Windows server on the network that uses a weird authentication method that 
samba doesn't support, and it has to go over the network encrypted anyway, 
there are lots of holes in the data, so I want ext4 for the extent support, 
file-size limitations on the target, etc.

It's a really an exotic love-hate mix between an off-by-one-please-no-never-
again kind of situation coupled with a bit of "because I can".

> The obvious follow up question here is: Why?

If you are both nerdy and morbidly curious enough to care, send me a "but, 
no ... really, WHY?!" with the git list CC dropped and we can talk about 
details and/or other crazy stuff. (I don't want to get wildly off-topic on 
this list.)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-20  6:45         ` Junio C Hamano
  2009-10-20  9:25           ` Alex Riesen
@ 2009-10-20 18:39           ` Nicolas Pitre
  2009-10-20 20:49             ` Alex Riesen
  1 sibling, 1 reply; 21+ messages in thread
From: Nicolas Pitre @ 2009-10-20 18:39 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matthieu Moy, Johannes Schindelin, Johannes Sixt, Sergio Callegari, git

On Mon, 19 Oct 2009, Junio C Hamano wrote:

> Actually, I changed my mind.  I do not think this so big that we need to
> wait for a major version bump.  Why not shoot for 1.6.6?

Agreed.  With a prominent note in the release notes to point people at 
it when they don't read release notes and complain that fsck suddenly 
became very slow after they upgraded.


Nicolas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC/PATCH] fsck: default to "git fsck --full"
  2009-10-20 11:56               ` Matthieu Moy
@ 2009-10-20 18:46                 ` Junio C Hamano
  2009-10-20 19:00                   ` Nicolas Pitre
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2009-10-20 18:46 UTC (permalink / raw)
  To: Matthieu Moy
  Cc: Johannes Schindelin, Alex Riesen, Johannes Sixt, Sergio Callegari, git

Linus and other git developers from the early days trained their fingers
to type the command, every once in a while even without thinking, to check
the consistency of the repository back when the lower core part of the git
was still being developed.  Developers who wanted to make sure that git
correctly dealt with packfiles could deliberately trigger their creation
and checked them after they were created carefully, but loose objects are
the ones that are written by various commands from random codepaths.  It
made some technical sense to have a mode that checked only loose objects
from the debugging point of view for that reason.

Even for git developers, there no longer is any reason to type "git fsck"
every five minutes these days, worried that some newly created objects
might be corrupt due to recent change to git.

The reason we did not make "--full" the default is probably we trust our
filesystems a bit too much.  At least, we trusted filesystems more than we
trusted the lower core part of git that was under development.

Once a packfile is created and we always use it read-only, there didn't
seem to be much point in suspecting that the underlying filesystems or
disks may corrupt them in such a way that is not caught by the SHA-1
checksum over the entire packfile and per object checksum.  That trust in
the filesystems might have been a good tradeoff between fsck performance
and reliability on platforms git was initially developed on and for, but
it may not be true anymore as we run on many more platforms these days.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

Matthieu Moy <Matthieu.Moy@grenoble-inp.fr> writes:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> ...
>> On Tue, 20 Oct 2009, Alex Riesen wrote:
>> ...
>>> --no-full works
>>
>> It works.  Technically.  For human users, though, --loose-objects-only 
>> (with a shortcut "--loose") would be better.
>
> OTOH, the advantage of "--no-full" is that it's compatible with
> existing Git versions. If I learn Git 1.6.6 with --no-full, and use it
> in a script, then my stript works also with older Gits.
>
> But anyway, I think very few people are actually interested in "git
> --no-full" (or call it whatever you like), so I don't think this is
> very important.

For human users, I think --full vs --no-full is quite a nice suggestion,
given that we already have advertised --full and people know the option.

Also people know that splicing "no-" after the double dash is often the
way to negate a boolean-looking option.

The actual patch to do this is tiny, but that is just a bonus ;-)

 Documentation/RelNotes-1.6.6.txt |   10 ++++++++++
 Documentation/git-fsck.txt       |    5 +++--
 builtin-fsck.c                   |    2 +-
 3 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/Documentation/RelNotes-1.6.6.txt b/Documentation/RelNotes-1.6.6.txt
index 5f1fecb..1896e05 100644
--- a/Documentation/RelNotes-1.6.6.txt
+++ b/Documentation/RelNotes-1.6.6.txt
@@ -1,6 +1,13 @@
 GIT v1.6.6 Release Notes
 ========================
 
+In this release, "git fsck" defaults to "git fsck --full" and checks
+packfiles.  If you prefer a quicker check only on loose objects (the
+old default), you can say "git fsck --no-full".  This has been
+supported by 1.5.4 and newer versions of git, so it is safe to write
+it in your script if you use slightly older git on some of your
+machines.
+
 In git 1.7.0, which is planned to be the release after 1.6.6, "git
 push" into a branch that is currently checked out will be refused by
 default.
@@ -38,6 +45,9 @@ Updates since v1.6.5
 
 (usability, bells and whistles)
 
+ * "git fsck" by default checks the packfiles (i.e. "--full" is the
+   default); you can turn it off with "git fsck --no-full".
+
  * "git log --decorate" shows the location of HEAD as well.
 
 (developers)
diff --git a/Documentation/git-fsck.txt b/Documentation/git-fsck.txt
index 287c4fc..6fe9484 100644
--- a/Documentation/git-fsck.txt
+++ b/Documentation/git-fsck.txt
@@ -10,7 +10,7 @@ SYNOPSIS
 --------
 [verse]
 'git fsck' [--tags] [--root] [--unreachable] [--cache] [--no-reflogs]
-	 [--full] [--strict] [--verbose] [--lost-found] [<object>*]
+	 [--[no-]full] [--strict] [--verbose] [--lost-found] [<object>*]
 
 DESCRIPTION
 -----------
@@ -52,7 +52,8 @@ index file, all SHA1 references in .git/refs/*, and all reflogs (unless
 	or $GIT_DIR/objects/info/alternates,
 	and in packed git archives found in $GIT_DIR/objects/pack
 	and corresponding pack subdirectories in alternate
-	object pools.
+	object pools.  This is now default; you can turn it off
+	with --no-full.
 
 --strict::
 	Enable more strict checking, namely to catch a file mode
diff --git a/builtin-fsck.c b/builtin-fsck.c
index c58b0e3..2d88e45 100644
--- a/builtin-fsck.c
+++ b/builtin-fsck.c
@@ -19,7 +19,7 @@ static int show_root;
 static int show_tags;
 static int show_unreachable;
 static int include_reflogs = 1;
-static int check_full;
+static int check_full = 1;
 static int check_strict;
 static int keep_cache_objects;
 static unsigned char head_sha1[20];

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC/PATCH] fsck: default to "git fsck --full"
  2009-10-20 18:46                 ` [RFC/PATCH] fsck: default to "git fsck --full" Junio C Hamano
@ 2009-10-20 19:00                   ` Nicolas Pitre
  2009-10-20 19:11                     ` Junio C Hamano
  0 siblings, 1 reply; 21+ messages in thread
From: Nicolas Pitre @ 2009-10-20 19:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Matthieu Moy, Johannes Schindelin, Alex Riesen, Johannes Sixt,
	Sergio Callegari, git

On Tue, 20 Oct 2009, Junio C Hamano wrote:

> diff --git a/Documentation/RelNotes-1.6.6.txt b/Documentation/RelNotes-1.6.6.txt
> index 5f1fecb..1896e05 100644
> --- a/Documentation/RelNotes-1.6.6.txt
> +++ b/Documentation/RelNotes-1.6.6.txt
> @@ -1,6 +1,13 @@
>  GIT v1.6.6 Release Notes
>  ========================
>  
> +In this release, "git fsck" defaults to "git fsck --full" and checks
> +packfiles.  If you prefer a quicker check only on loose objects (the
             ^^

Might be worth mentioning explicitly that, because of that change, plain 
fsck is now going to take much longer to complete.


Nicolas

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC/PATCH] fsck: default to "git fsck --full"
  2009-10-20 19:00                   ` Nicolas Pitre
@ 2009-10-20 19:11                     ` Junio C Hamano
  0 siblings, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2009-10-20 19:11 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Matthieu Moy, Johannes Schindelin, Alex Riesen,
	Johannes Sixt, Sergio Callegari, git

Nicolas Pitre <nico@fluxnic.net> writes:

> On Tue, 20 Oct 2009, Junio C Hamano wrote:
>
>> diff --git a/Documentation/RelNotes-1.6.6.txt b/Documentation/RelNotes-1.6.6.txt
>> index 5f1fecb..1896e05 100644
>> --- a/Documentation/RelNotes-1.6.6.txt
>> +++ b/Documentation/RelNotes-1.6.6.txt
>> @@ -1,6 +1,13 @@
>>  GIT v1.6.6 Release Notes
>>  ========================
>>  
>> +In this release, "git fsck" defaults to "git fsck --full" and checks
>> +packfiles.  If you prefer a quicker check only on loose objects (the
>              ^^
>
> Might be worth mentioning explicitly that, because of that change, plain 
> fsck is now going to take much longer to complete.

Sounds fair; thanks.

diff --git a/Documentation/RelNotes-1.6.6.txt b/Documentation/RelNotes-1.6.6.txt
index 0adf998..fa0e11a 100644
--- a/Documentation/RelNotes-1.6.6.txt
+++ b/Documentation/RelNotes-1.6.6.txt
@@ -2,7 +2,8 @@ GIT v1.6.6 Release Notes
 ========================
 
 In this release, "git fsck" defaults to "git fsck --full" and checks
-packfiles.  If you prefer a quicker check only on loose objects (the
+packfiles, and because of this it will take much longer to complete
+than before.  If you prefer a quicker check only on loose objects (the
 old default), you can say "git fsck --no-full".  This has been
 supported by 1.5.4 and newer versions of git, so it is safe to write
 it in your script even if you use slightly older git on some of your

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: git fsck not identifying corrupted packs
  2009-10-20 18:39           ` git fsck not identifying corrupted packs Nicolas Pitre
@ 2009-10-20 20:49             ` Alex Riesen
  0 siblings, 0 replies; 21+ messages in thread
From: Alex Riesen @ 2009-10-20 20:49 UTC (permalink / raw)
  To: Nicolas Pitre
  Cc: Junio C Hamano, Matthieu Moy, Johannes Schindelin, Johannes Sixt,
	Sergio Callegari, git

On Tue, Oct 20, 2009 at 20:39, Nicolas Pitre <nico@fluxnic.net> wrote:
> On Mon, 19 Oct 2009, Junio C Hamano wrote:
>
>> Actually, I changed my mind.  I do not think this so big that we need to
>> wait for a major version bump.  Why not shoot for 1.6.6?
>
> Agreed.  With a prominent note in the release notes to point people at
> it when they don't read release notes and complain that fsck suddenly
> became very slow after they upgraded.

I have a feeling that it either wont be noticed at all (i.e., I have run fsck
with something other than --full only one time in my whole bash history,
and never shown it otherwise to anyone else), or people will immediately
like it ("Oh, finallly! Now that feels like it is doing something!" :)

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2009-10-20 20:49 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-19  7:56 git fsck not identifying corrupted packs Sergio Callegari
2009-10-19  9:11 ` Johannes Sixt
2009-10-19 10:04   ` Johannes Schindelin
2009-10-19 19:03     ` Junio C Hamano
2009-10-19 19:27       ` Wesley J. Landaker
2009-10-20 15:41         ` Robin Rosenberg
2009-10-20 16:20           ` Wesley J. Landaker
2009-10-20  6:26       ` Matthieu Moy
2009-10-20  6:45         ` Junio C Hamano
2009-10-20  9:25           ` Alex Riesen
2009-10-20 10:22             ` Johannes Schindelin
2009-10-20 11:56               ` Matthieu Moy
2009-10-20 18:46                 ` [RFC/PATCH] fsck: default to "git fsck --full" Junio C Hamano
2009-10-20 19:00                   ` Nicolas Pitre
2009-10-20 19:11                     ` Junio C Hamano
2009-10-20 18:39           ` git fsck not identifying corrupted packs Nicolas Pitre
2009-10-20 20:49             ` Alex Riesen
2009-10-19 10:56   ` Sergio Callegari
2009-10-19 19:07     ` Wesley J. Landaker
2009-10-20  6:24       ` Matthieu Moy
2009-10-19 18:36   ` Gabor Gombas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.