* Massive repository corruptions
@ 2010-07-13  1:56 Enrico Weigelt
  2010-07-13  3:23 ` Avery Pennarun
  0 siblings, 1 reply; 13+ messages in thread
From: Enrico Weigelt @ 2010-07-13  1:56 UTC (permalink / raw)
  To: git


Hi folks,


I've just reorganized several repositories (e.g. split a large
repo into several small ones), and afterwards I found massive corruption
(broken pack files) in the new repos (which had previously been clean).

Maybe it has something to do with a cron job which frequently GCs
all the repos, and it could get even worse if the filesystem sometimes
fills up during this process.

Could multiple GCs running on the same repo cause this?


cu
-- 
----------------------------------------------------------------------
 Enrico Weigelt, metux IT service -- http://www.metux.de/

 phone:  +49 36207 519931  email: weigelt@metux.de
 mobile: +49 151 27565287  icq:   210169427         skype: nekrad666
----------------------------------------------------------------------
 Embedded-Linux / Portierung / Opensource-QM / Verteilte Systeme
----------------------------------------------------------------------


* Re: Massive repository corruptions
  2010-07-13  1:56 Massive repository corruptions Enrico Weigelt
@ 2010-07-13  3:23 ` Avery Pennarun
  2010-07-13  5:03   ` Enrico Weigelt
  0 siblings, 1 reply; 13+ messages in thread
From: Avery Pennarun @ 2010-07-13  3:23 UTC (permalink / raw)
  To: weigelt; +Cc: git

On Mon, Jul 12, 2010 at 9:56 PM, Enrico Weigelt <weigelt@metux.de> wrote:
> I've just reorganized several repositories (e.g. split a large
> repo into several small ones), and afterwards I found massive corruption
> (broken pack files) in the new repos (which had previously been clean).
>
> Maybe it has something to do with a cron job which frequently GCs
> all the repos, and it could get even worse if the filesystem sometimes
> fills up during this process.
>
> Could multiple GCs running on the same repo cause this?

Multiple simultaneous gc's shouldn't be a problem - git locks things
as it needs them.  Plus, git only removes objects after it has safely
created a new packfile that contains them.  Maybe a filesystem filling
up could cause a problem, but git should be detecting that if it
happens (maybe there's a bug that causes it to not notice, though).

You could experience corruption if your computer crashed before
everything was synced to disk.
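
The ordering described here (finish writing the new file completely, only
then drop what it replaces), together with the point about syncing to disk,
can be illustrated with a generic write-then-rename sketch in Python. This is
not git's actual code; replace_atomically and the ".tmp-" prefix are
hypothetical names chosen for the example.

```python
import os
import tempfile


def replace_atomically(path: str, data: bytes) -> None:
    """Write data to a temp file, sync it, then rename it over path.

    If writing fails (for example because the filesystem is full), the
    exception propagates and the previous contents of path stay untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())   # ask the OS to push the data to disk
        os.replace(tmp_path, path)   # the old file is replaced only now
    except BaseException:
        # Clean up the partial temp file; the old file was never modified.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

With this pattern a failed or interrupted write leaves at worst a stray
temporary file, never a half-written replacement under the final name.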

Do you know which packfiles are corrupted?  Does 'git index-pack' on
the files reveal anything?

Be sure to make a backup copy of your corrupted repositories before
doing any experiments, or you might accidentally fix the problem and
make it harder to trace.

Good luck.

Avery


* Re: Massive repository corruptions
  2010-07-13  3:23 ` Avery Pennarun
@ 2010-07-13  5:03   ` Enrico Weigelt
  2010-07-13  5:31     ` Enrico Weigelt
  2010-07-13  9:40     ` Avery Pennarun
  0 siblings, 2 replies; 13+ messages in thread
From: Enrico Weigelt @ 2010-07-13  5:03 UTC (permalink / raw)
  To: git

* Avery Pennarun <apenwarr@gmail.com> wrote:

> Multiple simultaneous gc's shouldn't be a problem - git locks things
> as it needs them.  Plus, git only removes objects after it has safely
> created a new packfile that contains them.  Maybe a filesystem filling
> up could cause a problem, but git should be detecting that if it
> happens (maybe there's a bug that causes it to not notice, though).

Okay.

> You could experience corruption if your computer crashed before
> everything was synced to disk.

No machine crash, and no sign of filesystem or disk problems
(according to kernel log).

> Do you know which packfiles are corrupted?  Does 'git index-pack' on
> the files reveal anything?

git@blackwidow ~/metux/work.git/pack $ git index-pack pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
error: inflate: data stream error (incorrect data check)
fatal: pack has bad object at offset 37075832: inflate returned -3

(that's essentially the same thing git-gc says)
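
The -3 in this output is zlib's Z_DATA_ERROR, and "incorrect data check" is
the message zlib reports when the checksum stored at the end of a compressed
stream does not match the inflated data. The following Python sketch
reproduces the same message by deliberately damaging a stream. It only
illustrates what the error means; it says nothing about where the damage on
this machine comes from. The helper names are made up for the example.

```python
import zlib


def corrupt_trailer(stream: bytes) -> bytes:
    """Flip one bit in the last byte, which is part of the Adler-32 checksum."""
    return stream[:-1] + bytes([stream[-1] ^ 0x01])


def try_inflate(stream: bytes) -> str:
    """Return "ok" if the stream inflates cleanly, else zlib's error text."""
    try:
        zlib.decompress(stream)
        return "ok"
    except zlib.error as exc:
        return str(exc)


payload = b"some object data " * 1000
good = zlib.compress(payload)

print(try_inflate(good))                   # clean stream
print(try_inflate(corrupt_trailer(good)))  # reports error -3, incorrect data check
```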


git@blackwidow ~/metux/work.git/pack $ git unpack-objects -r < pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack 
error: inflate: data stream error (incorrect data check)
error: inflate returned -3

error: inflate: data stream error (incorrect data check)
error: inflate returned -3

Unpacking objects: 100% (1223/1223), done.
fatal: final sha1 did not match
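
"final sha1 did not match" refers to the checksum at the end of the pack: a
pack file ends with a 20-byte SHA-1 of everything that precedes it. A rough
standalone check of that trailer, assuming a SHA-1 repository, could look like
this in Python. pack_trailer_matches is a name invented for this sketch, not a
git API, and it checks only the trailer, not the individual objects.

```python
import hashlib
import os


def pack_trailer_matches(path: str, chunk_size: int = 1 << 20) -> bool:
    """Return True if the last 20 bytes equal the SHA-1 of all bytes before them."""
    size = os.path.getsize(path)
    if size < 20:
        return False
    digest = hashlib.sha1()
    remaining = size - 20
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                return False  # file got shorter while we were reading it
            digest.update(chunk)
            remaining -= len(chunk)
        trailer = f.read(20)
    return digest.digest() == trailer
```

A mismatch only says that the bytes read now differ from the bytes that were
hashed when the pack was written; it does not tell which layer changed them.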



cu

* Re: Massive repository corruptions
  2010-07-13  5:03   ` Enrico Weigelt
@ 2010-07-13  5:31     ` Enrico Weigelt
  2010-07-13  6:46       ` Enrico Weigelt
  2010-07-13 10:17       ` Valeo de Vries
  2010-07-13  9:40     ` Avery Pennarun
  1 sibling, 2 replies; 13+ messages in thread
From: Enrico Weigelt @ 2010-07-13  5:31 UTC (permalink / raw)
  To: git

* Enrico Weigelt <weigelt@metux.de> wrote:

<snip>

What's strange:

when I copy pack files from another machine to this box and
run git index-pack on them here, it fails with the same error.

Also: pushing into a new (bare) repo sometimes fails with
inflate errors, and sometimes succeeds but leaves a broken packfile.


cu

* Re: Massive repository corruptions
  2010-07-13  5:31     ` Enrico Weigelt
@ 2010-07-13  6:46       ` Enrico Weigelt
  2010-07-13 10:17       ` Valeo de Vries
  1 sibling, 0 replies; 13+ messages in thread
From: Enrico Weigelt @ 2010-07-13  6:46 UTC (permalink / raw)
  To: git

* Enrico Weigelt <weigelt@metux.de> wrote:
> * Enrico Weigelt <weigelt@metux.de> wrote:
> 
> <snip>
> 
> What's strange:
>
> when I copy pack files from another machine to this box and
> run git index-pack on them here, it fails with the same error.
>
> Also: pushing into a new (bare) repo sometimes fails with
> inflate errors, and sometimes succeeds but leaves a broken packfile.

Interesting: if I limit the pack size in the local repository
and manually copy the packs over via scp, git index-pack runs fine
on them.

A subsequent push doesn't seem to recognize the already transferred
packs and still sends the big one, which gets broken, but running
git-gc multiple times seems to clean up the mess.

Is there any way to limit the pack size on push?
(pack.packSizeLimit only affects git-repack, not remote transfers.)


cu

* Re: Massive repository corruptions
  2010-07-13  5:03   ` Enrico Weigelt
  2010-07-13  5:31     ` Enrico Weigelt
@ 2010-07-13  9:40     ` Avery Pennarun
  2010-07-13 10:22       ` Enrico Weigelt
  1 sibling, 1 reply; 13+ messages in thread
From: Avery Pennarun @ 2010-07-13  9:40 UTC (permalink / raw)
  To: weigelt; +Cc: git

On Tue, Jul 13, 2010 at 1:03 AM, Enrico Weigelt <weigelt@metux.de> wrote:
> * Avery Pennarun <apenwarr@gmail.com> wrote:
>> Do you know which packfiles are corrupted?  Does 'git index-pack' on
>> the files reveal anything?
>
> git@blackwidow ~/metux/work.git/pack $ git index-pack pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
> error: inflate: data stream error (incorrect data check)
> fatal: pack has bad object at offset 37075832: inflate returned -3
>
> (that's essentially the same thing git-gc says)

What's the size of that .pack file?

Avery


* Re: Massive repository corruptions
  2010-07-13  5:31     ` Enrico Weigelt
  2010-07-13  6:46       ` Enrico Weigelt
@ 2010-07-13 10:17       ` Valeo de Vries
  1 sibling, 0 replies; 13+ messages in thread
From: Valeo de Vries @ 2010-07-13 10:17 UTC (permalink / raw)
  To: weigelt; +Cc: git

On 13 July 2010 06:31, Enrico Weigelt <weigelt@metux.de> wrote:
> * Enrico Weigelt <weigelt@metux.de> wrote:
>
> <snip>
>
> What's strange:
>
> when I copy pack files from another machine to this box and
> run git index-pack on them here, it fails with the same error.
>
> Also: pushing into a new (bare) repo sometimes fails with
> inflate errors, and sometimes succeeds but leaves a broken packfile.

The pack files you copied over from another machine, were they sane
(i.e. non-corrupt)? If so, that perhaps smells like your hard drive
could be on its last legs...


* Re: Massive repository corruptions
  2010-07-13  9:40     ` Avery Pennarun
@ 2010-07-13 10:22       ` Enrico Weigelt
  2010-07-13 17:59         ` Avery Pennarun
  0 siblings, 1 reply; 13+ messages in thread
From: Enrico Weigelt @ 2010-07-13 10:22 UTC (permalink / raw)
  To: git

* Avery Pennarun <apenwarr@gmail.com> wrote:
> On Tue, Jul 13, 2010 at 1:03 AM, Enrico Weigelt <weigelt@metux.de> wrote:
> > * Avery Pennarun <apenwarr@gmail.com> wrote:
> >> Do you know which packfiles are corrupted?  Does 'git index-pack' on
> >> the files reveal anything?
> >
> > git@blackwidow ~/metux/work.git/pack $ git index-pack pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
> > error: inflate: data stream error (incorrect data check)
> > fatal: pack has bad object at offset 37075832: inflate returned -3
> >
> > > (that's essentially the same thing git-gc says)
> 
> What's the size of that .pack file?

Somewhat over 300MB.

Lowering the packfile size seemed to help
(but I can still only do that for git-repack, not for remote transfers).


cu

* Re: Massive repository corruptions
  2010-07-13 10:22       ` Enrico Weigelt
@ 2010-07-13 17:59         ` Avery Pennarun
  2010-07-14 13:22           ` Enrico Weigelt
  0 siblings, 1 reply; 13+ messages in thread
From: Avery Pennarun @ 2010-07-13 17:59 UTC (permalink / raw)
  To: weigelt; +Cc: git

On Tue, Jul 13, 2010 at 6:22 AM, Enrico Weigelt <weigelt@metux.de> wrote:
> * Avery Pennarun <apenwarr@gmail.com> wrote:
>> On Tue, Jul 13, 2010 at 1:03 AM, Enrico Weigelt <weigelt@metux.de> wrote:
>> > * Avery Pennarun <apenwarr@gmail.com> wrote:
>> >> Do you know which packfiles are corrupted?  Does 'git index-pack' on
>> >> the files reveal anything?
>> >
>> > git@blackwidow ~/metux/work.git/pack $ git index-pack pack-3b6cbd5dc5f54cf390cfaa479cac6a99d7401375.pack
>> > error: inflate: data stream error (incorrect data check)
>> > fatal: pack has bad object at offset 37075832: inflate returned -3
>> >
>> > (that's essentially the same thing git-gc says)
>>
>> What's the size of that .pack file?
>
> Somewhat over 300MB.
>
> Lowering the packfile size seemed to help
> (but I can still only do that for git-repack, not for remote transfers).

If you got corruption at offset 37,075,832 (about 37 megs) and the
pack is over 300 megs, then the file itself is corrupted right in the
middle (not truncated) and this couldn't have been caused by disk full
errors.  Either you have memory corruption problems, or disk
corruption problems, or filesystem corruption problems.  You'd better
watch out.

Forcing the packfile size to be smaller probably just changes your
memory access patterns and moves your errors around.  But it doesn't
sound like a git bug at this point.
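
The arithmetic behind this argument is short. The sketch below assumes a pack
of exactly 300 MiB; the thread only says "somewhat over 300MB", so the real
fraction is a little smaller. The offset is the one from the index-pack error
message; the function and constant names are invented for the example.

```python
BAD_OFFSET = 37_075_832                  # from "bad object at offset 37075832"
ASSUMED_PACK_SIZE = 300 * 1024 * 1024    # assumption: exactly 300 MiB


def relative_position(bad_offset: int, pack_size: int) -> float:
    """Return how far into the file the bad offset lies, as a fraction 0..1."""
    if not 0 <= bad_offset < pack_size:
        raise ValueError("offset lies outside the file")
    return bad_offset / pack_size


fraction = relative_position(BAD_OFFSET, ASSUMED_PACK_SIZE)
print(f"first bad object is about {fraction:.1%} into the pack")
```

The damage sits roughly an eighth of the way into the file, with a few hundred
megabytes of data after it, which is why a write cut short by a full disk does
not fit the picture.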

Avery


* Re: Massive repository corruptions
  2010-07-13 17:59         ` Avery Pennarun
@ 2010-07-14 13:22           ` Enrico Weigelt
  2010-08-05 17:31             ` Enrico Weigelt
  0 siblings, 1 reply; 13+ messages in thread
From: Enrico Weigelt @ 2010-07-14 13:22 UTC (permalink / raw)
  To: git

* Avery Pennarun <apenwarr@gmail.com> wrote:

> If you got corruption at offset 37,075,832 (about 37 megs) and the
> pack is over 300 megs, then the file itself is corrupted right in the
> middle (not truncated) and this couldn't have been caused by disk full
> errors.  Either you have memory corruption problems, or disk
> corruption problems, or filesystem corruption problems.  You'd better
> watch out.

Hmm, I have no signs of any hardware corruption, but I had a patched
version of zlib installed. Maybe some of my patches broke it,
so that some strange overflow or something like that caused the trouble.

Meanwhile, after reinstalling (unpatched) zlib and recloning the
broken repos, everything seems fine again. Maybe some of you would
like to have a look at my zlib patches ;-o


cu

* Re: Massive repository corruptions
  2010-07-14 13:22           ` Enrico Weigelt
@ 2010-08-05 17:31             ` Enrico Weigelt
       [not found]               ` <AANLkTikFypx3e-=+8J2925A++_jY-aJCDYHHw6dry5s6@mail.gmail.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Enrico Weigelt @ 2010-08-05 17:31 UTC (permalink / raw)
  To: git

* Enrico Weigelt <weigelt@metux.de> wrote:

> Hmm, I have no signs of any hardware corruption, but I had a patched
> version of zlib installed. Maybe some of my patches broke it,
> so that some strange overflow or something like that caused the trouble.
>
> Meanwhile, after reinstalling (unpatched) zlib and recloning the
> broken repos, everything seems fine again. Maybe some of you would
> like to have a look at my zlib patches ;-o

This only seemed to help for a while. I again have trouble with broken
repos. But the strange thing is that it seems to affect only large ones. For
example, I could git clone and repeatedly gc --aggressive the git
source without trouble.

If it *is* a hardware problem (which isn't that implausible, since that
machine is the only one making trouble now), how can I detect it?
Shouldn't broken memory or a broken disk raise some kernel log message?
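
One low-tech way to look for unstable hardware from userspace, sketched here
in Python, is to hash the same large file several times and compare the
results. It assumes nothing else modifies the file in the meantime.
file_sha1 and distinct_digests are names made up for this sketch. Repeated
reads may be served from the page cache rather than from disk, so a clean
result proves little, while differing digests for an unchanged file are a
strong hint that something below git is misbehaving.

```python
import hashlib


def file_sha1(path: str, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-1 of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def distinct_digests(path: str, rounds: int = 5) -> set:
    """Hash the same file several times and return the set of digests seen."""
    return {file_sha1(path) for _ in range(rounds)}
```

The same file_sha1 value can also be computed on the sending machine before a
pack is copied over, to tell damage in transit from damage on the receiving
box.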


cu

* Re: Massive repository corruptions
       [not found]               ` <AANLkTikFypx3e-=+8J2925A++_jY-aJCDYHHw6dry5s6@mail.gmail.com>
@ 2010-08-05 20:10                 ` Enrico Weigelt
       [not found]                   ` <AANLkTi=5HVQ2kSEt7O+OXdMRtvy8amufpFKgRpj2VLEy@mail.gmail.com>
  0 siblings, 1 reply; 13+ messages in thread
From: Enrico Weigelt @ 2010-08-05 20:10 UTC (permalink / raw)
  To: git

* Jussi Sirpoma <jussi.sirpoma@gmail.com> wrote:

> I once had a difficult to trace memory problem on a box when one of the
> last memory banks was bad. It was only used during high load situations
> while compiling the kernel or something similar. The problem was finally
> pinpointed by memtest86 which stresses all memory.

Hmm, do you know some way to do a memory stress test without rebooting?
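
A crude userspace pattern test can be sketched in a few lines of Python. This
is only an illustration of the idea and far weaker than memtest86: the
operating system decides which physical pages back the buffer, memory already
in use by the kernel or by other processes is never touched, and CPU caches
may hide faults. memory_pattern_test and its default patterns are assumptions
made for this example.

```python
def memory_pattern_test(size_bytes: int,
                        patterns=(0x00, 0xFF, 0x55, 0xAA)) -> dict:
    """Fill a buffer with each byte pattern and count bytes that read back wrong.

    Returns a dict mapping each pattern to its number of mismatching bytes;
    on healthy memory every count is 0.
    """
    buf = bytearray(size_bytes)
    mismatches = {}
    for pattern in patterns:
        buf[:] = bytes([pattern]) * size_bytes          # write the pattern
        mismatches[pattern] = size_bytes - buf.count(pattern)  # read it back
    return mismatches


if __name__ == "__main__":
    # 256 MiB is an arbitrary size; pick something close to the free memory.
    print(memory_pattern_test(256 * 1024 * 1024))
```

Any non-zero count would be worth following up with a proper offline memory
test.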


cu

* Re: Massive repository corruptions
       [not found]                   ` <AANLkTi=5HVQ2kSEt7O+OXdMRtvy8amufpFKgRpj2VLEy@mail.gmail.com>
@ 2010-08-05 20:42                     ` Enrico Weigelt
  0 siblings, 0 replies; 13+ messages in thread
From: Enrico Weigelt @ 2010-08-05 20:42 UTC (permalink / raw)
  To: git

* Jussi Sirpoma <jussi.sirpoma@sks.fi> wrote:

> Sorry, not really. Maybe compiling the kernel with lots of jobs would reveal
> some problems?

Okay, I'll give it a try :)

BTW: I'm currently recreating one of the broken repos (mozilla,
which might be large enough for a stress test ;-)) by adding and
fetching the remotes step by step. The repository size is now about 250M
(I reduced the pack size to 32M, since this already helped on some
other repos), and no breakage has occurred yet.

Let's see where it goes ...


cu

