All of lore.kernel.org
 help / color / mirror / Atom feed
From: Ulrich Spoerlein <uqs@FreeBSD.org>
To: git@vger.kernel.org, Jeff King <peff@peff.net>
Cc: Ed Maste <emaste@freebsd.org>, Junio C Hamano <gitster@pobox.com>
Subject: Re: git fast-import crashing on big imports
Date: Wed, 18 Jan 2017 15:01:17 +0100	[thread overview]
Message-ID: <20170118140117.GK4426@acme.spoerlein.net> (raw)
In-Reply-To: <20170112082138.GJ4426@acme.spoerlein.net>

Yo Jeff, your commit 8261e1f139db3f8aa6f9fd7d98c876cbeb0f927c from Aug
22nd, that changes delta_base_cache to use hashmap.h is the culprit for
git fast-import crashing on large imports.

Please read below, you can find a 55G SVN dump that should show the
problem after a couple of minutes to less than an hour. Please also see
similar issues from 2009 and 2011. This seems to be a rather fragile
part of the code, could you add unit tests that make sure this
regression is not re-introduce again once you fix it? Thanks!

I'm happy to test any patches that you can provide.

Cheers,
Uli

On Do., 2017-01-12 at 09:21:38 +0100, Ulrich Spörlein wrote:
> Hey,
> 
> the FreeBSD svn2git conversion is crashing somewhat
> non-deterministically during its long conversion process. From memory,
> this was not as bad is it is with more recent versions of git (but I
> can't be sure, really).
> 
> I have a dump file that you can grab at
> http://scan.spoerlein.net/pub/freebsd-base.dump.xz (19G, 55G uncompressed)
> that shows this problem after a couple of minutes of runtime. The caveat is
> that for another member of the team on a different machine the crashes are on
> different revisions.
> 
> Googling around I found two previous threads that were discussing
> problems just like this (memory corruption, bad caching, etc)
> 
> https://www.spinics.net/lists/git/msg93598.html  from 2009
> and
> http://git.661346.n2.nabble.com/long-fast-import-errors-out-quot-failed-to-apply-delta-quot-td6557884.html
> from 2011
> 
> % git fast-import --stats < ../freebsd-base.dump
> ...
> progress SVN r49318 branch master = :49869
> progress SVN r49319 branch stable/3 = :49870
> progress SVN r49320 branch master = :49871
> error: failed to apply delta
> error: bad offset for revindex
> error: bad offset for revindex
> error: bad offset for revindex
> error: bad offset for revindex
> error: bad offset for revindex
> fatal: Can't load tree b35ae4e9c2a41677e84a3f14bed09f584c3ff25e
> fast-import: dumping crash report to fast_import_crash_29613
> 
> 
> fast-import crash report:
>     fast-import process: 29613
>     parent process     : 29612
>     at 2017-01-11 19:33:37 +0000
> 
> fatal: Can't load tree b35ae4e9c2a41677e84a3f14bed09f584c3ff25e
> 
> 
> git fsck shows a somewhat incomplete pack file (I guess that's expected if the
> process dies mid-stream?)
> 
> % git fsck
> Checking object directories: 100% (256/256), done.
> error: failed to apply delta6/614500)
> error: cannot unpack d1d7ee1f81c6767c5e0f75d14d400d7512a85a0f from ./objects/pack/pack-e28fcea43fc221d2ebe92857b484da58bb888237.pack at offset 122654805
> error: failed to apply delta
> error: failed to read delta base object d1d7ee1f81c6767c5e0f75d14d400d7512a85a0f at offset 122654805 from ./objects/pack/pack-e28fcea43fc221d2ebe92857b484da58bb888237.pack
> error: cannot unpack 8523bde63ef34bef725347994fdaec996d756510 from ./objects/pack/pack-e28fcea43fc221d2ebe92857b484da58bb888237.pack at offset 122671596
> error: failed to apply delta0/614500)
> error: failed to read delta base object d1d7ee1f81c6767c5e0f75d14d400d7512a85a0f at offset 122654805 from ./objects/pack/pack-e28fcea43fc221d2ebe92857b484da58bb888237.pack
> ...
> 
> 
> Any comments on whether the original problems from 2009 and 2011 were ever
> fixed and committed?
> 
> Some more facts:
> - git version 2.11.0
> - I don't recall these sorts of crashes with a git from 2-3 years ago
> - adding more checkpoints does help, but not fix the problem, it merely shifts
>   the crashes around to different revisions
> - incremental runs of the conversion *will* complete most of the time, but
>   depending on how often checkpoints are used, I've seen it croak on specific
>   commits and not being able to progress further :(
> 
> Thanks for any pointers or things to try!
> Cheers
> Uli

  reply	other threads:[~2017-01-18 14:31 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-12  8:21 git fast-import crashing on big imports Ulrich Spörlein
2017-01-18 14:01 ` Ulrich Spoerlein [this message]
2017-01-18 14:38   ` Jeff King
2017-01-18 20:06     ` Jeff King
     [not found]       ` <CAJ9axoSzZJXD4RKvVx+D60dw4sakMJWgNmOP-cREWA53Ae3C3w@mail.gmail.com>
2017-01-18 20:27         ` Jeff King
2017-01-18 21:51           ` Jeff King
2017-01-19 14:03             ` Ulrich Spörlein
2017-01-19 16:33               ` [PATCH] clear_delta_base_cache(): don't modify hashmap while iterating Jeff King
2017-01-19 19:16                 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20170118140117.GK4426@acme.spoerlein.net \
    --to=uqs@freebsd.org \
    --cc=emaste@freebsd.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.