From: David Turner <novalis@novalis.org>
To: git@vger.kernel.org
Cc: Christian Couder <christian.couder@gmail.com>
Subject: Simultaneous gc and repack
Date: Thu, 13 Apr 2017 13:31:38 -0400
Message-ID: <1492104698.1527.79.camel@novalis.org>

git gc locks the repository (using a gc.pid file) so that other gcs
don't run concurrently. But git repack doesn't respect this lock, so
it's possible to have a repack running at the same time as a gc.  When
the repack deletes packs out from under the gc, the gc gets sad and
fails with: "fatal: ./objects/pack/pack-$sha.pack cannot be accessed".
Then it dies, leaving a large temp file hanging around.

Does the following seem reasonable?

1. Make git repack, by default, check for a gc.pid file (using the same
logic as git gc itself does).
2. Provide a --force option to git repack to ignore said check.
3. Make git gc provide that --force option when it calls repack under
its own lock.
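
To make the proposal concrete, here is a rough sketch of the check in
Python rather than C. It is not git's actual code. It assumes gc.pid
holds "<pid> <hostname>", that a lock older than 12 hours is stale, and
that a lock held by another host is honored; all of that is my
approximation of what git gc does. The names gc_lock_holder and
may_repack are made up for illustration. POSIX only, since it probes
the PID with signal 0.

```python
import os
import socket
import time

# Assumed staleness cutoff; an approximation, not a value from this message.
STALE_AFTER_SECONDS = 12 * 3600


def pid_is_alive(pid):
    """Return True if a process with this PID appears to exist locally."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without signalling
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def gc_lock_holder(git_dir, now=None):
    """Return (pid, host) if a gc appears to hold gc.pid, else None."""
    path = os.path.join(git_dir, "gc.pid")
    try:
        mtime = os.stat(path).st_mtime
        with open(path) as f:
            fields = f.read().split()
    except FileNotFoundError:
        return None  # no lock file, so no gc is running
    try:
        pid, host = int(fields[0]), fields[1]
    except (IndexError, ValueError):
        return None  # unparseable lock file: treated here as no lock
    now = time.time() if now is None else now
    if now - mtime > STALE_AFTER_SECONDS:
        return None  # lock is too old to trust
    if host == socket.gethostname() and not pid_is_alive(pid):
        return None  # local holder has died, lock is stale
    return pid, host


def may_repack(git_dir, force=False):
    """Steps 1 and 2: refuse to repack under a live gc lock unless forced."""
    return force or gc_lock_holder(git_dir) is None
```

Under step 3, git gc would be the caller that passes force=True, since
it already holds the lock itself.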

This came up because GitLab runs a repack after every N pushes and a gc
after every M commits, where M >> N.  Sometimes, when pushes come in
rapidly, the repack catches the still-running gc and the above badness
happens.  At least, that's my understanding: I don't run our GitLab
servers, but I talked to the person who does and that's what he said.

Of course, GitLab could do its own locking, but the general approach
seems like it would help other folks too.
