All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Turner <novalis@novalis.org>
To: Jeff King <peff@peff.net>
Cc: Junio C Hamano <gitster@pobox.com>,
	Duy Nguyen <pclouds@gmail.com>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: "disabling bitmap writing, as some objects are not being packed"?
Date: Wed, 08 Feb 2017 17:14:03 -0500	[thread overview]
Message-ID: <1486592043.1938.82.camel@novalis.org> (raw)
In-Reply-To: <20170208190858.rjoqehbhyizlwg5q@sigill.intra.peff.net>

On Wed, 2017-02-08 at 14:08 -0500, Jeff King wrote:
> On Wed, Feb 08, 2017 at 02:05:42PM -0500, David Turner wrote:
> 
> > On Wed, 2017-02-08 at 09:44 -0800, Junio C Hamano wrote:
> > > Duy Nguyen <pclouds@gmail.com> writes:
> > > 
> > > > On second thought, perhaps gc.autoDetach should default to false if
> > > > there's no tty, since its main point it to stop breaking interactive
> > > > usage. That would make the server side happy (no tty there).
> > > 
> > > Sounds like an idea, but wouldn't that keep the end-user coming over
> > > the network waiting after accepting a push until the GC completes, I
> > > wonder.  If an impatient user disconnects, would that end up killing
> > > an ongoing GC?  etc.
> > 
> > Regardless, it's impolite to keep the user waiting. So, I think we
> > should just not write the "too many unreachable loose objects" message
> > if auto-gc is on.  Does that sound OK?
> 
> I thought the point of that message was to prevent auto-gc from kicking
> in over and over again due to objects that won't actually get pruned.
> 
> I wonder if you'd want to either bump the auto-gc object limit, or
> possibly reduce the gc.pruneExpire limit to keep this situation from
> coming up in the first place (or at least mitigating the amount of time
> it's the case).

Auto-gc might not succeed in pruning objects, but it will at least
reduce the number of packs, which is super-important for performance.

I think the intent of automatic gc is to have a git repository be
relatively low-maintenance from a server-operator perspective.  (Side
note: it's fairly trivial for a user with push access to mess with the
check simply by pushing a bunch of objects whose shas start with 17).
It seems odd that git gets itself into a state where it refuses to do
any maintenance just because at some point some piece of the maintenance
didn't make progress.

Sure, I could change my configuration, but that doesn't help the other
folks (e.g. https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=813084 )
who run into this.

I have three thoughts on this:

Idea 1: when gc --auto would issue this message, instead it could create
a file named gc.too-much-garbage (instead of gc.log), with this message.
If that file exists, and it is less than one day (?) old, then we don't
attempt to do a full gc; instead we just run git repack -A -d.  (If it's
more than one day old, we just delete it and continue anyway).

Idea 2 : Like idea 1, but instead of repacking, just smash the existing
packs together into one big pack.  In other words, don't consider
dangling objects, or recompute deltas.  Twitter has a tool called "git
combine-pack" that does this:
https://github.com/dturner-tw/git/blob/dturner/journal/builtin/combine-pack.c

That's less space-efficient than a true repack, but it's no worse than
having the packs separate, and it's a win for read performance because
there's no need to do a linear search over N packs to find an object.

Idea 3: As I suggested last time, separate fatal and non-fatal errors.
If gc fails because of EIO or something, we probably don't want to touch
the disk anymore. But here, the worst consequence is that we waste some
processing power. And it's better to occasionally waste processing power
in a non-interactive setting than it is to do so when a user will be
blocked on it.  So non-fatal warnings should go to gc.log, and fatal
errors should go to gc.fatal.  gc.log won't block gc from running. I
think this is my preferred option.


  reply	other threads:[~2017-02-08 22:14 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-16 21:05 "disabling bitmap writing, as some objects are not being packed"? David Turner
2016-12-16 21:27 ` Jeff King
2016-12-16 21:28 ` Junio C Hamano
2016-12-16 21:32   ` Jeff King
2016-12-16 21:40     ` David Turner
2016-12-16 21:49       ` Jeff King
2016-12-16 23:59         ` [PATCH] pack-objects: don't warn about bitmaps on incremental pack David Turner
2016-12-17  4:04           ` Jeff King
2016-12-19 16:03             ` David Turner
2016-12-17  7:50   ` "disabling bitmap writing, as some objects are not being packed"? Duy Nguyen
2017-02-08  1:03     ` David Turner
2017-02-08  6:45       ` Duy Nguyen
2017-02-08  8:24         ` David Turner
2017-02-08  8:37           ` Duy Nguyen
2017-02-08 17:44             ` Junio C Hamano
2017-02-08 19:05               ` David Turner
2017-02-08 19:08                 ` Jeff King
2017-02-08 22:14                   ` David Turner [this message]
2017-02-08 23:00                     ` Jeff King
2017-02-09  0:18                       ` Junio C Hamano
2017-02-09  1:12                         ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1486592043.1938.82.camel@novalis.org \
    --to=novalis@novalis.org \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=pclouds@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.