Git Mailing List Archive on lore.kernel.org
From: "SZEDER Gábor" <szeder.dev@gmail.com>
To: "Ævar Arnfjörð Bjarmason" <avarab@gmail.com>
Cc: "Derrick Stolee" <stolee@gmail.com>,
	"Git List" <git@vger.kernel.org>,
	"Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Subject: Re: We should add a "git gc --auto" after "git clone" due to commit graph
Date: Wed, 3 Oct 2018 18:59:26 +0200
Message-ID: <20181003165926.GR23446@localhost>
In-Reply-To: <87murvgir6.fsf@evledraar.gmail.com>

On Wed, Oct 03, 2018 at 05:19:41PM +0200, Ævar Arnfjörð Bjarmason wrote:
> >> >> >> So we should make "git gc --auto" be run on clone,
> >> >> >
> >> >> > There is no garbage after 'git clone'...
> >> >>
> >> >> "git gc" is really "git gc-or-create-indexes" these days.
> >> >
> >> > Because it happens to be convenient to create those indexes at
> >> > gc-time.  But that should not be an excuse to run gc when by
> >> > definition no gc is needed.
> >>
> >> Ah, I thought you just had an objection to the "gc" name being used for
> >> non-gc stuff,
> >
> > But you thought right, I do have an objection against that.  'git gc'
> > should, well, collect garbage.  Any non-gc stuff is already violating
> > separation of concerns.
> 
> Ever since git-gc was added back in 30f610b7b0 ("Create 'git gc' to
> perform common maintenance operations.", 2006-12-27) it has been
> described as:
> 
>     git-gc - Cleanup unnecessary files and optimize the local repository
> 
> Creating these indexes like the commit-graph falls under "optimize the
> local repository",

But it doesn't fall under "cleanup unnecessary files".  If anything,
the commit-graph file is itself an unnecessary file, since, strictly
speaking, it's purely an optimization.

That description came about because cleaning up unnecessary files,
notably combining lots of loose refs into a single packed-refs file
and combining lots of loose objects and pack files into a single pack
file, could not only make the repository smaller (barring too many
exploding unreachable objects), but, as it turned out, could also make
Git operations in that repository faster.

To me, the main goal of the command is cleanup.  Optimization, however
beneficial, is its side effect, and I assume the "optimize" part was
added to the description mainly to inform and "encourage" users.
After all, the command is called 'git gc', not 'git optimize-repo'.

> and 3rd party tools (e.g. the repo tool doing this
> came up on list recently) have been calling "gc --auto" with this
> assumption.
> 
> >>  but if you mean we shouldn't do a giant repack right after
> >> clone I agree.
> >
> > And, I also mean that since 'git clone' knows that there can't
> > possibly be any garbage in the first place, then it shouldn't call 'gc
> > --auto' at all.  However, since it also knows that there is a lot of
> > new stuff, then it should create a commit-graph if enabled.
> 
> Is this something you think just because the tool isn't called
> git-gc-and-optimize, or do you think this regardless of what it's
> called?

Well, that still has 'gc' in it...

> I don't see how splitting up the entry points for "detect if we need to
> cleanup or optimize the repo" leaves us with a better codebase for the
> reasons noted in
> https://public-inbox.org/git/87pnwrgll2.fsf@evledraar.gmail.com/

Such a separation would be valuable for those having gc.auto = 0 in
their config.  Or, in general, to have a clearly marked entry point to
update all the enabled "purely-optimization" files without 'gc'
exploding a bunch of "just-became-unreachable" objects from deleted
reflog entries and packfiles, or without performing a comparatively
expensive repacking.  Note the "clearly marked"; I don't think
teaching 'gc [--auto]' various tricks to only create/update these
files without doing what it is fundamentally supposed to do qualifies
for that.
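
To illustrate the kind of separation I mean, here is a rough sketch
using only existing commands.  It assumes Git 2.19 or later, where
'git commit-graph write --reachable' is available; the scratch
repository and the committer identity are made up for the example.
It is not a proposal for what the dedicated entry point should look
like, only a demonstration that the commit-graph can be updated with
auto-gc turned off and without any repacking or pruning.

```shell
#!/bin/sh
# Sketch only: the scratch repository and identity below are hypothetical.
set -e

repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "initial commit"

# Turn off automatic gc, i.e. the gc.auto = 0 case mentioned above.
git config gc.auto 0

# Write the commit-graph directly.  This neither repacks objects nor
# prunes unreachable ones.
git commit-graph write --reachable

# Without --split, the file is written under objects/info/.
ls -l .git/objects/info/commit-graph
```

Running 'git gc --auto' in such a repository would do nothing because
of gc.auto = 0, so the explicit 'commit-graph write' is the only thing
that creates the file here.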




