git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "James Ramsay" <james@jramsay.com.au>
To: git@vger.kernel.org
Subject: [TOPIC 6/17] GC strategies
Date: Thu, 12 Mar 2020 15:01:54 +1100	[thread overview]
Message-ID: <B459AEB1-BF77-44C9-B06A-4B96C2E22287@jramsay.com.au> (raw)
In-Reply-To: <AC2EB721-2979-43FD-922D-C5076A57F24B@jramsay.com.au>

1. Jonathan N: Git has a flexible packfile format. Compared to CVS where 
things are stored as deltas against the next revision of the same file. 
GC can be a huge operation if it’s not done regularly. "git gc" makes 
one huge pack. Better amortized behavior to have multiple packs with 
exponentially increasing size and combine them when needed (Martin 
Dick's exproll).

2. Jonathan N: There are also unreachable objects to take care about. GC 
can/should delete them. But at the same time someone else might be 
creating history that still needs those objects. To give objects a grace 
period, we turn the unused objects into loose objects and look at the 
creation time. But alternatively there’s the proposal to move these 
unreachable objects into a packfile for all these objects. But this can 
be a problem for older git clients, because they might not know the pack 
is garbage and might move objects across packs. See the hash function 
transition doc for details.

3. Terry: JGit has these unreachable garbage packs

4. Peff: You want to solve this loose objects explosion problem?

5. Peff: what if you reference an object in the garbage pack from an 
object in a non-garbage pack?

6. Jonathan N: At GC time the object from the garbage pack is copied to 
a non-garbage pack. Basically rescue it from the garbage. It only saves 
the referenced objects, not the whole garbage pack.

7. Jonathan N: It has been running in production for >2 years.

8. Peff: There are so many non-atomic operations that can happen. And 
races can happen.

9. Jonathan N: If you find races, please comment on the JGit change that 
describes the algorithm. Happens-before relation and grace period.

10. `git gc --prune-now` should no longer create loose objects first, 
before just deleting them.

  parent reply	other threads:[~2020-03-12  4:02 UTC|newest]

Thread overview: 125+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-12  3:55 Notes from Git Contributor Summit, Los Angeles (April 5, 2020) James Ramsay
2020-03-12  3:56 ` [TOPIC 1/17] Reftable James Ramsay
2020-03-12  3:56 ` [TOPIC 2/17] Hooks in the future James Ramsay
2020-03-12 14:16   ` Emily Shaffer
2020-03-13 17:56     ` Junio C Hamano
2020-04-07 23:01       ` Emily Shaffer
2020-04-07 23:51         ` Emily Shaffer
2020-04-08  0:40           ` Junio C Hamano
2020-04-08  1:09             ` Emily Shaffer
2020-04-10 21:31           ` Jeff King
2020-04-13 19:15             ` Emily Shaffer
2020-04-13 21:52               ` Jeff King
2020-04-14  0:54                 ` [RFC PATCH v2 0/2] configuration-based hook management (was: [TOPIC 2/17] Hooks in the future) Emily Shaffer
2020-04-14  0:54                   ` [RFC PATCH v2 1/2] hook: scaffolding for git-hook subcommand Emily Shaffer
2020-04-14  0:54                   ` [RFC PATCH v2 2/2] hook: add --list mode Emily Shaffer
2020-04-14 15:15                   ` [RFC PATCH v2 0/2] configuration-based hook management Phillip Wood
2020-04-14 19:24                     ` Emily Shaffer
2020-04-14 20:27                       ` Jeff King
2020-04-15 10:01                         ` Phillip Wood
2020-04-14 20:03                     ` Josh Steadmon
2020-04-15 10:08                       ` Phillip Wood
2020-04-14 20:32                     ` Jeff King
2020-04-15 10:01                       ` Phillip Wood
2020-04-15 14:51                         ` Junio C Hamano
2020-04-15 20:30                           ` Emily Shaffer
2020-04-15 22:19                             ` Junio C Hamano
2020-04-15  3:45                 ` [TOPIC 2/17] Hooks in the future Jonathan Nieder
2020-04-15 20:59                   ` Emily Shaffer
2020-04-20 23:53                     ` [PATCH] doc: propose hooks managed by the config Emily Shaffer
2020-04-21  0:22                       ` Emily Shaffer
2020-04-21  1:20                         ` Junio C Hamano
2020-04-24 23:14                           ` Emily Shaffer
2020-04-25 20:57                       ` brian m. carlson
2020-05-06 21:33                         ` Emily Shaffer
2020-05-06 23:13                           ` brian m. carlson
2020-05-19 20:10                           ` Emily Shaffer
2020-04-15 22:42                   ` [TOPIC 2/17] Hooks in the future Jeff King
2020-04-15 22:48                     ` Emily Shaffer
2020-04-15 22:57                       ` Jeff King
2020-03-12  3:57 ` [TOPIC 3/17] Obliterate James Ramsay
2020-03-12 18:06   ` Konstantin Ryabitsev
2020-03-15 22:19   ` Damien Robert
2020-03-16 12:55     ` Konstantin Tokarev
2020-03-26 22:27       ` Damien Robert
2020-03-16 16:32     ` Elijah Newren
2020-03-26 22:30       ` Damien Robert
2020-03-16 18:32     ` Phillip Susi
2020-03-26 22:37       ` Damien Robert
2020-03-16 20:01     ` Philip Oakley
2020-05-16  2:21       ` nbelakovski
2020-03-12  3:58 ` [TOPIC 4/17] Sparse checkout James Ramsay
2020-03-12  4:00 ` [TOPIC 5/17] Partial Clone James Ramsay
2020-03-17  7:38   ` Allowing only blob filtering was: " Christian Couder
2020-03-17 20:39     ` [RFC PATCH 0/2] upload-pack.c: limit allowed filter choices Taylor Blau
2020-03-17 20:39       ` [RFC PATCH 1/2] list_objects_filter_options: introduce 'list_object_filter_config_name' Taylor Blau
2020-03-17 20:53         ` Eric Sunshine
2020-03-18 10:03           ` Jeff King
2020-03-18 19:40             ` Junio C Hamano
2020-03-18 22:38             ` Eric Sunshine
2020-03-19 17:15               ` Jeff King
2020-03-18 21:05           ` Taylor Blau
2020-03-17 20:39       ` [RFC PATCH 2/2] upload-pack.c: allow banning certain object filter(s) Taylor Blau
2020-03-17 21:11         ` Eric Sunshine
2020-03-18 21:18           ` Taylor Blau
2020-03-18 11:18         ` Philip Oakley
2020-03-18 21:20           ` Taylor Blau
2020-03-18 10:18       ` [RFC PATCH 0/2] upload-pack.c: limit allowed filter choices Jeff King
2020-03-18 18:26         ` Re*: " Junio C Hamano
2020-03-19 17:03           ` Jeff King
2020-03-18 21:28         ` Taylor Blau
2020-03-18 22:41           ` Junio C Hamano
2020-03-19 17:10             ` Jeff King
2020-03-19 17:09           ` Jeff King
2020-04-17  9:41         ` Christian Couder
2020-04-17 17:40           ` Taylor Blau
2020-04-17 18:06             ` Jeff King
2020-04-21 12:34               ` Christian Couder
2020-04-22 20:41                 ` Taylor Blau
2020-04-22 20:42               ` Taylor Blau
2020-04-21 12:17             ` Christian Couder
2020-03-12  4:01 ` James Ramsay [this message]
2020-03-12  4:02 ` [TOPIC 7/17] Background operations/maintenance James Ramsay
2020-03-12  4:03 ` [TOPIC 8/17] Push performance James Ramsay
2020-03-12  4:04 ` [TOPIC 9/17] Obsolescence markers and evolve James Ramsay
2020-05-09 21:31   ` Noam Soloveichik
2020-05-15 22:26     ` Jeff King
2020-03-12  4:05 ` [TOPIC 10/17] Expel ‘git shell’? James Ramsay
2020-03-12  4:07 ` [TOPIC 11/17] GPL enforcement James Ramsay
2020-03-12  4:08 ` [TOPIC 12/17] Test harness improvements James Ramsay
2020-03-12  4:09 ` [TOPIC 13/17] Cross implementation test suite James Ramsay
2020-03-12  4:11 ` [TOPIC 14/17] Aspects of merge-ort: cool, or crimes against humanity? James Ramsay
2020-03-12  4:13 ` [TOPIC 15/17] Reachability checks James Ramsay
2020-03-12  4:14 ` [TOPIC 16/17] “I want a reviewer” James Ramsay
2020-03-12 13:31   ` Emily Shaffer
2020-03-12 17:31     ` Konstantin Ryabitsev
2020-03-12 17:42       ` Jonathan Nieder
2020-03-12 18:00         ` Konstantin Ryabitsev
2020-03-17  0:43     ` Philippe Blain
2020-03-13 21:25   ` Eric Wong
2020-03-14 17:27     ` Jeff King
2020-03-15  0:36       ` inbox indexing wishlist [was: [TOPIC 16/17] “I want a reviewer”] Eric Wong
2020-03-12  4:16 ` [TOPIC 17/17] Security James Ramsay
2020-03-12 14:38 ` Notes from Git Contributor Summit, Los Angeles (April 5, 2020) Derrick Stolee
2020-03-13 20:47 ` Jeff King
2020-03-15 18:42 ` Jakub Narebski
2020-03-16 19:31   ` Jeff King
  -- strict thread matches above, loose matches on Subject: below --
2019-12-10  2:33 [PATCH 0/6] configuration-based hook management Emily Shaffer
2019-12-10  2:33 ` [PATCH 1/6] hook: scaffolding for git-hook subcommand Emily Shaffer
2019-12-12  9:41   ` Bert Wesarg
2019-12-12 10:47   ` SZEDER Gábor
2019-12-10  2:33 ` [PATCH 2/6] config: add string mapping for enum config_scope Emily Shaffer
2019-12-10 11:16   ` Philip Oakley
2019-12-10 17:21     ` Philip Oakley
2019-12-10  2:33 ` [PATCH 3/6] hook: add --list mode Emily Shaffer
2019-12-12  9:38   ` Bert Wesarg
2019-12-12 10:58   ` SZEDER Gábor
2019-12-10  2:33 ` [PATCH 4/6] hook: support reordering of hook list Emily Shaffer
2019-12-11 19:21   ` Junio C Hamano
2019-12-10  2:33 ` [PATCH 5/6] hook: remove prior hook with '---' Emily Shaffer
2019-12-10  2:33 ` [PATCH 6/6] hook: teach --porcelain mode Emily Shaffer
2019-12-11 19:33   ` Junio C Hamano
2019-12-11 22:00     ` Emily Shaffer
2019-12-11 22:07       ` Junio C Hamano
2019-12-11 23:15         ` Emily Shaffer
2019-12-11 22:42 ` [PATCH 0/6] configuration-based hook management Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=B459AEB1-BF77-44C9-B06A-4B96C2E22287@jramsay.com.au \
    --to=james@jramsay.com.au \
    --cc=git@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).