All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Robert Coup via GitGitGadget" <gitgitgadget@gmail.com>
To: git@vger.kernel.org
Cc: "Jonathan Tan" <jonathantanmy@google.com>,
	"John Cai" <johncai86@gmail.com>,
	"Jeff Hostetler" <git@jeffhostetler.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Derrick Stolee" <derrickstolee@github.com>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Calvin Wan" <calvinwan@google.com>,
	"Robert Coup" <robert@coup.net.nz>
Subject: [PATCH v4 0/7] fetch: add repair: full refetch without negotiation (was: "refiltering")
Date: Mon, 28 Mar 2022 14:02:04 +0000	[thread overview]
Message-ID: <pull.1138.v4.git.1648476131.gitgitgadget@gmail.com> (raw)
In-Reply-To: <pull.1138.v3.git.1646406274.gitgitgadget@gmail.com>

If a filter is changed on a partial clone repository, for example from
blob:none to blob:limit=1m, there is currently no straightforward way to
bulk-refetch the objects that match the new filter for existing local
commits. This is because the client will report commits as "have" during
fetch negotiation and any dependent objects won't be included in the
transferred pack. Another use case is discussed at [1].

This patch series introduces a --refetch option to fetch & fetch-pack to
enable doing a full fetch without performing any commit negotiation with the
remote, as a fresh clone does. It builds upon cbe566a071 ("negotiator/noop:
add noop fetch negotiator", 2020-08-18).

 * Using --refetch will produce duplicated objects between the existing and
   newly fetched packs, but maintenance will clean them up when it runs
   automatically post-fetch (if enabled).
 * If a user fetches with --refetch applying a more restrictive partial
   clone filter than previously (eg: blob:limit=1m then blob:limit=1k) the
   eventual state is a no-op, since any referenced object already in the
   local repository is never removed. More advanced repacking which could
   improve this scenario is currently proposed at [2].

[1]
https://lore.kernel.org/git/aa7b89ee-08aa-7943-6a00-28dcf344426e@syntevo.com/
[2]
https://lore.kernel.org/git/21ED346B-A906-4905-B061-EDE53691C586@gmail.com/

Changes since v3:

 * Mention fetch --refetch in the remote.<name>.partialclonefilter
   documentation.

Changes since v2:

 * Changed the name from "repair" to "refetch". While it's conceivable to
   use it in some object DB repair situations that's not the focus of these
   changes.
 * Pass config options to maintenance via GIT_CONFIG_PARAMETERS
 * Split out auto-maintenance to a separate & more robust test
 * Minor fixes/improvements from reviews by Junio & Ævar

Changes since RFC (v1):

 * Changed the name from "refilter" to "repair"
 * Removed dependency between server-side support for filtering and repair
 * Added a test case for a shallow clone
 * Post-fetch auto maintenance now strongly encourages
   repacking/consolidation

Reviewed-by: Calvin Wan calvinwan@google.com

Robert Coup (7):
  fetch-negotiator: add specific noop initializer
  fetch-pack: add refetch
  builtin/fetch-pack: add --refetch option
  fetch: add --refetch option
  t5615-partial-clone: add test for fetch --refetch
  fetch: after refetch, encourage auto gc repacking
  docs: mention --refetch fetch option

 Documentation/config/remote.txt           |  6 +-
 Documentation/fetch-options.txt           | 10 +++
 Documentation/git-fetch-pack.txt          |  4 ++
 Documentation/technical/partial-clone.txt |  3 +
 builtin/fetch-pack.c                      |  4 ++
 builtin/fetch.c                           | 34 +++++++++-
 fetch-negotiator.c                        |  5 ++
 fetch-negotiator.h                        |  8 +++
 fetch-pack.c                              | 46 ++++++++-----
 fetch-pack.h                              |  1 +
 remote-curl.c                             |  6 ++
 t/t5616-partial-clone.sh                  | 81 ++++++++++++++++++++++-
 transport-helper.c                        |  3 +
 transport.c                               |  4 ++
 transport.h                               |  4 ++
 15 files changed, 197 insertions(+), 22 deletions(-)


base-commit: abf474a5dd901f28013c52155411a48fd4c09922
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1138%2Frcoup%2Frc-partial-clone-refilter-v4
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1138/rcoup/rc-partial-clone-refilter-v4
Pull-Request: https://github.com/gitgitgadget/git/pull/1138

Range-diff vs v3:

 1:  96a75be3d8a = 1:  6cd6d4a59f6 fetch-negotiator: add specific noop initializer
 2:  04ca6a07f85 = 2:  03f0de3d28c fetch-pack: add refetch
 3:  879d30c4473 = 3:  f7942344ff8 builtin/fetch-pack: add --refetch option
 4:  a503b98f333 = 4:  78501bbf281 fetch: add --refetch option
 5:  01f22e784a5 = 5:  6c17167ac1e t5615-partial-clone: add test for fetch --refetch
 6:  31046625987 = 6:  28c07219fd8 fetch: after refetch, encourage auto gc repacking
 7:  f923a06aab5 ! 7:  da1e6de7a9f doc/partial-clone: mention --refetch fetch option
     @@ Metadata
      Author: Robert Coup <robert@coup.net.nz>
      
       ## Commit message ##
     -    doc/partial-clone: mention --refetch fetch option
     +    docs: mention --refetch fetch option
      
     -    Document it for partial clones as a means to apply a new filter.
     +    Document it for partial clones as a means to apply a new filter, and
     +    reference it from the remote.<name>.partialclonefilter config parameter.
      
          Signed-off-by: Robert Coup <robert@coup.net.nz>
      
     + ## Documentation/config/remote.txt ##
     +@@ Documentation/config/remote.txt: remote.<name>.promisor::
     + 	objects.
     + 
     + remote.<name>.partialclonefilter::
     +-	The filter that will be applied when fetching from this
     +-	promisor remote.
     ++	The filter that will be applied when fetching from this	promisor remote.
     ++	Changing or clearing this value will only affect fetches for new commits.
     ++	To fetch associated objects for commits already present in the local object
     ++	database, use the `--refetch` option of linkgit:git-fetch[1].
     +
       ## Documentation/technical/partial-clone.txt ##
      @@ Documentation/technical/partial-clone.txt: Fetching Missing Objects
         currently fetches all objects referred to by the requested objects, even

-- 
gitgitgadget

  parent reply	other threads:[~2022-03-28 14:02 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-01 15:49 [PATCH 0/6] [RFC] partial-clone: add ability to refetch with expanded filter Robert Coup via GitGitGadget
2022-02-01 15:49 ` [PATCH 1/6] fetch-negotiator: add specific noop initializor Robert Coup via GitGitGadget
2022-02-01 15:49 ` [PATCH 2/6] fetch-pack: add partial clone refiltering Robert Coup via GitGitGadget
2022-02-04 18:02   ` Jonathan Tan
2022-02-11 14:56     ` Robert Coup
2022-02-17  0:05       ` Jonathan Tan
2022-02-01 15:49 ` [PATCH 3/6] builtin/fetch-pack: add --refilter option Robert Coup via GitGitGadget
2022-02-01 15:49 ` [PATCH 4/6] fetch: " Robert Coup via GitGitGadget
2022-02-01 15:49 ` [PATCH 5/6] t5615-partial-clone: add test for --refilter Robert Coup via GitGitGadget
2022-02-01 15:49 ` [PATCH 6/6] doc/partial-clone: mention --refilter option Robert Coup via GitGitGadget
2022-02-01 20:13 ` [PATCH 0/6] [RFC] partial-clone: add ability to refetch with expanded filter Junio C Hamano
2022-02-02 15:02   ` Robert Coup
2022-02-16 13:24     ` Robert Coup
2022-02-02 18:59 ` Jonathan Tan
2022-02-02 21:58   ` Robert Coup
2022-02-02 21:59     ` Robert Coup
2022-02-07 19:37 ` Jeff Hostetler
2022-02-24 16:13 ` [PATCH v2 0/8] fetch: add repair: full refetch without negotiation (was: "refiltering") Robert Coup via GitGitGadget
2022-02-24 16:13   ` [PATCH v2 1/8] fetch-negotiator: add specific noop initializor Robert Coup via GitGitGadget
2022-02-25  6:19     ` Junio C Hamano
2022-02-28 12:22       ` Robert Coup
2022-02-24 16:13   ` [PATCH v2 2/8] fetch-pack: add repairing Robert Coup via GitGitGadget
2022-02-25  6:46     ` Junio C Hamano
2022-02-28 12:14       ` Robert Coup
2022-02-24 16:13   ` [PATCH v2 3/8] builtin/fetch-pack: add --repair option Robert Coup via GitGitGadget
2022-02-24 16:13   ` [PATCH v2 4/8] fetch: " Robert Coup via GitGitGadget
2022-02-24 16:13   ` [PATCH v2 5/8] t5615-partial-clone: add test for fetch --repair Robert Coup via GitGitGadget
2022-02-24 16:13   ` [PATCH v2 6/8] maintenance: add ability to pass config options Robert Coup via GitGitGadget
2022-02-25  6:57     ` Junio C Hamano
2022-02-28 12:02       ` Robert Coup
2022-02-28 17:07         ` Junio C Hamano
2022-02-25 10:29     ` Ævar Arnfjörð Bjarmason
2022-02-28 11:51       ` Robert Coup
2022-02-24 16:13   ` [PATCH v2 7/8] fetch: after repair, encourage auto gc repacking Robert Coup via GitGitGadget
2022-02-28 16:40     ` Ævar Arnfjörð Bjarmason
2022-02-24 16:13   ` [PATCH v2 8/8] doc/partial-clone: mention --repair fetch option Robert Coup via GitGitGadget
2022-02-28 16:43   ` [PATCH v2 0/8] fetch: add repair: full refetch without negotiation (was: "refiltering") Ævar Arnfjörð Bjarmason
2022-02-28 17:27     ` Robert Coup
2022-02-28 18:54       ` [PATCH v2 0/8] fetch: add repair: full refetch without negotiation Junio C Hamano
2022-02-28 22:20       ` [PATCH v2 0/8] fetch: add repair: full refetch without negotiation (was: "refiltering") Ævar Arnfjörð Bjarmason
2022-03-04 15:04   ` [PATCH v3 0/7] " Robert Coup via GitGitGadget
2022-03-04 15:04     ` [PATCH v3 1/7] fetch-negotiator: add specific noop initializer Robert Coup via GitGitGadget
2022-03-04 15:04     ` [PATCH v3 2/7] fetch-pack: add refetch Robert Coup via GitGitGadget
2022-03-04 15:04     ` [PATCH v3 3/7] builtin/fetch-pack: add --refetch option Robert Coup via GitGitGadget
2022-03-04 15:04     ` [PATCH v3 4/7] fetch: " Robert Coup via GitGitGadget
2022-03-04 21:19       ` Junio C Hamano
2022-03-07 11:31         ` Robert Coup
2022-03-07 17:27           ` Junio C Hamano
2022-03-09 10:00             ` Robert Coup
2022-03-04 15:04     ` [PATCH v3 5/7] t5615-partial-clone: add test for fetch --refetch Robert Coup via GitGitGadget
2022-03-04 15:04     ` [PATCH v3 6/7] fetch: after refetch, encourage auto gc repacking Robert Coup via GitGitGadget
2022-03-04 15:04     ` [PATCH v3 7/7] doc/partial-clone: mention --refetch fetch option Robert Coup via GitGitGadget
2022-03-09  0:27     ` [PATCH v3 0/7] fetch: add repair: full refetch without negotiation (was: "refiltering") Calvin Wan
2022-03-09  9:57       ` Robert Coup
2022-03-09 21:32         ` [PATCH v3 0/7] fetch: add repair: full refetch without negotiation Junio C Hamano
2022-03-10  1:07           ` Calvin Wan
2022-03-10 14:29           ` Robert Coup
2022-03-21 17:58             ` Calvin Wan
2022-03-21 21:34               ` Robert Coup
2022-03-28 14:02     ` Robert Coup via GitGitGadget [this message]
2022-03-28 14:02       ` [PATCH v4 1/7] fetch-negotiator: add specific noop initializer Robert Coup via GitGitGadget
2022-03-28 14:02       ` [PATCH v4 2/7] fetch-pack: add refetch Robert Coup via GitGitGadget
2022-03-31 15:09         ` Ævar Arnfjörð Bjarmason
2022-04-01 10:26           ` Robert Coup
2022-03-28 14:02       ` [PATCH v4 3/7] builtin/fetch-pack: add --refetch option Robert Coup via GitGitGadget
2022-03-28 14:02       ` [PATCH v4 4/7] fetch: " Robert Coup via GitGitGadget
2022-03-31 15:18         ` Ævar Arnfjörð Bjarmason
2022-04-01 10:31           ` Robert Coup
2022-03-28 14:02       ` [PATCH v4 5/7] t5615-partial-clone: add test for fetch --refetch Robert Coup via GitGitGadget
2022-03-31 15:20         ` Ævar Arnfjörð Bjarmason
2022-04-01 10:36           ` Robert Coup
2022-03-28 14:02       ` [PATCH v4 6/7] fetch: after refetch, encourage auto gc repacking Robert Coup via GitGitGadget
2022-03-31 15:22         ` Ævar Arnfjörð Bjarmason
2022-04-01 10:51           ` Robert Coup
2022-03-28 14:02       ` [PATCH v4 7/7] docs: mention --refetch fetch option Robert Coup via GitGitGadget
2022-03-28 17:38         ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=pull.1138.v4.git.1648476131.gitgitgadget@gmail.com \
    --to=gitgitgadget@gmail.com \
    --cc=avarab@gmail.com \
    --cc=calvinwan@google.com \
    --cc=derrickstolee@github.com \
    --cc=git@jeffhostetler.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johncai86@gmail.com \
    --cc=jonathantanmy@google.com \
    --cc=robert@coup.net.nz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.