All of lore.kernel.org
 help / color / mirror / Atom feed
From: Taylor Blau <me@ttaylorr.com>
To: John Cai <johncai86@gmail.com>
Cc: Taylor Blau <me@ttaylorr.com>, Junio C Hamano <gitster@pobox.com>,
	Christian Couder <christian.couder@gmail.com>,
	Robert Coup <robert.coup@koordinates.com>,
	John Cai via GitGitGadget <gitgitgadget@gmail.com>,
	git <git@vger.kernel.org>, Derrick Stolee <stolee@gmail.com>
Subject: Re: [PATCH v2 0/4] [RFC] repack: add --filter=
Date: Sat, 26 Feb 2022 15:30:03 -0500	[thread overview]
Message-ID: <YhqNy+t5SARNivQ5@nand.local> (raw)
In-Reply-To: <47AC2D8D-ADB2-4280-86F0-6B1E239C1EBE@gmail.com>

On Sat, Feb 26, 2022 at 03:19:11PM -0500, John Cai wrote:
> Thanks for bringing this up again. I meant to write back regarding what you raised
> in the other part of this thread. I think this is a valid concern. To attain the
> goal of offloading certain blobs onto another server(B) and saving space on a git
> server(A), then there will essentially be two steps. One to upload objects to (B),
> and one to remove objects from (A). As you said, these two need to be the inverse of each
> other or else you might end up with missing objects.

Do you mean that you want to offload objects both from a local clone of
some repository, _and_ the original remote it was cloned from?

I don't understand what the role of "another server" is here. If this
proposal was about making it easy to remove objects from a local copy of
a repository based on a filter provided that there was a Git server
elsewhere that could act as a promisor remote, than that makes sense to
me.

But I think I'm not quite understanding the rest of what you're
suggesting.

> > My other concern was around what guarantees we currently provide for a
> > promisor remote. My understanding is that we expect an object which was
> > received from the promisor remote to always be fetch-able later on. If
> > that's the case, then I don't mind the idea of refiltering a repository,
> > provided that you only need to specify a filter once.
>
> Could you clarify what you mean by re-filtering a repository? By that I assumed
> it meant specifying a filter eg: 100mb, and then narrowing it by specifying a
> 50mb filter.

I meant: applying a filter to a local clone (either where there wasn't a
filter before, or a filter which matched more objects) and then removing
objects that don't match the filter.

But your response makes me think of another potential issue. What
happens if I do the following:

    $ git repack -ad --filter=blob:limit=100k
    $ git repack -ad --filter=blob:limit=200k

What should the second invocation do? I would expect that it needs to do
a fetch from the promisor remote to recover any blobs between (100, 200]
KB in size, since they would be gone after the first repack.

This is a problem not just with two consecutive `git repack --filter`s,
I think, since you could cook up the same situation with:

    $ git clone --filter=blob:limit=100k git@github.com:git
    $ git -C git repack -ad --filter=blob:limit=200k

I don't think the existing patches handle this situation, so I'm curious
whether it's something you have considered or not before.

(Unrelated to the above, but please feel free to trim any quoted parts
of emails when responding if they get overly long.)

Thanks,
Taylor

  reply	other threads:[~2022-02-26 20:30 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-01-27  1:49 [PATCH 0/2] repack: add --filter= John Cai via GitGitGadget
2022-01-27  1:49 ` [PATCH 1/2] pack-objects: allow --filter without --stdout John Cai via GitGitGadget
2022-01-27  1:49 ` [PATCH 2/2] repack: add --filter=<filter-spec> option John Cai via GitGitGadget
2022-01-27 15:03   ` Derrick Stolee
2022-01-29 19:14     ` John Cai
2022-01-30  8:16       ` Christian Couder
2022-01-30 13:02       ` John Cai
2022-02-09  2:10 ` [PATCH v2 0/4] [RFC] repack: add --filter= John Cai via GitGitGadget
2022-02-09  2:10   ` [PATCH v2 1/4] pack-objects: allow --filter without --stdout John Cai via GitGitGadget
2022-02-09  2:10   ` [PATCH v2 2/4] repack: add --filter=<filter-spec> option John Cai via GitGitGadget
2022-02-09  2:10   ` [PATCH v2 3/4] upload-pack: allow missing promisor objects John Cai via GitGitGadget
2022-02-09  2:10   ` [PATCH v2 4/4] tests for repack --filter mode John Cai via GitGitGadget
2022-02-17 16:14     ` Robert Coup
2022-02-17 20:36       ` John Cai
2022-02-09  2:27   ` [PATCH v2 0/4] [RFC] repack: add --filter= John Cai
2022-02-16 15:39   ` Robert Coup
2022-02-16 21:07     ` John Cai
2022-02-21  3:11       ` Taylor Blau
2022-02-21 15:38         ` Robert Coup
2022-02-21 17:57           ` Taylor Blau
2022-02-21 21:10         ` Christian Couder
2022-02-21 21:42           ` Taylor Blau
2022-02-22 17:11             ` Christian Couder
2022-02-22 17:33               ` Taylor Blau
2022-02-23 15:40               ` Robert Coup
2022-02-23 19:31               ` Junio C Hamano
2022-02-26 16:01                 ` John Cai
2022-02-26 17:29                   ` Taylor Blau
2022-02-26 20:19                     ` John Cai
2022-02-26 20:30                       ` Taylor Blau [this message]
2022-02-26 21:05                         ` John Cai
2022-02-26 21:44                           ` Taylor Blau
2022-02-22 18:52             ` John Cai
2022-02-22 19:35               ` Taylor Blau

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YhqNy+t5SARNivQ5@nand.local \
    --to=me@ttaylorr.com \
    --cc=christian.couder@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=gitster@pobox.com \
    --cc=johncai86@gmail.com \
    --cc=robert.coup@koordinates.com \
    --cc=stolee@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.