From: Johannes Schindelin <Johannes.Schindelin@gmx.de>
To: Elijah Newren <newren@gmail.com>
Cc: "Sergey Organov" <sorganov@gmail.com>, "Eric Wong" <e@80x24.org>,
"Git Mailing List" <git@vger.kernel.org>,
"Junio C Hamano" <gitster@pobox.com>,
"Derrick Stolee" <stolee@gmail.com>, "Jeff King" <peff@peff.net>,
"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
"Lars Schneider" <larsxschneider@gmail.com>,
"Jonathan Nieder" <jrnieder@gmail.com>
Subject: Re: [RFC PATCH 0/5] Remove git-filter-branch from git.git; host it elsewhere
Date: Mon, 2 Sep 2019 11:29:52 +0200 (CEST)
Message-ID: <nycvar.QRO.7.76.6.1909021111440.46@tvgsbejvaqbjf.bet>
In-Reply-To: <CABPp-BHMXAQGPaBYyg2dtVeN5h8fW8G4YdhddCeAjY5r74BAzw@mail.gmail.com>
Hi Elijah,
On Fri, 30 Aug 2019, Elijah Newren wrote:
> On Fri, Aug 30, 2019 at 1:40 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>
> > [...]
> > In my most recent instance of this, I wanted to publish the script I
> > used to use for submitting patch series to the Git mailing list,
> > maintaining tags for iterations and generating cover letters from branch
> > descriptions and interdiffs (this script eventually became GitGitGadget,
> > https://github.com/gitgitgadget/gitgitgadget/commits?after=6fb0ede48f86e729292ee1542729bc0f5a30cfa6+0
> > demonstrates this).
> >
> > To do that, I ran a `git filter-branch` in the repository where I track
> > all the scripts I deem unsuitable for public consumption, to remove all
> > files but `mail-patch-series.sh`, then pushed it to
> > https://github.com/dscho/mail-patch-series
> >
> > Please note that most crucially, I wanted to rewrite a newly-created
> > branch, and only that branch.
> >
> > Could I have done the same using `git fast-export`, filtering the output
> > with a Perl script, then passing it to `git fast-import`? Sure, I was
> > really tempted to do that. In the end, it took less of _my_ time to just
> > let `git filter-branch` do its work with a not-too-complicated index
> > filter.
>
> Why a perl script? Shouldn't
> git fast-export [--no-data] HEAD -- $PATH | git fast-import --force --quiet
> do the trick? And it's probably simpler and shorter than the index
> filter you used.
Does that not keep the full `$PATH`? I wanted the resulting branch to
have the file in the top-level directory.
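
The concern about the path can be checked with core Git alone. The following is a minimal sketch, assuming a throwaway repository with a made-up layout (`scripts/mail-patch-series.sh` plus one unrelated file); it only inspects the export stream and does not import anything. With a path-limited `git fast-export`, the file-modification (`M`) lines still carry the full path, so the file would not land in the top-level directory unless the stream were edited in between.

```shell
set -e
# Hypothetical throwaway repository; every name below is made up for illustration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

mkdir scripts
echo 'echo hello' > scripts/mail-patch-series.sh
echo 'private' > scripts/other.sh
git add scripts
git commit -q -m "add scripts"

# Limit the export to one path and keep only the file-modification lines.
# --no-data makes the stream refer to blobs by object name instead of inlining them.
stream_paths=$(git fast-export --no-data HEAD -- scripts/mail-patch-series.sh | grep '^M ')
echo "$stream_paths"
```

The printed line names `scripts/mail-patch-series.sh`, not `mail-patch-series.sh`, and the unrelated file does not appear at all. Moving the file to the top level would therefore need an extra rewriting step on the stream, or a tool that renames paths.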
> That said, yeah it'd be nice to get automatic rewriting of commit
> hashes in commit messages and other niceties from filter-repo (e.g.
> future automatic reattaching of notes to the rewritten commits). Some
> questions:
>
> * What's the backup strategy in case you specify the wrong filters
> (e.g. you have a typo in the pathnames)? filter-repo encourages folks
> to make a clone and then filter the fresh clone, because if anything
> goes awry, you can just delete and restart. (I am heavily opposed to
> the refs/original/ backup mechanism used by filter-branch, for
> multiple reasons.) Is your safety stance just "If I mess up it's my
> own fault; do the rewrite?" Or are you okay with cloning before
> filtering?
Please note that the `refs/original/` refs should not have been written
at all anymore, not after reflogs were introduced.
Incidentally, that is my answer to your question: the reflog is my
backup.
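
As a rough illustration of using the reflog as the backup, here is a small sketch in a throwaway repository. It assumes a made-up branch name `demo` and uses `git commit --amend` as a stand-in for the rewrite (an assumption for brevity; any command that updates the branch ref leaves the same kind of reflog entry, as long as reflogs are enabled, which is the default in a repository with a working tree).

```shell
set -e
# Hypothetical throwaway repository and branch name.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"
git symbolic-ref HEAD refs/heads/demo

echo one > file
git add file
git commit -q -m "first"
old_tip=$(git rev-parse demo)

# Stand-in for a history rewrite: the branch now points at a different commit.
git commit -q --amend -m "first (rewritten)"
new_tip=$(git rev-parse demo)

# The previous tip is still recorded in the branch's reflog.
previous=$(git rev-parse 'demo@{1}')

# Undo the rewrite by moving the branch back to the recorded tip.
git reset -q --hard 'demo@{1}'
restored_tip=$(git rev-parse demo)
```

Note that reflog entries expire eventually, and that expiring reflogs and pruning right after a rewrite would remove this safety net.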
> * If you're okay with cloning before filtering...then is there an
> issue with rewriting all branches, and just pushing the one you need?
> (Is there an issue with "this branch is small, the others are huge,
> and filter-branch is slow -- so rewriting one branch saves me lots of
> time"? Or are there other issues at play too?)
I am not okay with cloning before filtering.
First of all, it is wasteful.
Second of all, in my case it would have been *particularly* wasteful
because the repository in question also has quite a few quite large
blobs (hysterical raisins, don't ask).
> * What if the user has auxiliary information for the branch in other
> refs? For example, git-notes pointing at any of the commits, or tags
> in the history of the branch that might be relevant, or perhaps even
> replace refs in combination with GIT_NO_REPLACE_OBJECTS=1? Is this an
> "I don't care, toss that stuff and just rewrite just this branch?"
In my case: there are no notes. The only time when I make heavy use of
notes is in GitGitGadget. I don't use that feature otherwise.
> * filter-repo by default creates new replace references so that you
> can refer to new commit IDs using old (unabbreviated) commit IDs.
> Would that be considered helpful for this usecase? unhelpful?
> irrelevant, since you'll just push the branch you want somewhere and
> nuke the temporary clone?
I definitely did not need that mapping in all of my `git filter-branch`
use cases.
Of course, I can see how it can come in handy in other circumstances,
just not in the ones I experienced so far.
> I'm not by any means ruling out the possibility of documenting --refs
> and adjusting the defaults when it is used so the user can just run
> something like
> git filter-repo --path $PATH --refs $MYBRANCH
> but I feel like I need to understand answers to questions like the
> above ones so that I can know how to phrase warnings and adjust
> defaults and update the documentation.
In all the scenarios where I used `git filter-branch` (some dozen per
year, so not all *that* many), I needed to rewrite one particular
branch, typically a freshly-created one. I never, ever ever needed to
rewrite all the refs in the repository. Not once ;-)
> > In another instance, a long, long time ago, I needed to restart a
> > repository which had included way too many files for its own good, then
> > rename the old repository and start with a fresh `master` that contained
> > but a single commit whose tree was identical to the previous `master`'s
> > tip commit. I simply grafted that commit, ran `git filter-branch` and
> > had precisely what I needed.
>
> filter-repo supports grafts and replace objects, the same as
> filter-branch. (Although, technically, I didn't have to do a thing to
> support it; fast-export does the special handling of rewriting based
> on grafts and replace objects.) So, I'd say this is fully supported.
>
> Side question: the git-replace documents suggest that the graft file
> is deprecated. Are there any timeframes or plans for phasing out
> beyond the git-replace manpage existing? Should I avoid documenting
> the graft file support in filter-repo? Should I include examples
> using not just git-replace but also using the graft file?
I had meant to prepare a patch series to remove `grafts` support that
Junio could carry in `pu` until the time he considers it appropriate to
merge to `master`, but it seems that this task fell through the cracks.
The deprecation itself has been introduced in tags/v2.18.0-rc0~54^2~4,
i.e. it is official as of Git v2.18.0, which was released in mid-June
last year.
My personal gut feeling is that we should let it simmer for another year
before removing support for the `grafts` file (and we may want to update
the label "grafted" when `git log` shows a shallow commit before we
remove that support for `grafts`).
So I'll not work on that patch for now.
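
For reference, the non-deprecated way to get the effect of a graft is `git replace --graft`. The sketch below assumes a throwaway repository with three made-up commits and turns the tip commit into a parentless one, similar in spirit to the "fresh `master` with a single commit" scenario quoted above. It only creates the replace ref; making the change permanent would still require a rewrite afterwards.

```shell
set -e
# Hypothetical throwaway repository with a short linear history.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example"
git config user.email "example@example.com"

for n in 1 2 3; do
  echo "$n" > file
  git add file
  git commit -q -m "commit $n"
done
count_before=$(git rev-list --count HEAD)

# Replace the tip commit with a copy that has no parents (no parent arguments given).
git replace --graft HEAD

# History as seen through the replace ref, and with replace refs ignored.
count_grafted=$(git rev-list --count HEAD)
count_raw=$(GIT_NO_REPLACE_OBJECTS=1 git rev-list --count HEAD)
echo "before=$count_before grafted=$count_grafted raw=$count_raw"
```

An existing `$GIT_DIR/info/grafts` file can be converted to replace refs with `git replace --convert-graft-file`.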
> > I would be _delighted_ if these kinds of use case (rewriting a branch,
> > or even just a commit range) became more of a first-class citizen with
> > `git filter-repo`.
>
> I've got all the pieces for supporting a single branch or a commit
> range (e.g. 'git filter-repo --path foo --refs ^master~4 ^stable~23
> mybranch'), but the defaults (error out unless in a bare repo, move
> refs/remotes/origin/* to refs/heads/*, disconnect origin remote,
> expire reflogs & repack & prune, create new replace references so
> folks can access new commits using old commit IDs) may be somewhat
> friction-filled for this usecase. Those defaults other than the new
> replace refs happen to all be turned off with the combination of
> --force and --target, so, assuming turning them off is what you need,
> you could cheat and just specify 'git filter-repo --force --target .
> --refs $MYBRANCH' today and perhaps get what you want, but that's a
> really non-intuitive command line that is way too ugly to recommend.
> And I don't want to tie myself to '--target .' being the magic sauce
> in the future either.
I agree. I would love for my use cases to become more of first-class
citizens. Maybe `--branch <branch>` could serve as the knob?
What I also found really helpful in `git filter-branch` is that it was
possible to pass one-liner shell scripts directly to the command, giving
a lot of freedom about the transformations. I understand that Python
makes it hard to write spaghetti-code one-liners, so you cannot really
pass the snippet in via the command-line, but I hope there is a way to
script things in `git filter-repo`?
Ciao,
Dscho