git.vger.kernel.org archive mirror
From: Sergey Organov <sorganov@gmail.com>
To: Elijah Newren <newren@gmail.com>
Cc: "Eric Wong" <e@80x24.org>,
	"Git Mailing List" <git@vger.kernel.org>,
	"Junio C Hamano" <gitster@pobox.com>,
	"Derrick Stolee" <stolee@gmail.com>, "Jeff King" <peff@peff.net>,
	"Ævar Arnfjörð Bjarmason" <avarab@gmail.com>,
	"Johannes Schindelin" <Johannes.Schindelin@gmx.de>,
	"Lars Schneider" <larsxschneider@gmail.com>,
	"Jonathan Nieder" <jrnieder@gmail.com>
Subject: Re: [RFC PATCH 0/5] Remove git-filter-branch from git.git; host it elsewhere
Date: Wed, 28 Aug 2019 22:03:06 +0300
Message-ID: <87muftqfit.fsf@osv.gnss.ru>
In-Reply-To: <CABPp-BEYRmhrb4Tx3bGzkx8y53T_0BYhLE5J0cEmxj18WtZs9A@mail.gmail.com> (Elijah Newren's message of "Wed, 28 Aug 2019 10:16:18 -0700")

Hi Elijah,

Elijah Newren <newren@gmail.com> writes:

> Hi Sergey,
>
> On Wed, Aug 28, 2019 at 1:52 AM Sergey Organov <sorganov@gmail.com> wrote:
>>
>> Elijah Newren <newren@gmail.com> writes:
>>
>> > On Tue, Aug 27, 2019 at 1:43 AM Sergey Organov <sorganov@gmail.com> wrote:
>> >>
>> >> Eric Wong <e@80x24.org> writes:
>> >>
>> >>
>> >> [...]

[...]

>> >
>> > Side note: Is the goal to "fix names and email addresses in this
>> > repository"?  If so, this guide fails: it doesn't update tagger names
>> > or email addresses.  Indeed, filter-branch doesn't provide a way to do
>> > that.  (Not to mention other problems like not updating references to
>> > commit hashes in commit messages when it is busy rewriting everything.)
>>
>> No. Maybe the original goal was like that, but I, personally, use a
>> modified version of this to change my "Author" credentials from
>> "internal" to "public" in branches that I'm going to send upstream, so
>> the actual aim is to change the e-mail of a particular Author from a@b
>> to c@d in all the commits in a (feature) branch.
>
> There's an interesting usecase I hadn't heard of or thought of before.
> Quick question to see if I'm understanding correctly: "all commits in
> a branch" or "all commits *unique* to a branch"?
>
> (Perhaps the only commits with the author you want to change are among
> the commits that are unique to that branch and so the distinction
> doesn't matter, but it wasn't clear from the description.)

Yes, this is exactly the case for me, as I'm changing an entirely linear
topic branch that is going to become a patch series to send out. No
complications.

>
>> >> > But I agree that filter-branch isn't useful and certainly
>> >> > shouldn't be encouraged/promoted.
>> >>
>> >> Well, is there a more suitable way to change the author for a (large)
>> >> set of commits then?
>> >
>> > I would say yes, use git filter-repo (note that this thread started
>> > with me proposing filter-repo for inclusion in git.git -- and getting
>> > suggestions that we should remove stuff instead of adding more stuff).
>> > I'm biased, but I think it's much better at this particular job as
>> > well:
>>
>> Well, I don't want to change the entire repo, and I don't immediately
>> see how to do it with git filter-repo. Is it at all possible?
>
> Yes, it is possible.  filter-repo has a hidden --refs argument
> defaulting to --all; you could instead set it to e.g.
> origin/master..master.

Cool!

>
> --refs is the only hidden option in filter-repo.  I know it may look
> funny that I spent a bunch of effort to create the
> --reference-excluded-parents option to fast-export explicitly so that
> it would be possible to do partial history rewrites like this, and
> then to hide and avoid documenting this option (though I did hint that
> it existed in the documentation if you search for "Partial-repo
> filtering"), but there were a few reasons for this:
>
>   * mixing old and new history for most rewrites that
> filter-branch/filter-repo/bfg/etc are used for can really mess things
> up and make it hard to recover from.  I don't like trying to clean up
> repos with accidental duplicate copies of most commits in the repo,
> and I suspect others like it even less.  So, anything that makes it
> easier to make such mistakes needs to have a really good rationale in
> order for me to expose it.
>   * The only usecases I knew of for partial repo filtering prior to
> your email were (1) side-stepping insanely slow execution time of poor
> filtering tools like filter-branch, and (2) performing operations
> better suited to git-rebase anyway (e.g. the --signoff option to
> rebase did not exist once upon a time and so folks could have used
> filter-branch to fake it, but using rebase is the better way to make
> this change).  And, even after your email, I'm not sure that has
> changed though, as noted below.

Yeah, I share your worries.

[...]

>> Actually, I'd rather expect some support for this in "git rebase", it
>> being git's history editing/reshaping tool, but it looks like it only
>> has it in a form that is very difficult to automate.
>
> I agree that git rebase would be the better choice here; I typically
> feel it's the better choice for rewrites of recent history.  I think
> it provides just what you need:
>
>   git rebase --exec="git commit --amend --reset-author -C HEAD" $UPSTREAM
>
> (Assuming, of course, that you've either set the right environment
> variables or set user.name and user.email to the new values you want
> so that commit's --reset-author flag can reset to the *new* author.)

This should do the trick for me most of the time, thanks a lot for the clue!

However, the script that I'm using doesn't change _all_ the authors, it
only changes those that match the one particular author specified in
the script. I haven't actually needed this feature yet, but I can well
imagine that I will have commits by other author(s) in the branch, and
I won't want to attribute their work to myself.

Hmm... That said, using the generic "--exec" to "git rebase" I could
probably come up with a script that checks the Author of the latest
commit and chooses whether to rewrite it or not. Nothing terribly
complex.
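
The conditional rewrite described above might be sketched as follows.
This is only an illustration under stated assumptions, not a recipe
taken from this thread: the function name rewrite_author and the
variables OLD_EMAIL and NEW_AUTHOR are made up for the example, and it
compares only the author e-mail (%ae) of each replayed commit.

```shell
# Hypothetical helper: replay the commits in <upstream>..HEAD and amend
# only those whose author e-mail equals <old-email>.
# usage: rewrite_author <upstream> <old-email> "New Name <new@email>"
rewrite_author() {
    # OLD_EMAIL and NEW_AUTHOR (names invented for this sketch) are passed
    # through the environment so the one-line --exec command can read them.
    OLD_EMAIL=$2 NEW_AUTHOR=$3 git rebase --exec \
        'test "$(git log -1 --format=%ae)" != "$OLD_EMAIL" || git commit --amend --no-edit --author="$NEW_AUTHOR"' \
        "$1"
}
```

For instance, something like rewrite_author origin/master a@b "Name <c@d>"
would re-attribute only the commits authored as a@b and keep the
authorship of everybody else's commits as it is.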

>
>> >> > Yet there are probably still users who ARE happy with it, that
>> >> > will never hit the edge cases and problems it poses; and will
>> >> > never read release notes.  And said users are probably getting
>> >> > git from a slow-moving distro, so it'd be a disservice to them
>> >> > if they lost a tool they depend on without any warning.
>> >>
>> >> Personally, I'm far from happy with it, but I have no clue how to
>> >> substitute it in the job above. Anybody?
>> >
>> > The start of this thread where I proposed git filter-repo for
>> > inclusion in git[1] had links to documentation and comparisons to
>> > other tools and such.  You may find those links helpful; if not, let
>> > me know what needs to be fixed in the documentation.
>>
>> Thank you for the references, I find it a very nice tool to have!
>>
>> Pity it's not an entire substitute for git filter-branch.
>
> Au contraire, I believe it is.  :-)

I take your word for it :-)

>
> Thanks for the interesting usecase.  It sounds like we both think this
> one happens to be better solved by rebase, and the command snippet I
> provided above should show you how to use rebase to solve it.  However, if
> you come up with any others where partial repo filtering makes sense,
> I'm always willing to reconsider my decision to make the --refs
> argument hidden; it may just mean adding more warnings, but it might
> also involve changing other defaults (e.g. the automatic
> repacking/pruning).  I'd need concrete usecases to know for sure how
> I'd want to handle it.

OK, thanks a lot! Doesn't seem to be necessary for now due to the rebase
trick you've suggested.

>
> Hope that helps,

Sure it does!

-- Sergey
