Should --update-refs exclude refs pointing to the current HEAD?

* Should --update-refs exclude refs pointing to the current HEAD?
@ 2023-04-17  8:21 Stefan Haller
  2023-04-17  8:30 ` Stefan Haller
                   ` (4 more replies)
  0 siblings, 5 replies; 18+ messages in thread
From: Stefan Haller @ 2023-04-17  8:21 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Elijah Newren, Phillip Wood

The --update-refs option of git rebase is so useful that I have it on by
default in my config. For stacked branches I find it hard to think of
scenarios where I wouldn't want it.

However, there are cases for non-stacked branches (i.e. other branches
pointing at the current HEAD) where updating them is undesirable. In
fact, pretty much always, for me. Two examples, both very similar:

1. I have a topic branch which is based off of master; I want to make a
copy of that branch and rebase it onto devel, just to try if that would
work. I don't want the original branch to be moved along in this case.

2. I have a topic branch, and I want to make a copy of it to make some
heavy history rewriting experiments. Again, my interactive rebases would
always rebase both branches in the same way, not what I want. In this
case I could work around it by doing the experiments on the original
branch, creating a tag beforehand that I could reset back to if the
experiments fail. But maybe I do want to keep both branches around for a
while for some reason.

Both of these cases could be fixed by --update-refs not touching any
refs that point to the current HEAD. I'm having a hard time coming up
with cases where you would ever want those to be updated, in fact.

Any opinions?

-Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2023-04-17  8:21 Should --update-refs exclude refs pointing to the current HEAD? Stefan Haller
@ 2023-04-17  8:30 ` Stefan Haller
  2023-04-17  8:34 ` Kristoffer Haugsbakk
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 18+ messages in thread
From: Stefan Haller @ 2023-04-17  8:30 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Elijah Newren, Phillip Wood

On 17.04.23 10:21, Stefan Haller wrote:
> The --update-refs option of git rebase is so useful that I have it on by
> default in my config. For stacked branches I find it hard to think of
> scenarios where I wouldn't want it.
> 
> However, there are cases for non-stacked branches (i.e. other branches
> pointing at the current HEAD) where updating them is undesirable. In
> fact, pretty much always, for me. Two examples, both very similar:
> 
> 1. I have a topic branch which is based off of master; I want to make a
> copy of that branch and rebase it onto devel, just to try if that would
> work. I don't want the original branch to be moved along in this case.
> 
> 2. I have a topic branch, and I want to make a copy of it to make some
> heavy history rewriting experiments. Again, my interactive rebases would
> always rebase both branches in the same way, not what I want. In this
> case I could work around it by doing the experiments on the original
> branch, creating a tag beforehand that I could reset back to if the
> experiments fail. But maybe I do want to keep both branches around for a
> while for some reason.
> 
> Both of these cases could be fixed by --update-refs not touching any
> refs that point to the current HEAD. I'm having a hard time coming up
> with cases where you would ever want those to be updated, in fact.

One question then is whether the behavior makes sense for the case where
you have a stack of branches, and you make a copy of the topmost one and
then do either of the two above scenarios with that copy. With my
proposal it would leave the old top of the stack alone, but it *would*
update all the inner branches of the stack. Whether that's desired or
not is very unclear to me, and I would leave it up to the user to add
--no-update-refs in this case if they don't want it.

-Stefan


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2023-04-17  8:21 Should --update-refs exclude refs pointing to the current HEAD? Stefan Haller
  2023-04-17  8:30 ` Stefan Haller
@ 2023-04-17  8:34 ` Kristoffer Haugsbakk
  2023-04-17  9:22   ` Stefan Haller
  2023-04-17 12:14 ` Phillip Wood
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 18+ messages in thread
From: Kristoffer Haugsbakk @ 2023-04-17  8:34 UTC (permalink / raw)
  To: Stefan Haller; +Cc: Derrick Stolee, Elijah Newren, Phillip Wood, git

Hi

On Mon, Apr 17, 2023, at 10:21, Stefan Haller wrote:
> 2. I have a topic branch, and I want to make a copy of it to make some
> heavy history rewriting experiments. Again, my interactive rebases would
> always rebase both branches in the same way, not what I want. In this
> case I could work around it by doing the experiments on the original
> branch, creating a tag beforehand that I could reset back to if the
> experiments fail. But maybe I do want to keep both branches around for a
> while for some reason.

I would use a lightweight tag, too, since this option doesn’t touch tags.[1]

Why do you want to keep both branches around? I would keep the tag
around and then branch off of that if I want to make another divergent
history in the future.

This is interesting to me since copying branches indeed does not seem to
*gel* with this git-rebase(1) option. But I never really understood the
use-case for copying branches rather than using lightweight tags.

† 1: I wonder why it wasn’t called `--update-branches`. On the one hand,
    the option ignores refs other than branches. On the other hand, the
    command in the todo list *will* update tags if you tell it to, and
    even refs like `refs/notes/*`. But `--update-branches` seems like a
    better name, at least outside the todo editor.

-- 
Kristoffer Haugsbakk


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2023-04-17  8:34 ` Kristoffer Haugsbakk
@ 2023-04-17  9:22   ` Stefan Haller
  2023-04-18  2:00     ` Felipe Contreras
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Haller @ 2023-04-17  9:22 UTC (permalink / raw)
  To: Kristoffer Haugsbakk; +Cc: Derrick Stolee, Elijah Newren, Phillip Wood, git

On 17.04.23 10:34, Kristoffer Haugsbakk wrote:
> Hi
> 
> On Mon, Apr 17, 2023, at 10:21, Stefan Haller wrote:
>> 2. I have a topic branch, and I want to make a copy of it to make some
>> heavy history rewriting experiments. Again, my interactive rebases would
>> always rebase both branches in the same way, not what I want. In this
>> case I could work around it by doing the experiments on the original
>> branch, creating a tag beforehand that I could reset back to if the
>> experiments fail. But maybe I do want to keep both branches around for a
>> while for some reason.
> 
> I would use a lightweight tag, too, since this option doesn’t touch tags.[1]
> 
> Why do you want to keep both branches around? 

Several reasons:

Maybe the original branch was pushed already, and I'm collaborating on
it with a coworker. At the same time, I want to run my rebase experiment
in parallel on a copy.

Maybe I want to create github PRs for both of them, in order to run CI
on them, or get feedback for both of them from my coworkers.

Also, it just seems to be the most natural workflow for many people. I
have seen my coworkers do this a lot without thinking much whether there
would be a better way.

My question is not so much whether copying branches is a good idea, it's
more about how --update-refs should deal with copied branches *if* you
decide to use them.

-Stefan


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2023-04-17  8:21 Should --update-refs exclude refs pointing to the current HEAD? Stefan Haller
  2023-04-17  8:30 ` Stefan Haller
  2023-04-17  8:34 ` Kristoffer Haugsbakk
@ 2023-04-17 12:14 ` Phillip Wood
  2023-04-20 15:27   ` Stefan Haller
  2024-03-05  7:40 ` Stefan Haller
  2024-03-24 10:42 ` Stefan Haller
  4 siblings, 1 reply; 18+ messages in thread
From: Phillip Wood @ 2023-04-17 12:14 UTC (permalink / raw)
  To: Stefan Haller, git; +Cc: Derrick Stolee, Elijah Newren, Kristoffer Haugsbakk

Hi Stefan

On 17/04/2023 09:21, Stefan Haller wrote:
> The --update-refs option of git rebase is so useful that I have it on by
> default in my config. For stacked branches I find it hard to think of
> scenarios where I wouldn't want it.
> 
> However, there are cases for non-stacked branches (i.e. other branches
> pointing at the current HEAD) where updating them is undesirable. In
> fact, pretty much always, for me. Two examples, both very similar:
> 
> 1. I have a topic branch which is based off of master; I want to make a
> copy of that branch and rebase it onto devel, just to try if that would
> work. I don't want the original branch to be moved along in this case.
> 
> 2. I have a topic branch, and I want to make a copy of it to make some
> heavy history rewriting experiments. Again, my interactive rebases would
> always rebase both branches in the same way, not what I want. In this
> case I could work around it by doing the experiments on the original
> branch, creating a tag beforehand that I could reset back to if the
> experiments fail. But maybe I do want to keep both branches around for a
> while for some reason.
> 
> Both of these cases could be fixed by --update-refs not touching any
> refs that point to the current HEAD.

I'd use a detached HEAD for the "experimental" rebase and then update 
the branch if the rebase was successful. If you really want to use 
another branch you could try running "git commit --amend --only" before 
rebasing to update the commit date so the two branches don't point to 
the same commit.

We could add a command line option to restrict the branches that are 
updated by --update-refs but I'm not that enthusiastic about it.

> I'm having a hard time coming up
> with cases where you would ever want those to be updated, in fact.

If a user using stacked branches creates a new branch and then realizes 
they need to fix something on the parent before creating any commits on 
the new branch they would want both to be updated. e.g.	
	$ git symbolic-ref HEAD
	refs/heads/topic
	$ git checkout -b another-topic
	# fix a bug in topic - want topic and another-topic to be
	# updated
	$ git rebase -i --update-refs HEAD~2

Best Wishes

Phillip

> Any opinions?
> 
> -Stefan
> 



* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2023-04-17  9:22   ` Stefan Haller
@ 2023-04-18  2:00     ` Felipe Contreras
  0 siblings, 0 replies; 18+ messages in thread
From: Felipe Contreras @ 2023-04-18  2:00 UTC (permalink / raw)
  To: Stefan Haller, Kristoffer Haugsbakk
  Cc: Derrick Stolee, Elijah Newren, Phillip Wood, git

Stefan Haller wrote:
> On 17.04.23 10:34, Kristoffer Haugsbakk wrote:
> > On Mon, Apr 17, 2023, at 10:21, Stefan Haller wrote:
> >> 2. I have a topic branch, and I want to make a copy of it to make some
> >> heavy history rewriting experiments. Again, my interactive rebases would
> >> always rebase both branches in the same way, not what I want. In this
> >> case I could work around it by doing the experiments on the original
> >> branch, creating a tag beforehand that I could reset back to if the
> >> experiments fail. But maybe I do want to keep both branches around for a
> >> while for some reason.
> > 
> > I would use a lightweight tag, too, since this option doesn’t touch tags.[1]
> > 
> > Why do you want to keep both branches around? 
> 
> Several reasons:
> 
> Maybe the original branch was pushed already, and I'm collaborating on
> it with a coworker. At the same time, I want to run my rebase experiment
> in parallel on a copy.
> 
> Maybe I want to create github PRs for both of them, in order to run CI
> on them, or get feedback for both of them from my coworkers.
> 
> Also, it just seems to be the most natural workflow for many people. I
> have seen my coworkers do this a lot without thinking much whether there
> would be a better way.

I also do this, however, I often create a new branch to point to the previous
one (`git branch foo-1`). I know I can refer to it with `foo@{1}`, but then I
have to keep track if I rebase more than once, or do any other reflog
operation.

If I've sent the series for review with my tool `git send-series`, then I don't
have to worry about that because I have refs for every version I sent.

A notion of branch versions really comes in handy.

Cheers.

-- 
Felipe Contreras


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2023-04-17 12:14 ` Phillip Wood
@ 2023-04-20 15:27   ` Stefan Haller
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Haller @ 2023-04-20 15:27 UTC (permalink / raw)
  To: phillip.wood, git; +Cc: Derrick Stolee, Elijah Newren, Kristoffer Haugsbakk

On 17.04.23 14:14, Phillip Wood wrote:
> On 17/04/2023 09:21, Stefan Haller wrote:
>> Both of these cases could be fixed by --update-refs not touching any
>> refs that point to the current HEAD.
> 
>> I'm having a hard time coming up
>> with cases where you would ever want those to be updated, in fact.
> 
> If a user using stacked branches creates a new branch and then realizes
> they need to fix something on the parent before creating any commits on
> the new branch they would want both to be updated. e.g.   
>     $ git symbolic-ref HEAD
>     refs/heads/topic
>     $ git checkout -b another-topic
>     # fix a bug in topic - want topic and another-topic to be
>     # updated
>     $ git rebase -i --update-refs HEAD~2

OK, this is indeed one situation where my proposed change would do the
wrong thing.

It is of course impossible for git to tell whether you were meaning to
create a stack of branches here, or whether this is one of the cases
where I'm creating a copy of a branch and want to "detach" it from its
source branch, as in the examples I posted earlier in this thread.

In my personal experience the latter is much more common than the
former, and it's also easier to correct the mistake manually in your
example by hard-resetting one branch to the other again, so I still
think it would be a useful change.

Any other opinions about this?

-Stefan


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2023-04-17  8:21 Should --update-refs exclude refs pointing to the current HEAD? Stefan Haller
                   ` (2 preceding siblings ...)
  2023-04-17 12:14 ` Phillip Wood
@ 2024-03-05  7:40 ` Stefan Haller
  2024-03-05 16:22   ` Junio C Hamano
  2024-03-07  7:59   ` Kristoffer Haugsbakk
  2024-03-24 10:42 ` Stefan Haller
  4 siblings, 2 replies; 18+ messages in thread
From: Stefan Haller @ 2024-03-05  7:40 UTC (permalink / raw)
  To: git; +Cc: Derrick Stolee, Elijah Newren, Phillip Wood, Christian Couder

On 17.04.23 10:21, Stefan Haller wrote:
> The --update-refs option of git rebase is so useful that I have it on by
> default in my config. For stacked branches I find it hard to think of
> scenarios where I wouldn't want it.
> 
> However, there are cases for non-stacked branches (i.e. other branches
> pointing at the current HEAD) where updating them is undesirable. In
> fact, pretty much always, for me. Two examples, both very similar:
> 
> 1. I have a topic branch which is based off of master; I want to make a
> copy of that branch and rebase it onto devel, just to try if that would
> work. I don't want the original branch to be moved along in this case.
> 
> 2. I have a topic branch, and I want to make a copy of it to make some
> heavy history rewriting experiments. Again, my interactive rebases would
> always rebase both branches in the same way, not what I want. In this
> case I could work around it by doing the experiments on the original
> branch, creating a tag beforehand that I could reset back to if the
> experiments fail. But maybe I do want to keep both branches around for a
> while for some reason.
> 
> Both of these cases could be fixed by --update-refs not touching any
> refs that point to the current HEAD. I'm having a hard time coming up
> with cases where you would ever want those to be updated, in fact.

Coming back to this after almost a year, I can say that I'm still
running into this problem relatively frequently, and it is annoying
every single time. Excluding refs pointing at the current head from
being updated, as proposed above, would be a big usability improvement
for me.

And I now see that "git replay --contained --onto" has the same problem,
which I find very unfortunate. In my opinion, "contained" should only
include refs that form a stack, but not copies of the current branch.

Of course, since branch stacks are only a heuristic and not a built-in
concept, it's impossible for git to distinguish between a pair of copied
branches and a degenerate stack whose top-most branch is (still) empty,
as in the example in [1]. In my personal experience though, degenerate
stacks like that are very rare, but copied branches are not, so for me
it would make a lot of sense to change the behavior of both "rebase
--update-refs" and "replay --contained".

-Stefan

[1] <https://public-inbox.org/git/98548a5b-7d30-543b-b943-fd48d8926a33@gmail.com/>


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-05  7:40 ` Stefan Haller
@ 2024-03-05 16:22   ` Junio C Hamano
  2024-03-06  2:57     ` Elijah Newren
  2024-03-07  7:59   ` Kristoffer Haugsbakk
  1 sibling, 1 reply; 18+ messages in thread
From: Junio C Hamano @ 2024-03-05 16:22 UTC (permalink / raw)
  To: Stefan Haller
  Cc: git, Derrick Stolee, Elijah Newren, Phillip Wood, Christian Couder

Stefan Haller <lists@haller-berlin.de> writes:

>> Both of these cases could be fixed by --update-refs not touching any
>> refs that point to the current HEAD. I'm having a hard time coming up
>> with cases where you would ever want those to be updated, in fact.

The point of "update-refs", as I understand it, is that in addition
to the end point of the history (E in "git rebase --onto N O E"),
any branch tips that are between O..E can be migrated to point at
their rewritten counterparts.  So I am not sure how it fundamentally
solves much by protecting only refs that point at a single commit
("the current HEAD" in your statement).

When I want to see what the rebased history would look like without
touching the original, I often rebase a detached HEAD (i.e. instead
of the earlier one, use "git rebase --onto N O E^0", or when
rebasing the current branch, "git rebase [--onto N] O HEAD^0") and
that would protect the current branch well, but --update-refs of
course would not work well.  There is no handy place like detached
HEAD that can be used to save rewritten version of these extra
branch tips.

If branch tips A, B, and C are involved in the range of commits
being rewritten, one way to help us in such a situation may be to
teach "git rebase" to (1) somehow create a new set of proposed-A,
proposed-B, and proposed-C refs (they do not have to be branches),
while keeping the original A, B, and C intact, (2) allow us to
inspect the resulting refs, compare the corresponding ones from
these two sets, and (3) allow us to promote (possibly a subset of)
proposed- ones to their counterpart real branches after we inspect
them.  The latter two do not have to be subcommands of "git rebase"
but can be separate and new commands.


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-05 16:22   ` Junio C Hamano
@ 2024-03-06  2:57     ` Elijah Newren
  2024-03-06 21:00       ` Stefan Haller
  0 siblings, 1 reply; 18+ messages in thread
From: Elijah Newren @ 2024-03-06  2:57 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Stefan Haller, git, Derrick Stolee, Phillip Wood, Christian Couder

[Restoring part of Stefan's earlier message so I can respond to both
that piece, as well as add to the ideas Junio presents.]

Hi,

On Tue, Mar 5, 2024 at 8:22 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Stefan Haller <lists@haller-berlin.de> writes:
>
> >> And I now see that "git replay --contained --onto" has the same problem,
> >> which I find very unfortunate. In my opinion, "contained" should only
> >> include refs that form a stack, but not copies of the current branch.

I wouldn't want to change the default.  Even if we were to add an
option, I'm not entirely sure what it should even implement.  In
addition to Phillip's previous response in the thread, and part of
Junio's response below (which I'll add to):

1) What if there is a branch that is "just a copy" of one of the
branches earlier in the "stack"?  Since it's "just a copy", shouldn't
it be excluded for similar reasons to what you are arguing?  And, if
so, which branch is the copy?

2) Further, a "stack", to me at least, suggests a linear history
without branching (i.e. each commit has at most one parent _and_ at
most one child among the commits in the stack).  I designed `git
replay` to handle diverging histories (i.e. rebasing multiple branches
that _might_ share a subset of common history but none necessarily
need to fully contain the others, though perhaps the branches do share
some other contained branches), and I want it to handle replaying
merges as well.  While `git rebase --update-refs` is absolutely
limited to "stacks", and thus your argument might make sense in the
context of `git rebase`, since you are bringing `git replay` into the
mix, it needs to apply beyond a stack of commits.  It's not clear to
me how to genericize your suggestions to handle cases other than a
simple stack of commits, though.

3) This is mostly covered in (1) and (2), but to be explicit: `git
replay` is completely against the HEAD-is-special assumptions that are
pervasive within `git rebase`, and your problem is entirely phrased as
HEAD-is-special due to your call out of "the current branch".  Is your
argument limited to such special cases?  (If so, it might still be
valid for `git rebase`, of course.)

4) Aren't there easier ways to handle this -- for both rebase and
replay?  I'll suggest some alternatives below...

> >> Both of these cases could be fixed by --update-refs not touching any
> >> refs that point to the current HEAD. I'm having a hard time coming up
> >> with cases where you would ever want those to be updated, in fact.
>
> The point of "update-refs", as I understand it, is that in addition
> to the end point of the history (E in "git rebase --onto N O E"),
> any branch tips that are between O..E can be migrated to point at
> their rewritten counterparts.  So I am not sure how it fundamentally
> solves much by protecting only refs that point at a single commit
> ("the current HEAD" in your statement).
>
> When I want to see what the rebased history would look like without
> touching the original, I often rebase a detached HEAD (i.e. instead
> of the earlier one, use "git rebase --onto N O E^0", or when
> rebasing the current branch, "git rebase [--onto N] O HEAD^0") and
> that would protect the current branch well, but --update-refs of
> course would not work well.  There is no handy place like detached
> HEAD that can be used to save rewritten version of these extra
> branch tips.
>
> If branch tips A, B, and C are involved in the range of commits
> being rewritten, one way to help us in such a situation may be to
> teach "git rebase" to (1) somehow create a new set of proposed-A,
> proposed-B, and proposed-C refs (they do not have to be branches),
> while keeping the original A, B, and C intact, (2) allow us to
> inspect the resulting refs, compare the corresponding ones from
> these two sets, and (3) allow us to promote (possibly a subset of)
> proposed- ones to their counterpart real branches after we inspect
> them.  The latter two do not have to be subcommands of "git rebase"
> but can be separate and new commands.

Here, Junio is suggesting one alternative, and it's already
implemented in `git replay`.  Let me expand on it and add two other
alternatives as well:

4a) `git replay` does what Junio suggests naturally, since it doesn't
update the refs but instead gives commands which can be fed to `git
update-ref --stdin`.  Thus, users can inspect the output of `git
replay` and only perform the updates they want (by feeding a subset of
the lines to update-ref --stdin).

4b) For `git replay`, --contained is just syntactic sugar -- it isn't
necessary.  git replay will allow you to list multiple branches that
you want replayed, so you can specify which branches are relevant to
you.  (This doesn't help with `git rebase`, because `--update-refs` is
the only way to get additional branches replayed.)

4c) For `git rebase --update-refs`, you can add `--interactive` and
then delete the `update-ref` line(s) corresponding to the refs you
don't want updated.


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-06  2:57     ` Elijah Newren
@ 2024-03-06 21:00       ` Stefan Haller
  2024-03-07  5:36         ` Elijah Newren
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Haller @ 2024-03-06 21:00 UTC (permalink / raw)
  To: Elijah Newren, Junio C Hamano
  Cc: git, Derrick Stolee, Phillip Wood, Christian Couder

On 06.03.24 03:57, Elijah Newren wrote:

> 1) What if there is a branch that is "just a copy" of one of the
> branches earlier in the "stack"?  Since it's "just a copy", shouldn't
> it be excluded for similar reasons to what you are arguing?  And, if
> so, which branch is the copy?

This is a good point, but in my experience it's a lot more rare. Maybe
I'm looking at all this just from my own experience, and there might be
other usecases that are very different from mine, but as far as I am
concerned, copies of branches are not long-lived. There is no point in
having two branches point at the same commit. When I create a copy of a
branch, I do that only to rebase the copy somewhere else _immediately_,
leaving the original branch where it was. Which means that I encounter
copied branches only at the top of the stack, not in the middle. Which
means that I'm fine with keeping the current behavior of "rebase
--update-refs" to update both copies of that middle-of-the-stack branch,
because it never happens in practice for me.

> 2) Further, a "stack", to me at least, suggests a linear history
> without branching (i.e. each commit has at most one parent _and_ at
> most one child among the commits in the stack).  I designed `git
> replay` to handle diverging histories (i.e. rebasing multiple branches
> that _might_ share a subset of common history but none necessarily
> need to fully contain the others, though perhaps the branches do share
> some other contained branches), and I want it to handle replaying
> merges as well.  While `git rebase --update-refs` is absolutely
> limited to "stacks", and thus your argument might make sense in the
> context of `git rebase`, since you are bringing `git replay` into the
> mix, it needs to apply beyond a stack of commits.  It's not clear to
> me how to genericize your suggestions to handle cases other than a
> simple stack of commits, though.

I don't see a contradiction here. I don't tend to do this in practice,
but I can totally imagine a tree of stacked branches that share some
common base branches in the beginning and then diverge into different
branches from there. It's true that "rebase --update-refs", when told to
rebase one of the leaf branches, will destroy this tree because it pulls
the base branches away from under the other leaf branches, but this is
unrelated to my proposal, it has this problem today already. And it's
awesome that git replay has a way to avoid this by rebasing the whole
tree at once, keeping everything intact. Still, I don't see what's bad
about excluding branches that point at the same commits as the leaf
branches it is told to rebase when using "replay --contained". (I suppose
what I'm suggesting is to treat "--contained" to mean "is included in the
half-open interval from base to tip" of the revision range you are
rebasing, rather than the closed interval.)
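
The selection rule suggested here can be written down as a small script. This is only a sketch of the proposal and not something git does today: it lists the branches strictly between base and tip, leaving out both the base side and everything that points at the tip commit itself. The variables base, tip and inner, the branch names and the commit() helper are all hypothetical.

```shell
set -e
cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/master
git config user.name "Example"
git config user.email "example@example.com"

# Hypothetical helper: one commit that adds a file named after its argument.
commit() { echo "$1" >"$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit base
git checkout -q -b part1
commit a
git checkout -q -b part2
commit b
git branch part2-copy             # a copy of the branch being rebased

base=master
tip=part2
tip_oid=$(git rev-parse "$tip")

# Branches reachable from the tip, minus those at the tip commit (the tip
# and its copies) and minus those already contained in the base.
inner=$(
  git for-each-ref --format='%(objectname) %(refname:short)' \
      --merged "$tip" refs/heads |
  while read -r oid name; do
    [ "$oid" = "$tip_oid" ] && continue
    git merge-base --is-ancestor "$oid" "$base" && continue
    echo "$name"
  done
)

echo "branches that would move along with $tip: $inner"
```

In this layout only part1 is selected; part2-copy is skipped because it points at the same commit as part2.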

Maybe I should make this more explicit again: I'm not trying to solve
the problem of making a copy of a stack of branches, and rebasing that
copy somewhere else. I think this can't be solved except by making
branch stacks a new concept in git, which I'm not sure we want to do.

> 3) This is mostly covered in (1) and (2), but to be explicit: `git
> replay` is completely against the HEAD-is-special assumptions that are
> pervasive within `git rebase`, and your problem is entirely phrased as
> HEAD-is-special due to your call out of "the current branch".  Is your
> argument limited to such special cases?  (If so, it might still be
> valid for `git rebase`, of course.)

No, I don't think I need HEAD to be special. "The thing that I'm
rebasing" is special, and it is always HEAD for git rebase, but it can
be something else for replay.

> 4a) `git replay` does what Junio suggests naturally, since it doesn't
> update the refs but instead gives commands which can be fed to `git
> update-ref --stdin`.  Thus, users can inspect the output of `git
> replay` and only perform the updates they want (by feeding a subset of
> the lines to update-ref --stdin).

At this point I probably need to explain that I'm rarely using the
command line. I'm a user and co-maintainer of lazygit, and I want to
make lazygit work in such a way that "it does the right thing" in as
many cases as possible.

> 4b) For `git replay`, --contained is just syntactic sugar -- it isn't
> necessary.  git replay will allow you to list multiple branches that
> you want replayed, so you can specify which branches are relevant to
> you.

That's great, even if it means that I have to redo some of the work that
--contained would already do for me, just because I want a slightly
different behavior.

> 4c) For `git rebase --update-refs`, you can add `--interactive` and
> then delete the `update-ref` line(s) corresponding to the refs you
> don't want updated.

Yes, that's what I always do today to work around the problem. It's just
easy to forget, and I find it annoying that I have to take this extra
step every time.

One last remark: whenever I describe my use case involving copies of
branches, people tell me not to do that, use detached heads instead, or
other ways to achieve what I want. But then I don't understand why my
proposal would make a difference for them. If you don't use copied
branches, then why do you care whether "rebase --update-refs" or "replay
--contained" moves those copies or not? I still haven't heard a good
argument for why the current behavior is desirable, except for the one
example of a degenerate stack that Phillip Wood described in [1].

-Stefan


[1] <https://public-inbox.org/git/98548a5b-7d30-543b-b943-fd48d8926a33@gmail.com/>


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-06 21:00       ` Stefan Haller
@ 2024-03-07  5:36         ` Elijah Newren
  2024-03-07 20:16           ` Stefan Haller
  0 siblings, 1 reply; 18+ messages in thread
From: Elijah Newren @ 2024-03-07  5:36 UTC (permalink / raw)
  To: Stefan Haller
  Cc: Junio C Hamano, git, Derrick Stolee, Phillip Wood, Christian Couder

On Wed, Mar 6, 2024 at 1:00 PM Stefan Haller <lists@haller-berlin.de> wrote:
>
> On 06.03.24 03:57, Elijah Newren wrote:
>
> > 1) What if there is a branch that is "just a copy" of one of the
> > branches earlier in the "stack"?  Since it's "just a copy", shouldn't
> > it be excluded for similar reasons to what you are arguing?  And, if
> > so, which branch is the copy?
>
> This is a good point, but in my experience it's a lot more rare. Maybe
> I'm looking at all this just from my own experience, and there might be
> other usecases that are very different from mine, but as far as I am
> concerned, copies of branches are not long-lived.

> There is no point in having two branches point at the same commit.

But isn't that what you're doing?

> When I create a copy of a
> branch, I do that only to rebase the copy somewhere else _immediately_,
> leaving the original branch where it was.

If it is inherently tied like this, why not create the new branch
immediately after the rebase (with active_branch@{1} as the start
point), instead of creating it immediately before?

> Which means that I encounter
> copied branches only at the top of the stack, not in the middle. Which
> means that I'm fine with keeping the current behavior of "rebase
> --update-refs" to update both copies of that middle-of-the-stack branch,
> because it never happens in practice for me.

You've really lost me here; are you saying you're fine changing the
design to add inherent edge-case bugs to the code because those edge
cases "never happen in practice for me"?  I've spent a lot of time
dealing with built-up cruft in git from partial solutions and fixes
that overlooked subsets of relevant test cases, so I'm not a fan of
that statement and in particular the last two words of it.  Perhaps
I'm reading it wrong, and if so I apologize, but it triggered unhappy
memories of mine from merge-recursive.c and dir.c and elsewhere.

> > 2) Further, a "stack", to me at least, suggests a linear history
> > without branching (i.e. each commit has at most one parent _and_ at
> > most one child among the commits in the stack).  I designed `git
> > replay` to handle diverging histories (i.e. rebasing multiple branches
> > that _might_ share a subset of common history but none necessarily
> > need to fully contain the others, though perhaps the branches do share
> > some other contained branches), and I want it to handle replaying
> > merges as well.  While `git rebase --update-refs` is absolutely
> > limited to "stacks", and thus your argument might make sense in the
> > context of `git rebase`, since you are bringing `git replay` into the
> > mix, it needs to apply beyond a stack of commits.  It's not clear to
> > me how to genericize your suggestions to handle cases other than a
> > simple stack of commits, though.
>
> I don't see a contradiction here. I don't tend to do this in practice,
> but I can totally imagine a tree of stacked branches that share some
> common base branches in the beginning and then diverge into different
> branches from there. It's true that "rebase --update-refs", when told to
> rebase one of the leaf branches, will destroy this tree because it pulls
> the base branches away from under the other leaf branches, but this is
> unrelated to my proposal, it has this problem today already. And it's
> awesome that git replay has a way to avoid this by rebasing the whole
> tree at once, keeping everything intact. Still, I don't see what's bad
> about excluding branches that point at the same commits as the leaf
> branches it is told to rebase when using "replay --contained".

By "leaf branches", do you mean (a) those commits explicitly mentioned
on the command line for being replayed, (b) only the subset of the
branches mentioned on the command line which aren't an ancestor of
another commit being replayed, or (c) something else?

> (I suppose
> what I'm suggesting is to treat "--contained" to mean "is included in the
> half-open interval from base to tip" of the revision range you are
> rebasing, rather than the closed interval.)

"half-open interval"?  That to me again implies a simple stack, which
since we're trying to address the more general case, makes me more
confused rather than less.

Let me re-ask my question another way.  If someone runs
    git replay --onto A --contained ^B ^C D E F
when branches G, H, & I are in the revision range of "^B ^C D E F",
with G in particular pointing where D does and H pointing where E
does, and E contains D in its history, and F contains commits that are
in neither D nor E, how do I figure out which of D-I should be
updated?

> Maybe I should make this more explicit again: I'm not trying to solve
> the problem of making a copy of a stack of branches, and rebasing that
> copy somewhere else. I think this can't be solved except by making
> branch stacks a new concept in git, which I'm not sure we want to do.

Oh, I hadn't even thought of that.  Yeah, that'd be even more complex.

> > 3) This is mostly covered in (1) and (2), but to be explicit: `git
> > replay` is completely against the HEAD-is-special assumptions that are
> > pervasive within `git rebase`, and your problem is entirely phrased as
> > HEAD-is-special due to your call out of "the current branch".  Is your
> > argument limited to such special cases?  (If so, it might still be
> > valid for `git rebase`, of course.)
>
> No, I don't think I need HEAD to be special. "The thing that I'm
> rebasing" is special, and it is always HEAD for git rebase, but it can
> be something else for replay.

But what exactly should that something else be?  I still don't
understand what that is from your explanation so far.

> > 4a) `git replay` does what Junio suggests naturally, since it doesn't
> > update the refs but instead gives commands which can be fed to `git
> > update-ref --stdin`.  Thus, users can inspect the output of `git
> > replay` and only perform the updates they want (by feeding a subset of
> > the lines to update-ref --stdin).
>
> At this point I probably need to explain that I'm rarely using the
> command line. I'm a user and co-maintainer of lazygit, and I want to
> make lazygit work in such a way that "it does the right thing" in as
> many cases as possible.

...and I'm pointing out that `git replay` has the necessary tools to
enable you to do so.  Unlike `git rebase --update-refs` it doesn't
automatically update the branches, but just creates the new commits
and tells you what it could update each branch to, in a format that
you can pass along to another tool to actually do the updates of the
branches.  As such, you can write your tool to take that output, pick
out the bits you like, and only pass those bits along so that only
some of the branches are updated.
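
A minimal sketch of such a filter, assuming the output consists of lines
of the form "update <ref> <new-oid> <old-oid>" (the format accepted by
`git update-ref --stdin`). The function name filter_ref_updates and the
branch names in the usage comment are made up for illustration.

```shell
# filter_ref_updates REF...
# Hypothetical helper: read ref-update commands on stdin and drop the
# "update" lines for the fully qualified refs given as arguments.
filter_ref_updates() {
    skip=" $* "
    awk -v skip="$skip" '!($1 == "update" && index(skip, " " $2 " "))'
}

# Possible usage (not run here):
#   git replay --contained --onto devel main..topic-1 |
#       filter_ref_updates refs/heads/topic-1-copy |
#       git update-ref --stdin
```

Everything that is not filtered out is passed through unchanged, so the
tool stays in control of exactly which branches move.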

> > 4b) For `git replay`, --contained is just syntactic sugar -- it isn't
> > necessary.  git replay will allow you to list multiple branches that
> > you want replayed, so you can specify which branches are relevant to
> > you.
>
> That's great, even if it means that I have to redo some of the work that
> --contained would already do for me, just because I want a slightly
> different behavior.

Right, but I thought you were maintaining lazygit, meaning that
programming it to select the branches you want is a one time cost?

Something like `git log --format=%D --decorate-refs=refs/heads/
${base}..HEAD^1 | grep -v ^$`, plus adding in the current branch,
right?

Or is the concern with this suggestion the performance hit you'd take
(which admittedly might be a problem with this solution, since you
walk the commits an extra time)?
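
Spelled out a bit more, that could look roughly like the following.
This is a sketch under a few assumptions: the helper names are made up,
branch names are assumed not to contain commas, and %D is assumed to
print the branches at one commit as a comma-separated list.

```shell
# split_decorations
# Turn `git log --format=%D` lines such as "topic-2, topic-2-copy"
# into one branch name per line, dropping empty lines.
split_decorations() {
    tr ',' '\n' | sed -e 's/^ *//' -e 's/ *$//' -e '/^$/d'
}

# contained_branches BASE
# Hypothetical wrapper: list local branches that point at commits
# strictly between BASE and the current tip (the tip itself excluded).
contained_branches() {
    git log --format=%D --decorate-refs=refs/heads/ "$1..HEAD^1" |
        split_decorations
}
```

The current branch would still have to be added to the resulting list
separately, as noted above.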

> > 4c) For `git rebase --update-refs`, you can add `--interactive` and
> > then delete the `update-ref` line(s) corresponding to the refs you
> > don't want updated.
>
> Yes, that's what I always do today to work around the problem. It's just
> easy to forget, and I find it annoying that I have to take this extra
> step every time.

And if you forget, then after the rebase it's trivial to move the
updated branch back to where you want it, right?

   git branch -f ${copy_branch_name} ${current_branch_name}@{1}

In fact, that's probably easier than making the rebase interactive,
and should be easier to remember since you only ever create these
branches precisely when you want to do a rebase.

> One last remark: whenever I describe my use case involving copies of
> branches, people tell me not to do that, use detached heads instead, or
> other ways to achieve what I want. But then I don't understand why my
> proposal would make a difference for them. If you don't use copied
> branches, then why do you care whether "rebase --update-refs" or "replay
> --contained" moves those copies or not? I still haven't heard a good
> argument for why the current behavior is desirable, except for the one
> example of a degenerate stack that Phillip Wood described in [1].

The current behavior is easy to describe and explain to users, and
generalizes nicely to cases of replaying multiple diverging and
converging branches.

To me, the behavior you're proposing doesn't seem to share either of
those qualities, at least not as you've explained it so far.

But, perhaps that's because I still don't really understand your
use case.  I'm trying to, and it's possible I could be convinced there
is a proposal here that is easy to explain to users and generalizes
nicely.  My way of attempting to get that out is to make
counter-proposals and ask questions as a way of teasing out what your
use case is and what a refined proposal might be.  Currently, it seems
there are two trivial alternative solutions that would solve this
problem more cleanly (namely, either creating the branch after the
fact instead of beforehand, or simply updating the branch after the
fact)...but maybe I'm still just missing something?


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-05  7:40 ` Stefan Haller
  2024-03-05 16:22   ` Junio C Hamano
@ 2024-03-07  7:59   ` Kristoffer Haugsbakk
  2024-03-07  8:22     ` Elijah Newren
  1 sibling, 1 reply; 18+ messages in thread
From: Kristoffer Haugsbakk @ 2024-03-07  7:59 UTC (permalink / raw)
  To: Stefan Haller
  Cc: Derrick Stolee, Elijah Newren, Phillip Wood, Christian Couder, git

On Tue, Mar 5, 2024, at 08:40, Stefan Haller wrote:
> Coming back to this after almost a year, I can say that I'm still
> running into this problem relatively frequently, and it is annoying
> every single time. Excluding refs pointing at the current head from
> being updated, as proposed above, would be a big usability improvement
> for me.

Sounds like a ref-stash command is in order…

    # I want a new branch
    git checkout -b new
    # But I don’t want it to be affected by the next rebase
    git ref-stash push
    git rebase [...]
    # Now I’m done: put the ref back where it was
    git ref-stash pop

-- 
Kristoffer Haugsbakk


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-07  7:59   ` Kristoffer Haugsbakk
@ 2024-03-07  8:22     ` Elijah Newren
  0 siblings, 0 replies; 18+ messages in thread
From: Elijah Newren @ 2024-03-07  8:22 UTC (permalink / raw)
  To: Kristoffer Haugsbakk
  Cc: Stefan Haller, Derrick Stolee, Phillip Wood, Christian Couder, git

Hi,

On Wed, Mar 6, 2024 at 11:59 PM Kristoffer Haugsbakk
<code@khaugsbakk.name> wrote:
>
> On Tue, Mar 5, 2024, at 08:40, Stefan Haller wrote:
> > Coming back to this after almost a year, I can say that I'm still
> > running into this problem relatively frequently, and it is annoying
> > every single time. Excluding refs pointing at the current head from
> > being updated, as proposed above, would be a big usability improvement
> > for me.
>
> Sounds like a ref-stash command is in order…

A what?

>     # I want a new branch
>     git checkout -b new
>     # But I don’t want it to be affected by the next rebase

This doesn't make any sense; rebase always operates on the current
branch, and Stefan wasn't asking for anything otherwise.  He was just
concerned that with --update-refs, one of the other branches it also
operated on was one he didn't want it to operate on.

Perhaps you meant
    git branch new
for your first command?

>     git ref-stash push
>     git rebase [...]
>     # Now I’m done: put the ref back where it was
>     git ref-stash pop

Leaving aside questions about how ref-stash is supposed to interact
with each and every other command out there, and how it's supposed to
know which branches it's operating on when you do pushes and pops...

Why do we need to invent a new command, when we already have the
reflog?  You could drop both ref-stash commands, and instead just have
a
   git branch -f new new@{1}
at the end (assuming of course "git branch new" was used instead of
your "git checkout -b new", as I suggested earlier) to put "new" back
to where it was before the rebase.  That's fewer commands.

Or, even simpler, drop the initial branch creation and both ref-stash
commands by just not creating the branch until after the rebase.
That'd make the entire set of commands just be:
   git rebase --update-refs [...]
   git branch new current_branch@{1}

Plus, either solution works today and needs no new changes.


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-07  5:36         ` Elijah Newren
@ 2024-03-07 20:16           ` Stefan Haller
  2024-03-09  3:28             ` Elijah Newren
  0 siblings, 1 reply; 18+ messages in thread
From: Stefan Haller @ 2024-03-07 20:16 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, Derrick Stolee, Phillip Wood, Christian Couder

Elijah, thanks for your patience with this. I appreciate the time and
energy you put into understanding what I want to achieve. The questions
you are asking help me understand my proposal better myself.

It seems that I didn't do a very good job at getting my point across so
far, so I'll try again in a more structured way.

Let's begin by describing two very different user scenarios:

1) Stacked branches. Git supports these reasonably well for simple cases
through the "rebase --update-refs" command (and the "rebase.updateRefs"
config), but since they are not a first-class concept, git needs to rely
on heuristics to determine which branches are part of a stack. For
simple cases this works very well, but more esoteric cases can have
problems (e.g. a non-linear topology of multiple stacks that may share
common base branches and then diverge, in which case rebasing one of
them destroys the others; or degenerate stacks involving "empty"
branches either in the middle or at the top, in which case there's no
way to tell what the order of the branches is supposed to be).

2) Copying a branch, and rebasing it away from the original one (for
non-stacked branches, see below). The use case is that you have a branch
called topic-1 (branched off main), which is pushed and in review
already, with CI running on it, and you want to test whether it works on
devel, so you make a new branch called topic-1-on-devel off of topic-1,
and rebase it onto devel. You want to make a draft PR of that new branch
to have CI run on it, too, and of course you want to keep the original
branch untouched. For me and most of my co-workers whom I have observed
in pairing sessions, the natural way to achieve this is as described
above: check out a new branch, and rebase it where you want it to go.

Next I'll describe my goals, and my non-goals. I know I can easily
achieve 2) by simply not using --update-refs, but I like to have
"rebase.updateRefs" set to true by default because it is so useful, and
having to remember to use --no-update-refs whenever I do 2) is annoying.

So my goal is to make 2) work well (in the simple, non-stacked case)
even when "rebase.updateRefs" is true, while not making 1) work any
worse in the "normal", non-degenerate case.

I'm _not_ trying to fix the problems that --update-refs has today (I
briefly mentioned some of them above, but there are more), and I'm not
trying to make 2) work well with stacked branches. It would certainly be
nice if that would work too, but I don't think it can without
introducing branch stacks as a first-class feature in git, so I'll have
to live with not supporting that case well. It would still be a big
improvement for me without that.

I'll now go on to respond to some of your questions inline below, but
I'll skip some of them in order to not make this too long. Do let me
know if there are still open questions that I didn't address.

On 07.03.24 06:36, Elijah Newren wrote:
> On Wed, Mar 6, 2024 at 1:00 PM Stefan Haller <lists@haller-berlin.de> wrote:
>>
>> On 06.03.24 03:57, Elijah Newren wrote:
>>
>>> 1) What if there is a branch that is "just a copy" of one of the
>>> branches earlier in the "stack"?  Since it's "just a copy", shouldn't
>>> it be excluded for similar reasons to what you are arguing?  And, if
>>> so, which branch is the copy?
>>
>> This is a good point, but in my experience it's a lot more rare. Maybe
>> I'm looking at all this just from my own experience, and there might be
>> other usecases that are very different from mine, but as far as I am
>> concerned, copies of branches are not long-lived.
> 
>> There is no point in having two branches point at the same commit.
> 
> But isn't that what you're doing?

Only briefly, not permanently. I only described this to illustrate why
I never encounter branch copies in the middle of a stack.

>> When I create a copy of a
>> branch, I do that only to rebase the copy somewhere else _immediately_,
>> leaving the original branch where it was.
> 
> If it is inherently tied like this, why not create the new branch
> immediately after the rebase (with active_branch@{1} as the start
> point), instead of creating it immediately before?

That would be the wrong way round. I want to leave the original branch
untouched, make a new branch and rebase that away from the original.

>> Which means that I encounter
>> copied branches only at the top of the stack, not in the middle. Which
>> means that I'm fine with keeping the current behavior of "rebase
>> --update-refs" to update both copies of that middle-of-the-stack branch,
>> because it never happens in practice for me.
> 
> You've really lost me here; are you saying you're fine changing the
> design to add inherent edgecase bugs to the code because those edge
> cases "never happen in practice for me"?

Wait, now you are really turning things around. You make it sound like
my proposal is responsible for what you call a "bug" here. It's not, git
already behaves like this (and you may or may not consider that a
problem), and my proposal doesn't change anything about it. It doesn't
"fix" it, that's right (and this is what I referred to when I said "I'm
fine with it"), but it doesn't make it any worse either.

>> I don't see a contradiction here. I don't tend to do this in practice,
>> but I can totally imagine a tree of stacked branches that share some
>> common base branches in the beginning and then diverge into different
>> branches from there. It's true that "rebase --update-refs", when told to
>> rebase one of the leaf branches, will destroy this tree because it pulls
>> the base branches away from under the other leaf branches, but this is
>> unrelated to my proposal, it has this problem today already. And it's
>> awesome that git replay has a way to avoid this by rebasing the whole
>> tree at once, keeping everything intact. Still, I don't see what's bad
>> about excluding branches that point at the same commits as the leaf
>> branches it is told to rebase when using "replay --contained".
> 
> By "leaf branches", do you mean (a) those commits explicitly mentioned
> on the command line for being replayed, (b) only the subset of the
> branches mentioned on the command line which aren't an ancestor of
> another commit being replayed, or (c) something else?

If I understand you right (and if I understand the user interface of
git-replay right), then what I mean is the combination of all single
commits that are mentioned on the command line, plus the right side of
all A..B ranges that are mentioned on the command line. In my mental
model those are "the things that are being rebased" (please let me know
if that mental model is wrong), and I am proposing to exclude from
updating all branches that point to any of those but are not mentioned
on the command line, because they can be considered copies.

> Let me re-ask my question another way.  If someone runs
>     git replay --onto A --contained ^B ^C D E F
> when branches G, H, & I are in the revision range of "^B ^C D E F",
> with G in particular pointing where D does and H pointing where E
> does, and E contains D in its history, and F contains commits that are
> in neither D nor E, how do I figure out which of D-I should be
> updated?

D, E, F, and I are updated, G and H are not; this seems very obvious to
me. D, E, and F because they are all mentioned explicitly; G and H are
not updated because they point to one of the "things-to-be-rebased", so
they are copies; I is updated because it is contained in E but does not
point at one of the "things-to-be-rebased", so it's part of a "stack"
(or whatever you want to call this topology).

It's a heuristic; we need a way to distinguish things that are part of a
stack from things that are copies. My heuristic for this relies on the
assumption that the stack is not degenerate in the sense that it doesn't
contain any "empty" branches in the middle or at the top of the stack,
otherwise it wouldn't be possible to distinguish the two.
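
To make the heuristic concrete, here is a small sketch of the selection
rule on its own. It is not how git implements anything; the function
name is made up, and the input is assumed to be one "<branch> <commit>"
pair per line for every branch inside the replayed range.

```shell
# select_refs_to_update NAMED_BRANCH...
# Reads "<branch> <commit>" lines on stdin and prints the branches the
# proposed heuristic would update: every branch named on the command
# line, plus every other branch that does not point at the same commit
# as a named branch (those are treated as copies and skipped).
select_refs_to_update() {
    named=" $* "
    awk -v named="$named" '
        { sha[$1] = $2; order[NR] = $1 }
        END {
            # Remember the commits that the named branches point at.
            for (b in sha)
                if (index(named, " " b " "))
                    tip[sha[b]] = 1
            # Print in input order: named branches, or non-copies.
            for (i = 1; i <= NR; i++) {
                b = order[i]
                if (index(named, " " b " ") || !(sha[b] in tip))
                    print b
            }
        }'
}
```

Fed the example above (G at the same commit as D, H at the same commit
as E) with D, E and F as the named branches, this selects D, E, F and I.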

>> No, I don't think I need HEAD to be special. "The thing that I'm
>> rebasing" is special, and it is always HEAD for git rebase, but it can
>> be something else for replay.
> 
> But what exactly should that something else be?  I still don't
> understand what that is from your explanation so far.

All the refs that are mentioned on the command line, either as a single
commit or as the second half of an A..B expression. It may well be that
I have some misconception of how exactly git replay works, this sounds
like the most likely explanation for why we don't understand each other.

> Something like `git log --format=%D --decorate-refs=refs/heads/
> ${base}..HEAD^1 | grep -v ^$`, plus adding in the current branch,
> right?
> 
> Or is the concern with this suggestion the performance hit you'd take
> (which admittedly might be a problem with this solution, since you
> walk the commits an extra time)?

Yes, I can do that, and no, I'm not concerned about performance. We
already have all that data cached in memory anyway, so that's not a
problem. But this would only work for git replay; there's no way to do
the same thing for --update-refs. My goal is to offer the same features
both for the checked-out branch (using rebase) and other branches (using
replay), and have them behave the same.

So my proposal is more about changing --update-refs, since I can solve it
manually for replay, as you described. However, _if_ we decide to change
"rebase --update-refs", then I think it would make sense to change
"replay --contained" in the same way, so that they behave more consistently.

>> One last remark: whenever I describe my use case involving copies of
>> branches, people tell me not to do that, use detached heads instead, or
>> other ways to achieve what I want. But then I don't understand why my
>> proposal would make a difference for them. If you don't use copied
>> branches, then why do you care whether "rebase --update-refs" or "replay
>> --contained" moves those copies or not? I still haven't heard a good
>> argument for why the current behavior is desirable, except for the one
>> example of a degenerate stack that Phillip Wood described in [1].
> 
> The current behavior is easy to describe and explain to users, and
> generalizes nicely to cases of replaying multiple diverging and
> converging branches.

It sounds like you value the property of being easy to describe higher
than doing the expected thing in as many cases as possible.

-Stefan


* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-07 20:16           ` Stefan Haller
@ 2024-03-09  3:28             ` Elijah Newren
  2024-03-12  9:28               ` Stefan Haller
  0 siblings, 1 reply; 18+ messages in thread
From: Elijah Newren @ 2024-03-09  3:28 UTC (permalink / raw)
  To: Stefan Haller
  Cc: Junio C Hamano, git, Derrick Stolee, Phillip Wood, Christian Couder

On Thu, Mar 7, 2024 at 12:16 PM Stefan Haller <lists@haller-berlin.de> wrote:
>
> Elijah, thanks for your patience with this. I appreciate the time and
> energy you put into understanding what I want to achieve. The questions
> you are asking help me understand my proposal better myself.
>
> It seems that I didn't do a very good job at getting my point across so
> far, so I'll try again in a more structured way.

I think I fell short on communication on a few points as well; sorry about that.

> Let's begin by describing two very different user scenarios:
>
> 1) Stacked branches. Git supports these reasonably well for simple cases
> through the "rebase --update-refs" command (and the "rebase.updateRefs"
> config), but since they are not a first-class concept, git needs to rely
> on heuristics to determine which branches are part of a stack. For
> simple cases this works very well, but more esoteric cases can have
> problems (e.g. a non-linear topology of multiple stacks that may share
> common base branches and then diverge, in which case rebasing one of
> them destroys the others; or degenerate stacks involving "empty"
> branches either in the middle or at the top, in which case there's no
> way to tell what the order of the branches is supposed to be).
>
> 2) Copying a branch, and rebasing it away from the original one (for
> non-stacked branches, see below). The use case is that you have a branch
> called topic-1 (branched off main), which is pushed and in review
> already, with CI running on it, and you want to test whether it works on
> devel, so you make a new branch called topic-1-on-devel off of topic-1,
> and rebase it onto devel. You want to make a draft PR of that new branch
> to have CI run on it, too, and of course you want to keep the original
> branch untouched. For me and most of my co-workers whom I have observed
> in pairing sessions, the natural way to achieve this is as described
> above: check out a new branch, and rebase it where you want it to go.
>
> Next I'll describe my goals, and my non-goals. I know I can easily
> achieve 2) by simply not using --update-refs, but I like to have
> "rebase.updateRefs" set to true by default because it is so useful, and
> having to remember to use --no-update-refs whenever I do 2) is annoying.
>
> So my goal is to make 2) work well (in the simple, non-stacked case)
> even when "rebase.updateRefs" is true, while not making 1) work any
> worse in the "normal", non-degenerate case.

Thanks, this is helpful background.

> I'm _not_ trying to fix the problems that --update-refs has today (I
> briefly mentioned some of them above, but there are more), and I'm not
> trying to make 2) work well with stacked branches. It would certainly be
> nice if that would work too, but I don't think it can without
> introducing branch stacks as a first-class feature in git, so I'll have
> to live with not supporting that case well. It would still be a big
> improvement for me without that.

> >> When I create a copy of a
> >> branch, I do that only to rebase the copy somewhere else _immediately_,
> >> leaving the original branch where it was.
> >
> > If it is inherently tied like this, why not create the new branch
> > immediately after the rebase (with active_branch@{1} as the start
> > point), instead of creating it immediately before?
>
> That would be the wrong way round. I want to leave the original branch
> untouched, make a new branch and rebase that away from the original.

Ah, sorry for misunderstanding.  Still, though, what's wrong with running
    git branch -f original_branch original_branch@{1}
after the operation?  That'll make the original branch point to where
it was before the rebase operation.  Since there's no separation in
time between when you create the new copy branch and do this rebase
operation, it's not a matter of forgetting that there was this
original branch that you wanted to reflect its own pre-rebase state,
right?

Also, since you're not using the git cli directly but going through
lazygit, isn't this something you can just include in lazygit as part
of whatever overall operation is creating the new copy branch and
rebasing it?

> >> Which means that I encounter
> >> copied branches only at the top of the stack, not in the middle. Which
> >> means that I'm fine with keeping the current behavior of "rebase
> >> --update-refs" to update both copies of that middle-of-the-stack branch,
> >> because it never happens in practice for me.
> >
> > You've really lost me here; are you saying you're fine changing the
> > design to add inherent edgecase bugs to the code because those edge
> > cases "never happen in practice for me"?
>
> Wait, now you are really turning things around. You make it sound like
> my proposal is responsible for what you call a "bug" here. It's not, git
> already behaves like this (and you may or may not consider that a
> problem), and my proposal doesn't change anything about it. It doesn't
> "fix" it, that's right (and this is what I referred to when I said "I'm
> fine with it"), but it doesn't make it any worse either.

Ah, I see where I was unclear as well, and my lack of clarity stemmed
from not understanding your proposal.  To try to close the loop, allow
me to re-translate your "This is a good point, but ... it never happens
in practice for me." paragraph, the way I _erroneously_ read it at the
time:

"""
For my new proposal, the case you bring up is a good point.  But it
doesn't happen for me, so I propose to leave it as undefined behavior.
[As undefined behavior, anyone that triggers it is likely to get
behavior they deem buggy and not like it, but that won't affect me.]
"""

Now, obviously, that doesn't sound quite right.  I knew it at the
time, but reading and re-reading your paragraph, it kept coming out
that way for me.  Thus I tried to ask if that's what you really meant,
apologizing in advance in case I was misreading.

Anyway, with the extra explanation in your latest email, I now see
that you weren't leaving it undefined, but your proposal wasn't clear
to me either in that paragraph or in combination with the rest of your
previous email.  Sorry for my misunderstanding.

> > By "leaf branches", do you mean (a) those commits explicitly mentioned
> > on the command line for being replayed, (b) only the subset of the
> > branches mentioned on the command line which aren't an ancestor of
> > another commit being replayed, or (c) something else?
>
> If I understand you right (and if I understand the user interface of
> git-replay right), then what I mean is the combination of all single
> commits that are mentioned on the command line, plus the right side of
> all A..B ranges that are mentioned on the command line. In my mental
> model those are "the things that are being rebased" (please let me know
> if that mental model is wrong), and I am proposing to exclude from
> updating all branches that point to any of those but are not mentioned
> on the command line, because they can be considered copies.
>
> > Let me re-ask my question another way.  If someone runs
> >     git replay --onto A --contained ^B ^C D E F
> > when branches G, H, & I are in the revision range of "^B ^C D E F",
> > with G in particular pointing where D does and H pointing where E
> > does, and E contains D in its history, and F contains commits that are
> > in neither D nor E, how do I figure out which of D-I should be
> > updated?
>
> D, E, F, and I are updated, G and H are not; this seems very obvious to
> me. D, E, and F because they are all mentioned explicitly; G and H are
> not updated because they point to one of the "things-to-be-rebased", so
> they are copies; I is updated because it is contained in E but does not
> point at one of the "things-to-be-rebased", so it's part of a "stack"
> (or whatever you want to call this topology).
>
> It's a heuristic; we need a way to distinguish things that are part of a
> stack from things that are copies. My heuristic for this relies on the
> assumption that the stack is not degenerate in the sense that it doesn't
> contain any "empty" branches in the middle or at the top of the stack,
> otherwise it wouldn't be possible to distinguish the two.

Ah, okay, now I understand the concrete proposal; thanks.
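
To make the quoted heuristic concrete, here is a minimal sketch of it as a toy model. This is not how git-replay is implemented; the function name, its parameters, and the commit ids c1..c5 are all made up for illustration, and the graph is reduced to "which commit does each branch point at".

```python
def branches_to_update(branch_heads, replayed_commits, named_on_cmdline):
    """Toy model of the proposed heuristic; not git's implementation.

    branch_heads: dict of branch name -> commit id
    replayed_commits: set of commit ids in the replayed range
    named_on_cmdline: set of branch names given as positive revisions
    """
    # Commits that the named branches point at: the "things being rebased".
    tips = {branch_heads[name] for name in named_on_cmdline}
    updated = set()
    for name, commit in branch_heads.items():
        if commit not in replayed_commits:
            continue  # outside the range, never touched
        if name in named_on_cmdline:
            updated.add(name)  # explicitly requested
        elif commit not in tips:
            updated.add(name)  # sits below a tip, so part of a stack
        # else: points at a tip but was not named, so treat it as a copy
    return updated


# The example from the thread: D, E, F are named; G sits on D, H sits on E,
# and I points at an intermediate commit (hypothetical ids c1..c5).
heads = {"D": "c1", "G": "c1", "I": "c2", "E": "c3", "H": "c3", "F": "c5"}
replayed = {"c1", "c2", "c3", "c4", "c5"}
result = branches_to_update(heads, replayed, {"D", "E", "F"})
print(sorted(result))  # ['D', 'E', 'F', 'I']
```

Naming G on the command line as well would move it into the updated set, which is the escape hatch the heuristic leaves for an intentionally empty branch at a tip.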

> However, _if_ we decide to change
> "rebase --update-refs", then I think it would make sense to change
> "replay --contained" in the same way, so that they behave more consistently.

Yep, makes sense.

> >> One last remark: whenever I describe my use case involving copies of
> >> branches, people tell me not to do that, use detached heads instead, or
> >> other ways to achieve what I want. But then I don't understand why my
> >> proposal would make a difference for them. If you don't use copied
> >> branches, then why do you care whether "rebase --update-refs" or "replay
> >> --contained" moves those copies or not? I still haven't heard a good
> >> argument for why the current behavior is desirable, except for the one
> >> example of a degenerate stack that Phillip Wood described in [1].
> >
> > The current behavior is easy to describe and explain to users, and
> > generalizes nicely to cases of replaying multiple diverging and
> > converging branches.
>
> It sounds like you value the property of being easy to describe higher
> than doing the expected thing in as many cases as possible.

There's certainly some truth to that.  For example, "three-way merges"
are unsophisticated, and folks have suggested various ways of "making
them smarter" over the years (though there was more of this years
ago).  To quote someone else on that:
"""
Me _personally_, I want to have something that is very repeatable and
non-clever. Something I understand _or_ tells me that it can't do it.
"""
and
"""
I just don't like your notion that you should support the 5% problem with
ugly hacks, and then you dismiss the 95% problem with "nothing else does
it either".

In other words, we're already merging manually for the 95%. Why do you
think the 5% is so important?
"""

Three-way merging and rebase --update-refs are obviously quite
different, so this might not be a good analogy.  I'm just saying you
are right that I do sometimes tend to have the same biases as the
author of the above quotes.

But, to be more direct about this particular issue, I've actually had
the usecase Phillip described, and never run into yours.  Yes, it's
rare that I've run into Phillip's described case, but rare is still
more often than never.  That said, I totally accept that I might be an
oddball.  So, I think it's important to look at the alternatives:

* If we make no modifications to --update-refs:
  * the --update-refs behavior is very simple to describe
  * folks with your usecase immediately understand why the copied
    branch was updated, even though you didn't want it to be
  * you have a trivial workaround you can run, as mentioned above
    (git branch -f original_branch original_branch@{1})

* If we modify --update-refs as you suggest:
  * the --update-refs behavior is more complicated to describe to users
  * folks with Phillip's usecase probably assume a bug and report it
    since it isn't going to make any sense to them (and, my guess is,
    many would report a bug even if the behavior is documented)

The downsides for the latter option seem worse to me, so unless the
first usecase is predominant, I'd rather not make a change.  Granted,
you did claim the first usecase would be far more common, and you may
be right, but it's not so clear cut to me; I don't know how to
validate that.  I'd at least first like to hear why the workaround for
your usecase that looks trivial to me is too onerous for you.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2024-03-09  3:28             ` Elijah Newren
@ 2024-03-12  9:28               ` Stefan Haller
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Haller @ 2024-03-12  9:28 UTC (permalink / raw)
  To: Elijah Newren
  Cc: Junio C Hamano, git, Derrick Stolee, Phillip Wood, Christian Couder

On 09.03.24 04:28, Elijah Newren wrote:
>> That would be the wrong way round. I want to leave the original branch
>> untouched, make a new branch and rebase that away from the original.
> 
> Ah, sorry for misunderstanding.  Still, though, what's wrong with running
>     git branch -f original_branch original_branch@{1}
> after the operation?

It's unintuitive. Users don't think this way, at least as far as I have
observed them (and I don't think this way myself). Also, for many users
the branch@{n} syntax to access previous reflog entries is an advanced
concept that they are not familiar with.

> Also, since you're not using the git cli directly but going through
> lazygit, isn't this something you can just include in lazygit as part
> of whatever overall operation is creating the new copy branch and
> rebasing it?

Yes, there are various workarounds that I could build into lazygit.
Right now I'm planning to have lazygit check whether any branch heads
point at any of the commits in the range of commits that is being
rebased except for the head, and if not, add --no-update-refs. This will
solve it well enough for most cases, and it doesn't bother me too much
that I have to add this additional complexity to our code. I was just
hoping that cli users typing

  git checkout -b copy original-branch
  git rebase --onto devel main

would get the same improvement. It bothers me a bit that we have to
build clients around the git cli that make it perform better than the
git cli does.
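
For what it's worth, the check I have in mind for lazygit is roughly the following. This is only a sketch in Python rather than lazygit's actual code; the function name and its inputs are hypothetical, and it assumes the caller has already collected the branch heads and the list of commits being rebased.

```python
def rebase_flags(branch_heads, rebased_commits, head_commit):
    """Return extra flags for `git rebase` under the check described above.

    branch_heads: dict of branch name -> commit id
    rebased_commits: list of commit ids being rebased, including head_commit
    """
    # Only commits strictly below the head count as "the middle of a stack".
    inner = set(rebased_commits) - {head_commit}
    stacked = any(commit in inner for commit in branch_heads.values())
    # No branch points into the middle of the range, so there is no stack
    # to maintain; opt out so that copies at the head stay where they are.
    return [] if stacked else ["--no-update-refs"]


copies = {"original": "c3", "copy": "c3"}
stack = {"part1": "c1", "part2": "c3"}
commits = ["c1", "c2", "c3"]
print(rebase_flags(copies, commits, "c3"))  # ['--no-update-refs']
print(rebase_flags(stack, commits, "c3"))   # []
```

The known limitation is the mixed case: if there is a real stack and also a copy at the head, the refs are still updated and the copy moves along.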

>> Wait, now you are really turning things around. You make it sound like
>> my proposal is responsible for what you call a "bug" here. It's not, git
>> already behaves like this (and you may or may not consider that a
>> problem), and my proposal doesn't change anything about it. It doesn't
>> "fix" it, that's right (and this is what I referred to when I said "I'm
>> fine with it"), but it doesn't make it any worse either.
> 
> Ah, I see where I was unclear as well, and my lack of clarity stemmed
> from not understanding your proposal.  To try to close the loop, allow
> me to re-translate your "This is a good point, but..it never happens
> in practice for me." paragraph, the way I _erroneously_ read it at the
> time:
> 
> """
> For my new proposal, the case you bring up is a good point.  But it
> doesn't happen for me, so I propose to leave it as undefined behavior.
> [As undefined behavior, anyone that triggers it is likely to get
> behavior they deem buggy and not like it, but that won't affect me.]
> """
> 
> Now, obviously, that doesn't sound quite right.  I knew it at the
> time, but reading and re-reading your paragraph, it kept coming out
> that way for me.  Thus I tried to ask if that's what you really meant,
> and apologized in advance in case I was mis-reading.
> 
> Anyway, with the extra explanation in your latest email, I now see
> that you weren't leaving it undefined, but your proposal wasn't clear
> to me either in that paragraph or in combination with the rest of your
> previous email.  Sorry for my misunderstanding.

I think it's worth clarifying this again, and see whether "undefined
behavior" is the right term to use here. Again, this discussion has
improved my own understanding of the matter, so let me try to spell it
out again:

The fundamental underlying problem is that when we encounter two
branches pointing at the same commit in a rebase, git has no way to
distinguish whether this is because there's an "empty" branch in a stack
(either at the top or in the middle), or whether one branch is a copy of
the other. In the first case, both branches should be updated by "rebase
--update-refs", in the second case only one of them should, since the
other is not part of the stack. Since there's no way for git to tell for
sure, it can only guess which of the two was meant by the user, with a
heuristic that hopefully guesses right in the majority of cases. I think
it would be wrong to call it a "bug" (or an "edge case bug" like you did
earlier) if it guesses wrong in a particular scenario.

Right now, it _always_ guesses in favor of the stack, so it never
considers a branch to be a copy. For my own use of git, and that of my
co-workers as I have observed them in pairing sessions, this is almost
always wrong. I have never encountered an empty branch in a stack, as
far as I remember, but I am encountering copies of branches fairly
often, so I'd like to improve the heuristic to make git guess right in
these cases. Note that this is definitely not a 5% thing as in your
three-way merging example; I can't provide any hard numbers of course,
but it feels much more like the classical 80/20 rule to me (where my
proposal would improve it for 80% of the cases, to be clear).

So, I concluded that copies are much more frequent than empty branches
in a stack, so it would make sense for me to turn the heuristic around
and always guess in favor of a copied branch. The problem is that we can
only do this for the tip of the branch, because only in that case can we
tell which branch is the copy (the one being rebased) and which one is
the original that should be left alone. For branches in the middle of
the stack we just can't tell, so we have to guess in favor of an empty
branch in a stack and update both refs, since otherwise we'd have to
randomly pick one of them to update and leave the other one alone, at
the risk of breaking the stack.

So that's really where my proposal comes from: guess in favor of a
copied branch only at the tip but not in the middle; not because we only
want it at the tip, but just because we only can at the tip.

But fortunately, it is in fact true that I almost never create a copy of
a branch in the middle of a stack, but then I almost never have empty
branches in the middle of a stack either, so it doesn't really matter to
me which way the heuristic guesses in this case.
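
To illustrate the difference between today's behavior and the tip-only variant, here is a small toy model. It is a sketch under simplifying assumptions, not git's code: the names are made up, the commit ids are hypothetical, and it lumps the checked-out branch together with the other refs even though git moves that one as part of the rebase itself.

```python
def updated_refs(branch_heads, rebased_commits, current_branch, tip_as_copy):
    """Return the set of branches that end up moved by the rebase.

    tip_as_copy=False models the current behavior (always assume a stack);
    tip_as_copy=True models the proposal (other branches at the tip are copies).
    """
    tip = branch_heads[current_branch]
    updated = set()
    for name, commit in branch_heads.items():
        if commit not in rebased_commits:
            continue  # not part of the rebased range
        if tip_as_copy and commit == tip and name != current_branch:
            continue  # same commit as the tip: treat as the untouched original
        updated.add(name)  # mid-stack refs are always updated, even duplicates
    return updated


heads = {"part1": "c1", "part1-copy": "c1", "original": "c3", "copy": "c3"}
rebased = {"c1", "c2", "c3"}
current = updated_refs(heads, rebased, "copy", tip_as_copy=False)
proposed = updated_refs(heads, rebased, "copy", tip_as_copy=True)
print(sorted(current))   # ['copy', 'original', 'part1', 'part1-copy']
print(sorted(proposed))  # ['copy', 'part1', 'part1-copy']
```

Note that "part1-copy" moves in both variants; that is the "we only can at the tip" part.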

I hope this clarifies it a bit more.

Having written all this, I do realize that it's probably too complex to
explain to users (not the behavior itself, which is fairly simple, but
the rationale behind it).

-Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: Should --update-refs exclude refs pointing to the current HEAD?
  2023-04-17  8:21 Should --update-refs exclude refs pointing to the current HEAD? Stefan Haller
                   ` (3 preceding siblings ...)
  2024-03-05  7:40 ` Stefan Haller
@ 2024-03-24 10:42 ` Stefan Haller
  4 siblings, 0 replies; 18+ messages in thread
From: Stefan Haller @ 2024-03-24 10:42 UTC (permalink / raw)
  To: git; +Cc: Elijah Newren, Phillip Wood

On 17.04.23 10:21, Stefan Haller wrote:
> The --update-refs option of git rebase is so useful that I have it on by
> default in my config. For stacked branches I find it hard to think of
> scenarios where I wouldn't want it.
> 
> However, there are cases for non-stacked branches (i.e. other branches
> pointing at the current HEAD) where updating them is undesirable. In
> fact, pretty much always, for me. Two examples, both very similar:
> 
> 1. I have a topic branch which is based off of master; I want to make a
> copy of that branch and rebase it onto devel, just to try if that would
> work. I don't want the original branch to be moved along in this case.
> 
> 2. I have a topic branch, and I want to make a copy of it to make some
> heavy history rewriting experiments. Again, my interactive rebases would
> always rebase both branches in the same way, not what I want. In this
> case I could work around it by doing the experiments on the original
> branch, creating a tag beforehand that I could reset back to if the
> experiments fail. But maybe I do want to keep both branches around for a
> while for some reason.
> 
> Both of these cases could be fixed by --update-refs not touching any
> refs that point to the current HEAD. I'm having a hard time coming up
> with cases where you would ever want those to be updated, in fact.

Sorry for continuing to beat this dead horse, but I just can't help
adding this other use case that I just ran into yesterday and that
supports my point as well.

Suppose I have a branch b, and I realize that it has commits for two
separate features, so I want to split it up into two independent
branches. The most natural way to do this is to create a branch b2 off
of b, do an interactive rebase on b and drop half of its commits, then
checkout b2, do an interactive rebase on it too and drop the other half
of the commits. With rebase.updateRefs set to true, the first
interactive rebase changes both b and b2, which is not what I want.

Of course you can argue that since I'm doing an interactive rebase in
this case, it's easy to see the update-ref todo and delete it if I don't
want it. That's true, but it's an extra thing that I have to pay
attention to.

Also, there's a twist as I'm writing this from the perspective of
lazygit again. Lazygit has a feature to drop commits from a branch
without doing an interactive rebase; it runs an interactive rebase
behind the scenes, sets the marked commits to "drop", and continues the
rebase. I don't get a chance to delete the update-ref todo in this case.

Now you can argue that this is really lazygit's problem then, and I can
add code to it to delete the update-ref todo in this scenario if that's
what I want. That's true again, and I will, but it bugs me that we have
to add clients around stock git to get the desired behavior. I would
prefer to change git so that it behaves in the desired way in the first
place.

And finally you can argue again that there's Phillip Wood's
counter-example, where you create b2 off of b with the intention to
create a stack, but before you make the first commit on b2 you realize
there's this unwanted commit in b that you want to drop first. In this
case you do want to update both b and b2. My take on this is that 1)
it's a much rarer case in my experience, and 2) it's much easier to
recover from. Once you realize that b2 wasn't updated by the rebase, you
just reset --hard it to b again. This is a straightforward operation
that even novice git users know how to do. Compare this to having to
reset b2 back to where it was when you realize that it was updated by
the rebase and you didn't want it to; you need to get it back from the
reflog in that case, which is a more advanced operation for many users.

-Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2024-03-24 10:42 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-17  8:21 Should --update-refs exclude refs pointing to the current HEAD? Stefan Haller
2023-04-17  8:30 ` Stefan Haller
2023-04-17  8:34 ` Kristoffer Haugsbakk
2023-04-17  9:22   ` Stefan Haller
2023-04-18  2:00     ` Felipe Contreras
2023-04-17 12:14 ` Phillip Wood
2023-04-20 15:27   ` Stefan Haller
2024-03-05  7:40 ` Stefan Haller
2024-03-05 16:22   ` Junio C Hamano
2024-03-06  2:57     ` Elijah Newren
2024-03-06 21:00       ` Stefan Haller
2024-03-07  5:36         ` Elijah Newren
2024-03-07 20:16           ` Stefan Haller
2024-03-09  3:28             ` Elijah Newren
2024-03-12  9:28               ` Stefan Haller
2024-03-07  7:59   ` Kristoffer Haugsbakk
2024-03-07  8:22     ` Elijah Newren
2024-03-24 10:42 ` Stefan Haller
