From: Jerry Zhang <jerry@skydio.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Git Mailing List <git@vger.kernel.org>,
Ross Yeager <ross@skydio.com>, Abraham Bachrach <abe@skydio.com>,
Brian Kubisiak <brian.kubisiak@skydio.com>
Subject: Re: [PATCH] git-rev-list: add --exclude-path-first-parent flag
Date: Tue, 20 Apr 2021 17:16:59 -0700
Message-ID: <CAMKO5Cu68cnUu6UEuwQSHoFQ31g9g4TtYgy5vpe35cr90cETXw@mail.gmail.com>
In-Reply-To: <xmqqczutiddk.fsf@gitster.g>
On Sat, Apr 17, 2021 at 12:22 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> Jerry Zhang <jerry@skydio.com> writes:
>
> > On Fri, Apr 16, 2021 at 5:45 PM Junio C Hamano <gitster@pobox.com> wrote:
> >>
> >> Jerry Zhang <jerry@skydio.com> writes:
> >>
> >> > Add the --exclude-path-first-parent flag,
> >> > which works similarly to --first-parent,
> >> > but affects only the graph traversal for
> >> > the set of commits being excluded.
> >> >
> >> > -A-------E-HEAD
> >> >   \     /
> >> >    B-C-D
> >> >
> >> > In this example, the goal is to return the
> >> > set {B, C, D} which represents a working
> >> > branch that has been merged into main branch
> >> > E. `git rev-list D ^E` will end up returning
> >> > no commits since the exclude path eliminates
> >> > D and its ancestors.
> >> > `git rev-list --exclude-path-first-parent D ^E`
> >> > however will return {B, C, D} as desired.
> >>
> >> It is not clear why you want to have this, instead of doing a more
> >> obvious "D..E^". Even better is "E^..E", which is often what you
> >> want when viewing a history like my 'seen' that is a straight-line
> >> into which tips of branches are merged.
> > My motivation is to find the point at which a release branch forked off from
> > a main branch, even though the release branch could have been merged
> > into the main branch multiple times since it was forked off.
> >
> > If we add another merge from release to main, it will be more clear
> > that those give different results:
> >
> > -A-----E-F-main
> >   \   / /
> >    B-C-D-release
> >
> > `git rev-list --exclude-path-first-parent release ^main` returns {B, C, D}.
> > I've added commit F to show that we don't necessarily have info on E,
> > there could be many commits between it and the tip of main.
>
> OK, you meant to deal with repeated merges into integration branch.
>
> So the idea is to just name the end point merge, say F (you also
> could name D as the starting point, but see below), and
>
> - initially mark its first parent as UNINTERESTING (i.e. E), and
> other parents as INTERESTING (i.e. D).
>
> - run the revision traversal machinery, but when propagating the
> UNINTERESTING bit, give it only to the first parent. The second
> and later parents won't become UNINTERESTING.
>
> - stop after we exhaust INTERESTING commits.
>
> It would probably work for your idealized topology, but I do not
> know what happens when there are criss-cross merges. In the revised
> picture, you are merging down from the B-C-D chain into the
> mainline, but once the B-C-D chain becomes longer and diverges too
> much from the mainline, it becomes tempting to break the "merge only
> in one direction" discipline and merge back from the mainline, to
> "catch up", and such a merge will have the history of B-C-D line of
> development as its first parent. Would that screw up the selection
> of which line of development is uninteresting?
Yeah, this flag (like the --first-parent flag) is useful mainly
because "git merge" always records the "branch you're on" as parent 1
and the "branch being merged in" as parent 2. It is possible to break
this assumption, either with commit-tree or by merging while on one
branch and pushing to another, but then the user should understand
the consequences of doing so. In our case this isn't possible because
a server handles all merges into the main branches.
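
To make the traversal described above concrete, here is a minimal
sketch in Python. It is a toy model, not git's implementation: it first
collects the excluded set (optionally following first parents only),
then walks from the included tips without entering excluded commits,
whereas git's real walker interleaves both by commit date. The parent
lists assume E merges C and F merges D in the second diagram, and the
commits G and H are hypothetical additions used to illustrate the
criss-cross question.

```python
# Toy model of the history discussed in this thread. This is NOT git's
# implementation; it only mimics how the UNINTERESTING mark would spread.

def mark_uninteresting(parents, tips, first_parent_only):
    """Collect commits excluded by the ^tips, optionally via first parents only."""
    marked, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit in marked:
            continue
        marked.add(commit)
        links = parents[commit]
        stack.extend(links[:1] if first_parent_only else links)
    return marked


def rev_list(parents, include, exclude, exclude_first_parent=False):
    """Commits reachable from `include` without passing through excluded ones."""
    uninteresting = mark_uninteresting(parents, exclude, exclude_first_parent)
    shown, stack = set(), list(include)
    while stack:
        commit = stack.pop()
        if commit in shown or commit in uninteresting:
            continue
        shown.add(commit)
        stack.extend(parents[commit])
    return shown


# Parent lists are ordered: first parent first. E = merge of C into main,
# F = merge of D into main (an assumed reading of the diagram).
history = {
    "A": [], "B": ["A"], "C": ["B"], "D": ["C"],
    "E": ["A", "C"], "F": ["E", "D"],
}
plain = rev_list(history, ["D"], ["F"])
first_parent = rev_list(history, ["D"], ["F"], exclude_first_parent=True)
print(sorted(plain), sorted(first_parent))

# Hypothetical catch-up merge G made while on release (release is parent 1).
crisscross = dict(history, G=["D", "F"], H=["G"])
still_ok = rev_list(crisscross, ["G"], ["F"], exclude_first_parent=True)
# Hypothetical: main is fast-forwarded to G and then gains H on top.
broken = rev_list(crisscross, ["G"], ["H"], exclude_first_parent=True)
print(sorted(still_ok), sorted(broken))
```

In this model the plain exclusion returns nothing, while the
first-parent exclusion returns B, C and D. A catch-up merge made on
release keeps working as long as main's own first-parent chain is
untouched; once main is fast-forwarded onto that merge, main's
first-parent chain runs through B-C-D and the result is empty again.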
>
> >> > Add the --exclude-path-first-parent flag,
> >> > which works similarly to --first-parent,
> >> > but affects only the graph traversal for
> >> > the set of commits being excluded.
> >> >
> >> > -A-------E-HEAD
> >> >   \     /
> >> >    B-C-D
>
> In any case, it was totally unclear from the proposed log message,
> and the overlong option name that does not say much did not help me
> guess what you wanted to do with it. Specifically, it is not clear
> what "exclude" means (we do not usually use the word in the context
"Exclude" appears in the first paragraph of the man page for git rev-list:
" List commits that are reachable by following the parent
links from the given commit(s), but exclude commits that
are reachable from the one(s) given with a ^ in front of
them. The output is given in reverse chronological order
by default."
It appears 5+ more times in the man page with the same meaning.
> of revision traversal), and when we talk about "path" in the context
> of revision traversal, we almost always mean the paths to the files,
> i.e. pathspec that limits and simplifies the shape of the history.
"path" is used in the same man page for the flag "--ancestry-path".
I agree that it could be ambiguous though, so perhaps "chain" would
be better.
> Also, it claims that it works similarly to --first-parent, but what
> you are doing is to propagate UNINTERESTING bit on the first-parent
> chain, which ends up showing the side branch (i.e. B-C-D chain),
> without showing the commits on the first-parent chain (A and E).
>
> What are the words that convey the idea behind this operation
> clearly at the conceptual level? Let's think aloud to see if we can
> come up with a better name.
>
> * first parents are uninteresting
>
> * show commits on side branch(es)
>
> * follow side branch.
>
> I think that is closer to the problem you are solving, if I
> understand what you wrote above correctly.
>
> Perhaps --show-side-branch or --follow-side-branch? I dunno.
For my particular use-case I am using it in combination with
--first-parent and a single include and exclude commit to show the
commits on the "side-branch" of the include commit. But if you specify
multiple commits for either or don't use --first-parent, the behavior is
different and I don't think "--side-branch" describes it well in those cases.
Since I don't believe I can predict all use-cases for the flag,
I'd rather name it by what it "does" rather than what it is "for".
If we're concerned about length, maybe "first-parent-not" could
get the meaning across:
- for "rev-list --first-parent A --not B" only first parents are visited
along A's ancestry
- for "rev-list --first-parent-not A --not B" it might be reasonable
that since B is a "not" commit, only first parents are visited along
B's ancestry.
Overall I don't think we can make a name so clear that the user
can avoid the man page anyway.
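
As a rough illustration of how the two restrictions combine, here is a
small Python sketch. It is a toy model under stated assumptions, not
git's code: `include_fp` stands in for --first-parent (applied only to
the included side, as described above), `exclude_fp` stands in for the
flag proposed in this thread, and the history with the hotfix commit X
merged into release is hypothetical.

```python
# Toy model, not git's code: `include_fp` stands in for --first-parent and
# `exclude_fp` for the flag proposed in this thread.

def walk(parents, tips, first_parent_only, stop_at=frozenset()):
    """Follow parent links from `tips`, never entering commits in `stop_at`."""
    seen, stack = set(), list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen or commit in stop_at:
            continue
        seen.add(commit)
        links = parents[commit]
        stack.extend(links[:1] if first_parent_only else links)
    return seen


def rev_list(parents, include, exclude, include_fp=False, exclude_fp=False):
    """Commits reachable from `include` that the `exclude` walk did not mark."""
    excluded = walk(parents, exclude, exclude_fp)
    return walk(parents, include, include_fp, stop_at=excluded)


# Hypothetical history: X is a hotfix merged into release by D (parents C, X);
# E and F are merges of release into main, first parent on the main side.
history = {
    "A": [], "B": ["A"], "C": ["B"], "X": ["B"], "D": ["C", "X"],
    "E": ["A", "C"], "F": ["E", "D"],
}
only_include_fp = rev_list(history, ["D"], ["F"], include_fp=True)
only_exclude_fp = rev_list(history, ["D"], ["F"], exclude_fp=True)
both = rev_list(history, ["D"], ["F"], include_fp=True, exclude_fp=True)
print(sorted(only_include_fp), sorted(only_exclude_fp), sorted(both))
```

In this model, restricting only the included side returns nothing,
restricting only the excluded side returns B, C, D and X, and
restricting both returns just B, C and D, the first-parent chain of
release since it forked from main.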
Thread overview: 9+ messages
2021-04-17 0:15 [PATCH] git-rev-list: add --exclude-path-first-parent flag Jerry Zhang
2021-04-17 0:45 ` Junio C Hamano
2021-04-17 1:07 ` Jerry Zhang
2021-04-17 4:09 ` Felipe Contreras
2021-04-17 7:22 ` Junio C Hamano
2021-04-21 0:16 ` Jerry Zhang [this message]
2021-04-21 0:48 ` [PATCH V2] git-rev-list: add --first-parent-not flag Jerry Zhang
2021-07-28 3:20 ` [PATCH V3] " Jerry Zhang
2021-12-11 2:13 ` Jerry Zhang