git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Randall S. Becker" <rsbecker@nexbridge.com>
To: "'Emily Shaffer'" <emilyshaffer@google.com>,
	"'Jacob Keller'" <jacob.keller@gmail.com>
Cc: "'Git mailing list'" <git@vger.kernel.org>,
	"'Ævar Arnfjörð Bjarmason'" <avarab@gmail.com>,
	"'Jonathan Nieder'" <jrnieder@gmail.com>,
	"'Albert Cui'" <albertcui@google.com>,
	"'Junio C Hamano'" <gitster@pobox.com>,
	"'Matheus Tavares Bernardino'" <matheus.bernardino@usp.br>
Subject: RE: RFC/Discussion - Submodule UX Improvements
Date: Tue, 20 Apr 2021 15:38:13 -0400	[thread overview]
Message-ID: <01a001d7361c$b73c8560$25b59020$@nexbridge.com> (raw)
In-Reply-To: <YH8hxQKarZW6sU+9@google.com>

On April 20, 2021 2:48 PM, Emily Shaffer wrote:
> On Tue, Apr 20, 2021 at 09:18:05AM -0700, Jacob Keller wrote:
> >
> > On Mon, Apr 19, 2021 at 12:28 PM Randall S. Becker
> > <rsbecker@nexbridge.com> wrote:
> > > On April 19, 2021 3:15 PM, Jacob Keller wrote:
> > > > A sort of dream I had was a flow where I could do something from
> > > > the parent like "git blame <path/to/submodule>/submodule/file" and
> > > > have it present a blame of that files contents keyed on the
> > > > *parent* commit that changed the submodule to have that line, as
> > > > opposed to being forced to go into the submodule and figure out
> > > > what commit introduced it and then go back to the parent and find
> > > > out what commit changed the submodule to include that submodule
> commit.
> > >
> > > Not going to disagree, but are you looking for the blame on the
submodule
> ref file itself or files in the submodule? It's hard to teach git to do a
blame on a
> one-line file.
> > >
> >
> > Well, I would like if "git blame <path/to/submodule>" did.. something
> > other than just fail. Sometimes my brain is working in a "blame where
> > this came from" and I type that out and then get frustrated when it
> > fails. Additionally...
> >
> > > Otherwise, and I think this is what you really are going for, teaching
it to do
> a blame based on "git blame <path/to/submodule>/submodule/file" would be
> very nice and abstracts out the need for the user (or more importantly to
me =
> scripts) to understand that a submodule is involved; however, it is
opening up a
> very large door: "should/could we teach git to abstract submodules out of
every
> command". This would potentially replace a significant part of the use
cases for
> the "git submodule foreach" sub-command. In your ask, the current paradigm
> "cd <path/to/submodule>/submodule && git blame file" or pretty much every
> other command does work, but it requires the user/script to know you have
a
> submodule in the path. So my question is: is this worth the effort? I
don't have a
> good answer to that question. Half of my brain would like this very
much/the
> other half is scared of the impact to the code.
> > >
> > > Just my musings.
> >
> > I'm not asking for "git blame <path/to/submodule>/<file>" to give the
> > the same outout as "cd <path/to/submodule> && git blame <file>"
> >
> > What i'm asking is: given this file, tell me which commit in the
> > parent did the line get introduced. So basically I want to walk over
> > the changes to the submodule pointer and find out when it get
> > introduced into the parent, not when it got introduced into the
> > submodule itself.
> >
> > This is a related question, but it is actually not trivial to go
> > instantly from "it was in xyz submodule commit" to "it was then pulled
> > in by xyz parent commit". It's something that is quite tedious to do
> > manually, especially since the submodule pointer could change
> > arbitrarily so knowing the submodule commit doesn't mean you can
> > simply grep for which commit set the submodule exactly to that commit.
> > Essentially, I want a 'git blame' that ignores all changes which
> > aren't actually the submodule pointer, update.
> >
> > I think that's something that is much harder to do manually, but feels
> > like it should be relatively simple to implement within the blame
> > algorithm. I don't feel like this is something strictly replaceable by
> > "git submodule foreach"
> 
> I think I understand what you're saying. Something like the following
> tree:
> 
> super   sub
> b------->4
>          3
>          2
> a------->1
> 
> producing something like this:
> 
> 'git -C sub blame main.c'
> 
> 1 AU Thor	2020-01-01
> 2 CO Mitter	2020-01-02		int main() {
> 4 AU Thor	2020-01-04		  printf("Hello world!\n");
> 3 Dev E		2020-01-03		  return 0;
> 2 CO Mitter	2020-01-02		}
> 
> and
> 'git blame sub/main.c'
> 
> a Mai N		2020-01-01
> b Senior Dev	2020-01-04		int main() {
> b Senior Dev	2020-01-04		  printf("Hello world!\n");
> b Senior Dev	2020-01-04		  return 0;
> b Senior Dev	2020-01-04		}
> 
> or to put it another way: if we are treating superproject commit as "the
whole
> feature", then it could be useful to see "which feature added this change"
> instead of "which atomic commit inside a feature added this change".
> 
> To me, it sounds expensive to compute... wouldn't you  need to say, for
each
> blame line, "is this commit an ancestor of the commit associated in THIS
> superproject commit? ...how about the next superproject commit?"
> But I also don't have much experience with the blame implementation so
> maybe I'm thinking naively :) :)
> 
> And even if it is expensive, considering that Jacob and Randall both had
> different ideas of what their ideal 'git blame' recursive behavior would
be,
> maybe it makes sense to use a flag to ask for the more expensive behavior,
e.g.
> 'git blame --show-superproject-commit sub/main.c'?

I was partly trying to figure out which path Jacob was requesting and "both"
seem useful to me. Looking at our own super-repo history and comparing what
is in one specific submodule, we have the commit on the submodule ref file
flipping repeatedly between two commits during a period of time (in
timestamp order) where an there were multiple submodule topic branches in
parallel with super-repo topic branches. I'm not saying that was a good
thing, just the reality of one particular icky submodule. Once back on the
main branch, things moved to something rational, but a blame of changing
submodule contents of sub/main.c --show-superproject-commit would lead to
something inherently non-deterministic until the topic branches are all
pruned and/or merged, at least in this degenerative situation. After the
merge, everything was back to deterministic and simple to compute.

Randall


  reply	other threads:[~2021-04-20 19:38 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-16 23:36 RFC/Discussion - Submodule UX Improvements Emily Shaffer
2021-04-18  5:22 ` Christian Couder
2021-04-20 23:10   ` Emily Shaffer
2021-04-19  3:20 ` Philippe Blain
2021-04-20 23:03   ` Emily Shaffer
2021-04-20 23:30     ` Junio C Hamano
2021-04-21  2:27     ` Philippe Blain
2021-04-19 12:56 ` Randall S. Becker
2021-04-19 12:56 ` Aaron Schrab
2021-04-20 18:49   ` Emily Shaffer
2021-04-20 19:29     ` Randall S. Becker
2021-04-19 19:14 ` Jacob Keller
2021-04-19 19:28   ` Randall S. Becker
2021-04-20 16:18     ` Jacob Keller
2021-04-20 18:47       ` Emily Shaffer
2021-04-20 19:38         ` Randall S. Becker [this message]
2021-04-21  6:57         ` Jacob Keller
2021-04-22 15:32 ` Jacob Keller

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='01a001d7361c$b73c8560$25b59020$@nexbridge.com' \
    --to=rsbecker@nexbridge.com \
    --cc=albertcui@google.com \
    --cc=avarab@gmail.com \
    --cc=emilyshaffer@google.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jacob.keller@gmail.com \
    --cc=jrnieder@gmail.com \
    --cc=matheus.bernardino@usp.br \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).