* Managing sub-projects
@ 2016-06-18 23:20 Michael Eager
  2016-06-20  2:01 ` Stefan Beller
  0 siblings, 1 reply; 4+ messages in thread
From: Michael Eager @ 2016-06-18 23:20 UTC
  To: Git Mailing List

I'm trying to create a git repository for a tool chain for a proprietary
processor.  I'd like to create a private repo with documentation, build
scripts, etc., which includes several sub-projects: binutils, gcc, newlib,
etc.  Each of the sub-projects will have a branch which has support for
the new processor.  These branches need to be maintained in my repo, not
in the upstream repo.  I want to be able to periodically rebase these
branches from the upstream repo.

I've looked at several schemes, but each one seems to do something other
than what I want.

Git submodule:  Branches created in the sub-projects are pushed to the
upstream repo, not to my repo.  I tried to change origin and created an
upstream reference, but was not able to get changes pushed to my repo.

git subtree:  Does not maintain sub-project history or allow rebase.

git slave:  Requires multiple private repos.  Appears to require the
same branch names in each sub-project.

repo: Appears to work a bit like git submodules, where pushes on the
sub-projects go to the upstream repo, not to the private repo.

Any other ways to do what I want without creating a separate forked
repo for each of the sub-projects?  Or have I misunderstood one of
these schemes?

-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


* Re: Managing sub-projects
  2016-06-18 23:20 Managing sub-projects Michael Eager
@ 2016-06-20  2:01 ` Stefan Beller
  2016-06-21 23:06   ` Michael Eager
  0 siblings, 1 reply; 4+ messages in thread
From: Stefan Beller @ 2016-06-20  2:01 UTC
  To: Michael Eager; +Cc: Git Mailing List

On Sat, Jun 18, 2016 at 4:20 PM, Michael Eager <eager@eagerm.com> wrote:
>
> Any other ways to do what I want without creating a separate forked
> repo for each of the sub-projects?  Or have I misunderstood one of
> these schemes?

I think forking is the way to go here, as you want to have new code
and maintain that.

I assume you want to keep the history for each project separate, so
I would recommend against using subtrees.

git slave looks interesting for your use case. (I looked at it once
before, but have no deep knowledge about it as it is not part of core git)

IIUC the repo tool is tailored to be a multi-repo manager optimised
for usage with Gerrit, not just plain Git. The repo tool tracks branches
in the subproject instead of versions (as submodules do), so consistency
is hard, especially when looking back in history. (Not sure if you care about
that, but if you want to use e.g. git bisect, having an easily reproducible
history is a must)

Personally I would try out submodules.

> Git submodule:  Branches created in the sub-projects are pushed to the
> upstream repo, not to my repo.  I tried to change origin and created an
> upstream reference, but was not able to get changes pushed to my repo.

Beware that there are 2 areas you need to look at. First the submodule repo
needs to have a remote that points away from the project's origin (to your
private fork).

Then you have to look at the superproject:
1) It records the sha1 for each submodule internally.
2) All other information except the tracked sha1s must be user provided;
    the .gitmodules file only contains recommendations (i.e. the url where to
    obtain the submodule from, whether to clone it shallowly, and whether
    we have a specific branch in mind). The contents of that file
    are not binding, e.g. if the url provided in the .gitmodules file becomes
    outdated later, it is still possible to set up the
    submodule/superproject correctly.

However for your business purpose, you would put the url of the private forks
in the recommended URL of the submodules.

As the superproject only tracks the sha1, and has this recommended pointer
where to get the submodule repository from, you need to take special care
in a rebase workflow, because the old, pre-rebase commits become
unreachable in the graph of objects, e.g.:

Say you have a version `abc` in a submodule that is one commit on top of
the canonical project's history, and `abc` is recorded as the sha1 in the
superproject.

Then you rebase the commit in the submodule to a newer version of the upstream,
which then becomes a new commit `def` and `abc` is not referenced any more,
so it can be garbage collected.

This is bad for the history of the superproject, as it then points to
an unreachable commit in its history.

To preserve the historic non-rebased `abc` commit, you could have a
set of branches (or tags) that maintain all the old non-rebased versions.
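
The effect can be reproduced on a throwaway repository. The sketch below
rebases a one-commit branch and then asks which refs can still reach the old
tip, before and after tagging it. The branch and tag names (`upstream`, `port`,
`keep-abc`) are made up for illustration, and `git for-each-ref --contains`
assumes a reasonably recent Git.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repository standing in for the submodule.
cd "$(mktemp -d)"
git init -q sub
cd sub
git symbolic-ref HEAD refs/heads/upstream
echo base > f
git add f
git commit -q -m "upstream: base"

# One local commit on top of the canonical history: this is `abc`.
git checkout -q -b port
echo port > p
git add p
git commit -q -m "port: one commit on top"
abc=$(git rev-parse HEAD)

# Upstream moves on, then the port is rebased: the new tip is `def`.
git checkout -q upstream
echo more >> f
git commit -q -a -m "upstream: moves on"
git checkout -q port
git rebase -q upstream
def=$(git rev-parse HEAD)

# No branch or tag reaches `abc` any more (only the reflog still does).
refs_before=$(git for-each-ref --format='%(refname)' --contains "$abc")

# A tag makes it reachable again, so it survives garbage collection.
git tag keep-abc "$abc"
refs_after=$(git for-each-ref --format='%(refname)' --contains "$abc")

echo "abc=$abc def=$def"
echo "refs reaching abc before tagging: '${refs_before}'"
echo "refs reaching abc after tagging:  '${refs_after}'"
```

The tag only helps other clones once it has been pushed to the repository
that the superproject's recorded URL points at.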

This problem comes up with submodules with any workflow that requires
non-fast-forward changes (forced pushes), I think.

So maybe you need to have an alias in the submodule for rebasing, that
is roughly:

rebase:
    if rebased history is published
        create a tag, e.g.: "$(date -I)-${sha1}"
        (and push that tag here or later?)
    rebase as normal
    carry on with life
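
A possible shell rendering of that pseudocode, as a sketch only. The helper
name `rebase_keeping_old` is hypothetical, "published" is approximated as "some
remote-tracking branch contains HEAD", `date +%Y-%m-%d` is used as a portable
stand-in for `date -I`, and the repositories are throwaway stand-ins created
just for the demonstration.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# rebase_keeping_old <onto>: hypothetical helper following the pseudocode.
rebase_keeping_old() {
    onto=$1
    # Assumption: "published" means a remote-tracking branch contains HEAD.
    if [ -n "$(git branch -r --contains HEAD)" ]; then
        tag="$(date +%Y-%m-%d)-$(git rev-parse --short HEAD)"
        git tag "$tag"
        git push -q origin "refs/tags/$tag"
    fi
    git rebase -q "$onto"
}

# Throwaway setup: a private fork (bare) and a working submodule repository.
work=$(mktemp -d)
cd "$work"
git init -q --bare fork.git
git init -q sub
cd sub
git symbolic-ref HEAD refs/heads/upstream
echo base > f
git add f
git commit -q -m "upstream: base"
git checkout -q -b port
echo port > p
git add p
git commit -q -m "port: new processor support"
git remote add origin "$work/fork.git"
git push -q origin port            # the commit is now published
old=$(git rev-parse HEAD)

# Upstream moves on; rebase the port with the helper.
git checkout -q upstream
echo more >> f
git commit -q -a -m "upstream: moves on"
git checkout -q port
rebase_keeping_old upstream

kept=$(git tag -l)                 # the only tag in this sandbox
echo "old tip $old is kept by tag $kept"
echo "new tip $(git rev-parse HEAD)"
```

After the rebase, the branch itself still has to be force-pushed to the fork;
that step is left out here to keep the sketch close to the pseudocode.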

To get back to your complaint:

>  I tried to change origin and created an
> upstream reference, but was not able to get changes pushed to my repo.

I would imagine this to be

     (cd submodule && git remote set-url origin <your fork> && git push origin)

for plain pushing in the submodule and then

    $EDIT .gitmodules
    # edit submodule.<name>.url to point at <your fork>

to get the superproject correct.
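
Spelled out against throwaway repositories, the two steps could look roughly
like the sketch below. The paths, the submodule name `gcc` and the branch name
`myport` are placeholders, not taken from your setup. Instead of editing
.gitmodules by hand, it uses `git config -f .gitmodules`, followed by
`git submodule sync` to copy the new URL into the local configuration.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-ins: "upstream" is the public project, "fork.git" is <your fork>.
git init -q upstream
echo readme > upstream/README
git -C upstream add README
git -C upstream commit -q -m "upstream: initial"
git init -q --bare fork.git

# Superproject with the submodule initially cloned from upstream.
git init -q super
cd super
# protocol.file.allow is only set because this demo clones from a local path.
git -c protocol.file.allow=always submodule --quiet add "$work/upstream" gcc
git commit -q -m "add gcc submodule"

# Step 1: inside the submodule, point origin at the fork and push there.
(
    cd gcc
    git remote set-url origin "$work/fork.git"
    git push -q origin HEAD:refs/heads/myport
)

# Step 2: record the fork as the recommended URL in .gitmodules ...
git config -f .gitmodules submodule.gcc.url "$work/fork.git"
# ... and copy it into the superproject's local configuration.
git submodule --quiet sync
git add .gitmodules
git commit -q -m "gcc: fetch from the private fork"

echo ".gitmodules url: $(git config -f .gitmodules submodule.gcc.url)"
echo "submodule origin: $(git -C gcc config remote.origin.url)"
```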

Thanks,
Stefan


* Re: Managing sub-projects
  2016-06-20  2:01 ` Stefan Beller
@ 2016-06-21 23:06   ` Michael Eager
  2016-06-21 23:36     ` Stefan Beller
  0 siblings, 1 reply; 4+ messages in thread
From: Michael Eager @ 2016-06-21 23:06 UTC
  To: Stefan Beller; +Cc: Git Mailing List

Hi Stefan!

On 06/19/2016 07:01 PM, Stefan Beller wrote:
> On Sat, Jun 18, 2016 at 4:20 PM, Michael Eager <eager@eagerm.com> wrote:
>>
>> Any other ways to do what I want without creating a separate forked
>> repo for each of the sub-projects?  Or have I misunderstood one of
>> these schemes?
>
> I think forking is the way to go here, as you want to have new code
> and maintain that.

This was my conclusion.

What I originally wanted was a repo with two origins, the upstream for
the master and public branches, and my repo for my branches.  Git may
be able to do all kinds of magic, but this two-origin scheme sounded
strange after I thought about it for a while.

> Personally I would try out submodules.

I've used submodules on another project.  There are some odd quirks,
and lots of web pages which say to avoid submodules like the plague, but I
didn't have lots of trouble.  (After an initial bit of confusion while
getting familiar with submodules, which is what I can say about every
feature in Git.)

>> Git submodule:  Branches created in the sub-projects are pushed to the
>> upstream repo, not to my repo.  I tried to change origin and created an
>> upstream reference, but was not able to get changes pushed to my repo.
>
> Beware that there are 2 areas you need to look at. First the submodule repo
> needs to have a remote that points away from the project's origin (to your
> private fork).

I'll create an "upstream" remote to the project repo, so I can pull/rebase
from the upstream into my forked repo.  The "origin" will point to my repo.
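
Roughly what I have in mind, sketched against throwaway repositories. The
directory names, the branch name `newcpu` and the use of `master` as the
upstream branch are placeholders, not the real projects.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)
cd "$work"

# Stand-ins: "upstream-src" is the project repo, "private.git" is my fork.
git init -q upstream-src
git -C upstream-src symbolic-ref HEAD refs/heads/master
echo v1 > upstream-src/f
git -C upstream-src add f
git -C upstream-src commit -q -m "upstream v1"
git init -q --bare private.git

# Clone the project, then rename the remotes:
# "upstream" = project repo, "origin" = my fork.
git clone -q upstream-src gcc
cd gcc
git remote rename origin upstream
git remote add origin "$work/private.git"

# My branch lives in my fork only.
git checkout -q -b newcpu
echo port > port.c
git add port.c
git commit -q -m "newcpu: port"
git push -q -u origin newcpu

# Later: upstream advances, and I rebase my branch onto it.
echo v2 >> "$work/upstream-src/f"
git -C "$work/upstream-src" commit -q -a -m "upstream v2"
git fetch -q upstream
git rebase -q upstream/master
# The rebase rewrote history, so the push to my fork has to be forced.
git push -q --force-with-lease origin newcpu

git remote -v
```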

> Then you have to look at the superproject:
> 1) It records the sha1 for each submodule internally.
> 2) All other information except the tracked sha1s must be user provided;
>      the .gitmodules file only contains recommendations (i.e. the url where to
>      obtain the submodule from, whether to clone it shallowly, and whether
>      we have a specific branch in mind). The contents of that file
>      are not binding, e.g. if the url provided in the .gitmodules file becomes
>      outdated later, it is still possible to set up the
>      submodule/superproject correctly.
>
> However for your business purpose, you would put the url of the private forks
> in the recommended URL of the submodules.
>
> As the superproject only tracks the sha1, and has this recommended pointer
> where to get the submodule repository from, you need to take special care
> in a rebase workflow, because the old, pre-rebase commits become
> unreachable in the graph of objects, e.g.:
>
> Say you have a version `abc` in a submodule that is one commit on top of
> the canonical project's history, and `abc` is recorded as the sha1 in the
> superproject.
>
> Then you rebase the commit in the submodule to a newer version of the upstream,
> which then becomes a new commit `def` and `abc` is not referenced any more,
> so it can be garbage collected.
>
> This is bad for the history of the superproject, as it then points to
> an unreachable commit in its history.
>
> To preserve the historic non-rebased `abc` commit, you could have a
> set of branches (or tags) that maintain all the old non-rebased versions.

Sounds like every time I rebase, I should tag the repo to annotate this,
and (as a side effect) retain the history.

> This problem comes up with submodules with any workflow that requires
> non-fast-forward changes (forced pushes), I think.
>
> So maybe you need to have an alias in the submodule for rebasing, that
> is roughly:
>
> rebase:
>      if rebased history is published
>          create a tag, e.g.: "$(date -I)-${sha1}"
>          (and push that tag here or later?)
>      rebase as normal
>      carry on with life

What do you mean by "if rebased history is published"?

Generally I'd apply a tag after the rebase was completed successfully,
then push both the updated branch and tags to my repo.

> To get back to your complaint:
>
>>   I tried to change origin and created an
>> upstream reference, but was not able to get changes pushed to my repo.
>
> I would imagine this to be
>
>       (cd submodule && git remote set-url origin <your fork> && git push origin)
>
> for plain pushing in the submodule and then
>
>      $EDIT .gitmodules
>      # edit submodule.<name>.url to point at <your fork>
>
> to get the superproject correct.

Thanks.


-- 
Michael Eager	 eager@eagercon.com
1960 Park Blvd., Palo Alto, CA 94306  650-325-8077


* Re: Managing sub-projects
  2016-06-21 23:06   ` Michael Eager
@ 2016-06-21 23:36     ` Stefan Beller
  0 siblings, 0 replies; 4+ messages in thread
From: Stefan Beller @ 2016-06-21 23:36 UTC
  To: Michael Eager; +Cc: Git Mailing List

Hi Michael,

On Tue, Jun 21, 2016 at 4:06 PM, Michael Eager <eager@eagerm.com> wrote:
> Hi Stefan!
>
> On 06/19/2016 07:01 PM, Stefan Beller wrote:
>>
>> On Sat, Jun 18, 2016 at 4:20 PM, Michael Eager <eager@eagerm.com> wrote:
>>>
>>>
>>> Any other ways to do what I want without creating a separate forked
>>> repo for each of the sub-projects?  Or have I misunderstood one of
>>> these schemes?
>>
>>
>> I think forking is the way to go here, as you want to have new code
>> and maintain that.
>
>
> This was my conclusion.
>
> What I originally wanted was a repo with two origins, the upstream for
> the master and public branches, and my repo for my branches.  Git may
> be able to do all kinds of magic, but this two-origin scheme sounded
> strange after I thought about it for a while.

Well, 2 origins sound strange indeed, but "origin" is just a name to point at
a remote place. You could have ["origin", "private"].

Once upon a time, I used ["mainline", "origin", "<other people's names>", ...],
which I confused myself with, so now I am down to
["origin", "private", "<other people's names>"].

The difference for my workflow is that I have read permissions only
on all but one remote.

>
>> Personally I would try out submodules.
>
>
> I've used submodules on another project.  There are some odd quirks,
> and lots of web pages which say to avoid submodules like the plague, but I
> didn't have lots of trouble.  (After an initial bit of confusion while
> getting familiar with submodules, which is what I can say about every
> feature in Git.)
>
>>> Git submodule:  Branches created in the sub-projects are pushed to the
>>> upstream repo, not to my repo.  I tried to change origin and created an
>>> upstream reference, but was not able to get changes pushed to my repo.
>>
>>
>> Beware that there are 2 areas you need to look at. First the submodule repo
>> needs to have a remote that points away from the project's origin (to your
>> private fork).
>
>
> I'll create an "upstream" remote to the project repo, so I can pull/rebase
> from the upstream into my forked repo.  The "origin" will point to my repo.

That is similar to the "mainline" I mention above. :)

In your workflow, is there such a thing as an upstream of the superproject
containing all these subprojects? I thought that was a collection you are
ultimately creating, such that the superproject has only one remote (your
authoritative copy), while each submodule has 2 remotes: "upstream" (that I
assume to be read only for you) and "origin" (your maintained version, which
then contains stuff that is referenced by your superproject).

>
>
>> Then you have to look at the superproject:
>> 1) It records the sha1 for each submodule internally.
>> 2) All other information except the tracked sha1s must be user provided;
>>      the .gitmodules file only contains recommendations (i.e. the url where to
>>      obtain the submodule from, whether to clone it shallowly, and whether
>>      we have a specific branch in mind). The contents of that file
>>      are not binding, e.g. if the url provided in the .gitmodules file becomes
>>      outdated later, it is still possible to set up the
>>      submodule/superproject correctly.
>>
>> However for your business purpose, you would put the url of the private forks
>> in the recommended URL of the submodules.
>>
>> As the superproject only tracks the sha1, and has this recommended pointer
>> where to get the submodule repository from, you need to take special care
>> in a rebase workflow, because the old, pre-rebase commits become
>> unreachable in the graph of objects, e.g.:
>>
>> Say you have a version `abc` in a submodule that is one commit on top of
>> the canonical project's history, and `abc` is recorded as the sha1 in the
>> superproject.
>>
>> Then you rebase the commit in the submodule to a newer version of the upstream,
>> which then becomes a new commit `def` and `abc` is not referenced any more,
>> so it can be garbage collected.
>>
>> This is bad for the history of the superproject, as it then points to
>> an unreachable commit in its history.
>>
>> To preserve the historic non-rebased `abc` commit, you could have a
>> set of branches (or tags) that maintain all the old non-rebased versions.
>
>
> Sounds like every time I rebase, I should tag the repo to annotate this,
> and (as a side effect) retain the history.
>
>> This problem comes up with submodules with any workflow that requires
>> non-fast-forward changes (forced pushes), I think.
>>
>> So maybe you need to have an alias in the submodule for rebasing, that
>> is roughly:
>>
>> rebase:
>>      if rebased history is published
>>          create a tag, e.g.: "$(date -I)-${sha1}"
>>          (and push that tag here or later?)
>>      rebase as normal
>>      carry on with life
>
>
> What do you mean by "if rebased history is published"?

Bad wording. I meant:

    "If the commits that you are going to rebase are published:"

>
> Generally I'd apply a tag after the rebase was completed successfully,
> then push both the updated branch and tags to my repo.

Sure, you can also tag after rebasing.

You only need to make sure that the superproject never points at history that
is lost during a rebase, which holds in either version. When rebasing first
and then tagging, the tag preserves the history for the next rebase, though,
which is why I did not think of it first.

>
>> To get back to your complaint:
>>
>>>   I tried to change origin and created an
>>> upstream reference, but was not able to get changes pushed to my repo.
>>
>>
>> I would imagine this to be
>>
>>       (cd submodule && git remote set-url origin <your fork> && git push origin)
>>
>> for plain pushing in the submodule and then
>>
>>      $EDIT .gitmodules
>>      # edit submodule.<name>.url to point at <your fork>
>>
>> to get the superproject correct.
>
>
> Thanks.
>

Thanks,
Stefan
