* Simplifying work across multiple projects (while tracking relationships among commit histories)
@ 2010-05-31 6:41 Yang Zhang
2010-05-31 8:56 ` Jonathan Nieder
0 siblings, 1 reply; 2+ messages in thread
From: Yang Zhang @ 2010-05-31 6:41 UTC
To: git
After looking at some of the tools/techniques out there for working
with multiple git projects (submodules, subtree merge, braid, repo),
it seems that none are really well-suited for our use case. We're
developing a large system consisting of several components (libraries,
servers, applications, etc.). None of these components will ever exist
or be released as a stand-alone product. We're in "rapid development"
mode, so we're not even close to dealing with e.g. manually
maintaining information on versions/dependencies, and we just want
very tight integration among all the components -- yet the components
do deserve their own disentangled histories and (eventually)
independent branches/tags/versions/etc.
If we were using svn, all the code would live in a single repository,
and that would be all there was to think about this. However, it seems
that our use case (surprisingly) doesn't have a lot of good support in
the DVCS world.
For now, we'll probably just have some simple scripts that basically
do 'for i in $projects' loops for pulls, pushes, commits, etc.
However, this loses a lot of version/dependency information that should
be tracked among the projects -- information that, at the same time,
we're not interested in tracking manually. We're
currently thinking of having a simple system that is initially set up
with a dependency graph among projects, e.g.:
a: no dependencies
b: depends on a
and whenever a commit is made to a project with dependencies (b), the
commit (perhaps in the commit message) contains a reference to the
particular versions of the project(s) it depends on (a) that were
checked out.
The tool could simplify the use of such a scheme, e.g.:
- automatically augmenting commit messages with this information
- on commits/pushes, first commit/push the projects being depended on
- checking out consistent versions of all the projects (or subgraphs thereof)
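
A minimal sketch of the commit-message augmentation, assuming sibling checkouts named after the projects (./a and ./b) under the current directory. The function names, the hard-coded dependency table, and the "Depends-On:" line format are all hypothetical; this is not an existing tool.

```shell
#!/bin/sh
# Hypothetical dependency table; a real tool would read this from a file.
deps_of() {
    case "$1" in
        b) echo "a" ;;   # b depends on a
        *) ;;            # every other project: no dependencies
    esac
}

# Print one "Depends-On: <project> <commit>" line per dependency of $1,
# using the commit currently checked out in each dependency's work tree.
dep_trailers() {
    for dep in $(deps_of "$1"); do
        printf 'Depends-On: %s %s\n' "$dep" "$(git -C "$dep" rev-parse HEAD)"
    done
}

# Commit the staged changes in project $1 with message $2, appending the
# generated lines as a second paragraph when there are any.
commit_with_deps() {
    project=$1
    message=$2
    trailers=$(dep_trailers "$project")
    if [ -n "$trailers" ]; then
        git -C "$project" commit -m "$message" -m "$trailers"
    else
        git -C "$project" commit -m "$message"
    fi
}
```

With changes staged in b, `commit_with_deps b "Use new API"` would record which commit of a was checked out at the time. Checking out a consistent set later would mean parsing those lines back out of `git log`.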
Does this make sense to others? Are we overlooking a better/existing
approach? Would it be worth building this? Any suggestions for
improving on the design described above (e.g. a better approach than
augmenting commit messages)?
--
Yang Zhang
http://yz.mit.edu/
* Re: Simplifying work across multiple projects (while tracking relationships among commit histories)
From: Jonathan Nieder @ 2010-05-31 8:56 UTC
To: Yang Zhang; +Cc: git, Jens Lehmann, Johan Herland
Hi Yang,
Yang Zhang wrote:
> We're
> developing a large system consisting of several components (libraries,
> servers, applications, etc.).
[...]
> For now, we'll probably just have some simple scripts that basically
> do 'for i in $projects' loops for pulls, pushes, commits, etc.
> However, this loses a lot of version/dependency information that should
> be tracked among the projects -- information that, at the same time,
> we're not interested in tracking manually. We're
> currently thinking of having a simple system that is initially set up
> with a dependency graph among projects, e.g.:
>
> a: no dependencies
> b: depends on a
>
> and whenever a commit is made to a project with dependencies (b), the
> commit (perhaps in the commit message) contains a reference to the
> particular versions of the project(s) it depends on (a) that were
> checked out.
It sounds to me like submodules would be a better approach. Because
it fits my great love of complaining (and I would like to hear what
solutions you come up with, if any), let me try to go through the
problems you would run into.
I am not a submodule developer or heavy user, so please check anything
I say before relying on it.
1. Suppose you are working on the program frobber and you notice it
contains a usable sub-component veryfastregexp. So you make a new
repository for it, make sure it builds on its own, and publish.
As always when starting a new project, there is a question of how
much early history to preserve. Probably best to start with a
single commit, and provide a separate branch with
‘filter-branch --subdirectory-filter’ output if you are feeling
generous.
In frobber, you remove the copy of veryfastregexp, add it back
with ‘git submodule add git://someserver/path/to/veryfastregexp’,
commit, and publish.
- New clones must use ‘git clone --recursive’. How do you
advertise this?
- Existing clones must use ‘git submodule update --init’ after
they pull. In fact, it seems to me it’s not a bad idea to
always use ‘git submodule update --init --recursive’ after each
pull. How do you advertise this?
- Incoming patches that touch both veryfastregexp and frobber
have to be split into separate patches for the two projects.
How?
- Pull requests are even worse (or just as bad, depending on
how you solved the previous problem).
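
One possible answer to the "how do you advertise this?" questions above is to ship a small wrapper in the repository, so that the instructions shrink to "use the wrapper instead of plain git pull". A sketch under that assumption follows. The function names are made up, and the only Git behaviour relied on is the one-character prefix that ‘git submodule status’ prints (- for uninitialized, + for a checkout that differs from the recorded commit, U for merge conflicts).

```shell
#!/bin/sh
# Read `git submodule status` output on stdin and print the path of every
# submodule whose line starts with -, + or U.
# Assumes submodule paths contain no whitespace.
stale_submodules() {
    awk '/^[-+U]/ { print $2 }'
}

# Hypothetical wrapper: pull, bring all submodules (including nested ones)
# to the recorded commits, then warn about anything still out of sync.
pull_and_sync() {
    git pull "$@" &&
    git submodule update --init --recursive &&
    stale=$(git submodule status --recursive | stale_submodules) &&
    if [ -n "$stale" ]; then
        echo "warning: submodules still out of sync:" $stale >&2
    fi
}
```

This does nothing for patches or pull requests that span both projects; those still have to be split by hand.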
2. Some people like to use the latest stable version of all components
they use, while other people like to avoid change wherever
possible. I’ll consider the latter sort of person in a moment.
The developers of frobber want to use the latest version on the
master branch for all components. So they try the following:
    git submodule foreach '
        git checkout master &&
        git pull &&
        git submodule update --init --recursive
    '
This checks out a branch tracking the upstream master branch
for each submodule.
Next they run ‘git add -u’ to mention all the updated submodule
versions, test to make sure everything’s okay, and commit.
- This does not bring sub-submodules to the latest version at the
same time. If the frobber developers wanted to do that, they
might try
    git submodule foreach --recursive '
        git checkout master &&
        git pull &&
        git submodule update --init --recursive
    '
Then they run ‘git add -u’, test, and run ‘git commit’. But
the editor informs them that submodules have unstaged changes.
What happened, what are the consequences for others using this
project, and can this be avoided?
I’ll return to this in a moment.
3. Some people never upgrade until forced to.
The veryfastregexp library is a resounding success, picked up
by other people in the company, and rapidly developed. After a
particularly painful upgrade, the developers of frobber take a turn
for the conservative. From now on, their policy is “necessary
fixes only”. So they would like to maintain their own branch and
cherry-pick from master.
They put in a request for privileges to commit to their own branch,
and wait. In the meantime, an important fix comes up. So they
do the only thing they can do: publish a fork of veryfastregexp,
update .gitmodules to point to it, run ‘git add -u’ to register
the version they are using, test, commit, and push.
New clones made with ‘git clone --recursive’ will use the
project-specific version of veryfastregexp.
- Users with existing clones must update the URL with
‘git remote set-url origin <new url>’ from the submodule
or ‘git submodule sync’; otherwise, the next time they run
‘git submodule update’ there will be an ‘unable to checkout’
error. How do you advertise this?
- Suppose a frobber developer runs the following from the
frobber repository:
    cd veryfastregexp && git cherry-pick important-fix
then runs ‘git add -u’ from the toplevel, tests, commits, and
pushes the result. Of course, an important step is missing: he
forgot to push the cherry-picked commit to the
veryfastregexp-frobber repository!
Anyone who tries to pull this change and run ‘git submodule
update’ will find the commit object missing and be unable to
check out the new revision.
An update hook could have prevented this, since from the
server side it is obvious which objects a new clone will have
access to. Where can one find such a hook?
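
Short of a server-side hook, a client-side check run before pushing the superproject can catch the same mistake. The sketch below is a stand-in for the hook asked about, not the hook itself. It assumes a submodule commit counts as published when some remote-tracking branch in that submodule contains it, which only reflects the state as of the last fetch or push. All function names are hypothetical.

```shell
#!/bin/sh
# Succeed if the commit checked out in the repository at $1 is contained in
# at least one of that repository's remote-tracking branches.
is_published() {
    [ -n "$(git -C "$1" branch -r --contains HEAD 2>/dev/null)" ]
}

# Print the directory of every submodule (nested ones included) whose
# checked-out commit does not appear to have been pushed.
unpublished_submodules() {
    git submodule foreach --quiet --recursive pwd |
    while read -r dir; do
        is_published "$dir" || echo "$dir"
    done
}

# Refuse to push the superproject while any submodule looks unpublished.
safe_push() {
    bad=$(unpublished_submodules)
    if [ -n "$bad" ]; then
        echo "error: push these submodules first:" >&2
        echo "$bad" >&2
        return 1
    fi
    git push "$@"
}
```

Later Git releases added ‘git push --recurse-submodules=check’, which performs a comparable check natively.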
The frobber developers’ request for a branch in the veryfastregexp
repository is granted. So they switch .gitmodules back again
and keep the submodule pointed to the for-frobber branch, updating
as needed.
- Now the old recipe
    git submodule foreach --recursive '
        git checkout master &&
        git pull &&
        git submodule update --init --recursive
    '
does not work for them anymore, since it would switch the
submodule from the for-frobber branch back to master. What
should they do to adjust?
How can they make it easy for new people on their team to
get started, too?
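
One way to adjust, sketched here as an assumption rather than an established recipe: record the branch each submodule should follow in .gitmodules, and have the update script read it instead of hard-coding master. The function names are hypothetical. The key submodule.<name>.branch is only a convention as far as this script is concerned, though newer Git versions read the same key for ‘git submodule update --remote’.

```shell
#!/bin/sh
# Print the branch recorded for the submodule named $1 in the .gitmodules
# file $2, falling back to master when no branch is recorded.
submodule_branch() {
    git config -f "$2" --get "submodule.$1.branch" || echo master
}

# For each submodule listed in ./.gitmodules: check out its recorded branch,
# pull, and bring nested submodules to the commits that branch records.
# Run from the top level of the superproject.
update_to_tracked_branches() {
    git config -f .gitmodules --get-regexp '^submodule\..*\.path$' |
    (
        while read -r key path; do
            name=${key#submodule.}
            name=${name%.path}
            branch=$(submodule_branch "$name" .gitmodules)
            (cd "$path" &&
             git checkout "$branch" &&
             git pull &&
             git submodule update --init --recursive) </dev/null || exit 1
        done
    )
}
```

The frobber team would set the key once, for example with ‘git config -f .gitmodules submodule.veryfastregexp.branch for-frobber’, and commit .gitmodules. New team members then get the right branch from the same script as everyone else.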
4. The developers who want all components to be aggressively updated
(see #2 above) need to do something similar. They first switch all
components to point to repositories with branches they own, and
then run aggressively-update, where aggressively-update is a
script something like
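    #!/bin/sh
    # Move to upstream's master; --keep aborts rather than clobber local changes.
    git reset --keep upstream/master &&
    # Recurse: run this same script (it must be on PATH) in every submodule.
    git submodule foreach aggressively-update &&
    # Record the updated submodule commits, then test before committing.
    git add -u &&
    make test &&
    git commit -v &&
    # Force-push, since the reset may have rewound this branch.
    git push -f origin master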
- Can this be modified to pick up new submodules?
Thoughts welcome.
Jonathan