* Migrating SVN to Git, and preserve merge information
@ 2012-04-10 15:18 Nick Douma
  2012-04-10 22:57 ` Andrew Sayers
  0 siblings, 1 reply; 5+ messages in thread
From: Nick Douma @ 2012-04-10 15:18 UTC (permalink / raw)
  To: git

Hi,

I currently have two SVN repositories I'm looking to convert to Git.
While the process itself isn't all that hard (following this guide[1]),
I'm looking to do a bit more than just convert raw commits.

The repositories basically have two main branches at a time, for example:

* trunk
* current version, for example 3.5

Our SVN workflow consisted of working on trunk, and then manually
merging single commits from trunk to the other branch. We quite
consistently mentioned the merged commits in the SVN commit message, and
I'm looking to use this information to generate a more accurate tree in
Git. The commit messages are for example:

trunk  rev 100: Fixed important bug
branch rev 101: Merged r100 from trunk

Or more elaborate:

branch rev 200: Merged r100,r101,r.... from trunk

In these examples, tools like gitk or git log should show a line from
rev 100 to rev 101 in the first example, and lines from r100, r101,rn to
rev 200 in example two.

I have tried to create the parent-child relations above with a custom
grafts file. My script basically finds all merge commits and converts
them into graft lines like so:

<merge commit sha hash> <original git parent sha hash> <sha hash of
merged rev 1> ... <sha hash of merged rev n>
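
As an illustration only, here is a minimal Python sketch of how one such line could be assembled. The function name `graft_line` and the duplicate-skipping behaviour are assumptions made for this example, not the script described in this message. Git reads grafts from `.git/info/grafts`, one line per commit: the commit id followed by its space-separated parent ids.

```python
import re

# A full SHA-1 object name is 40 lowercase hex digits.
SHA1_RE = re.compile(r"^[0-9a-f]{40}$")


def graft_line(merge_commit, original_parent, merged_commits):
    """Return one grafts-file line: the commit, then all of its parents.

    Hypothetical helper: duplicates are skipped so that no parent is
    listed twice on the same line.
    """
    ids = [merge_commit, original_parent]
    for sha in merged_commits:
        if sha not in ids:
            ids.append(sha)
    for sha in ids:
        if not SHA1_RE.match(sha):
            raise ValueError(f"not a full SHA-1: {sha!r}")
    return " ".join(ids)
```

Each returned string would be written as a single line of the grafts file.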

This all works to a certain extent, but the problem arises when trying
to view older history in these repositories. If I use the grafts file,
and do `gitk --all`, gitk freezes. Gitg doesn't show commits before a
certain point. tig and `git log --graph` both show a huge number of
parentless commits near the bottom. All of this leads me to the
conclusion that something is wrong with the method of using grafts,
rather than problems in the individual tools.


Can someone find something obvious that I'm missing in the above
approach? Or alternatively suggest another more appropriate method of
achieving the same results in Git?

Kind regards,

Nick Douma



[1]: http://john.albin.net/git/convert-subversion-to-git



* Re: Migrating SVN to Git, and preserve merge information
  2012-04-10 15:18 Migrating SVN to Git, and preserve merge information Nick Douma
@ 2012-04-10 22:57 ` Andrew Sayers
  2012-04-11  7:24   ` Nick Douma
  0 siblings, 1 reply; 5+ messages in thread
From: Andrew Sayers @ 2012-04-10 22:57 UTC (permalink / raw)
  To: Nick Douma; +Cc: git

Hi Nick,

Would I be right in thinking that a commit like "Merged r100,r101,r102
from trunk" will create three grafts?  If so, that might be the problem.

Git differentiates between "merges" (which include every commit up to
and including the specified one) and "cherry-picks" (which just include
the specified commit), whereas SVN calls both of these "merges".  Grafts
are a way of creating "merges" rather than "cherry-picks" (which git
doesn't have any metadata for), and it's not at all easy to get "merge"
data out of SVN in the general case.  Having said that, it's often a
good enough heuristic to pick the highest revision number mentioned in
the commit message and pretend it's a merge.
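
A rough Python sketch of that heuristic, assuming revisions are written as `r` followed by digits (as in "Merged r100,r101,r102 from trunk"). The regular expression and the function name `highest_merged_revision` are assumptions for illustration; real commit messages will likely need more cases.

```python
import re

# Matches "r100", "r101", ... as whole words; "r...." has no digits and is ignored.
REV_RE = re.compile(r"\br(\d+)\b")


def highest_merged_revision(message):
    """Return the highest revision number mentioned in the message, or None."""
    revisions = [int(number) for number in REV_RE.findall(message)]
    return max(revisions) if revisions else None
```

The single revision returned would then become the one extra parent of the merge commit, instead of listing every mentioned revision as a parent.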

Incidentally, I'm planning to work on this area of SVN->git conversion
in the coming months.  I don't have anything you could use yet, but I
don't suppose the scripts you used are available somewhere?  Getting
revision information out of log files is particularly tricky, and
everyone stumbles over a different set of issues.  I'd be really
interested to pick any nuggets of wisdom out of the approach you took.

	- Andrew


* Re: Migrating SVN to Git, and preserve merge information
  2012-04-10 22:57 ` Andrew Sayers
@ 2012-04-11  7:24   ` Nick Douma
  2012-04-11 11:13     ` Santi Béjar
  2012-04-11 17:58     ` Andrew Sayers
  0 siblings, 2 replies; 5+ messages in thread
From: Nick Douma @ 2012-04-11  7:24 UTC (permalink / raw)
  To: Andrew Sayers; +Cc: git

Hi Andrew,

On 11-04-12 00:57, Andrew Sayers wrote:
> Would I be right in thinking that a commit like "Merged r100,r101,r102
> from trunk" will create three grafts?  If so, that might be the problem.

One SVN 'merge' commit will generate one graft. The graft itself will
contain all revisions mentioned in the merge commit, and whatever the
original git parent was before the graft.

> Git differentiates between "merges" (which include every commit up to
> and including the specified one) and "cherry-picks" (which just include
> the specified commit), whereas SVN calls both of these "merges".  Grafts
> are a way of creating "merges" rather than "cherry-picks" (which git
> doesn't have any metadata for), and it's not at all easy to get "merge"
> data out of SVN in the general case.  Having said that, it's often a
> good enough heuristic to pick the highest revision number mentioned in
> the commit message and pretend it's a merge.

The way we 'merged' in SVN was indeed more like cherry-picking, but I'm
looking to display this information as a merge in Git. I also would like
to include all revisions if possible.

The real problem I seem to be having is not completely understanding how
Git grafts work, because I think I'm hitting some kind of limitation or
bug, or just not using it right.

> Incidentally, I'm planning to work on this area of SVN->git conversion
> in the coming months.  I don't have anything you could use yet, but I
> don't suppose the scripts you used are available somewhere?  Getting
> revision information out of log files is particularly tricky, and
> everyone stumbles over a different set of issues.  I'd be really
> interested to pick any nuggets of wisdom out of the approach you took.

I don't really have any useful script for you at the moment, but the
main approach is this:

* I first tag every Git commit imported from SVN with its original SVN
revision, like: "svn/1234"
* Then I retrieve all commits with "merge" in the message, but not "unmerge"
* Now I extract all revisions from the commit message using a regex or two.
* Using all relevant revisions, I retrieve the corresponding SHA hashes
using the tag names I created in step 1.
* Finally, I write a graft file in the format:
<merge commit> <original git parent> <merge rev 1> ... <merge rev n>
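
The middle steps above (select merge messages, extract revisions, look up hashes by tag name) could be sketched in Python roughly as follows. This is not the actual script: the regex, the function names, and the `sha_by_tag` dictionary (tag name to commit hash, built beforehand from the "svn/NNNN" tags) are all assumptions for illustration.

```python
import re

# Assumed revision syntax: "r" followed by digits, e.g. "r100".
REV_RE = re.compile(r"\br(\d+)\b")


def is_merge_message(message):
    """True if the message mentions "merge" but not "unmerge"."""
    lowered = message.lower()
    return "merge" in lowered and "unmerge" not in lowered


def merged_shas(message, sha_by_tag):
    """Return the hashes of all revisions mentioned in a merge message.

    Revisions without a matching "svn/<rev>" tag are skipped, and each
    hash is returned at most once, in order of first mention.
    """
    if not is_merge_message(message):
        return []
    shas = []
    for rev in REV_RE.findall(message):
        sha = sha_by_tag.get(f"svn/{rev}")
        if sha is not None and sha not in shas:
            shas.append(sha)
    return shas
```

Note that this simple filter also drops any message containing both words, which may or may not be what is wanted.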

Kind regards,

Nick Douma



* Re: Migrating SVN to Git, and preserve merge information
  2012-04-11  7:24   ` Nick Douma
@ 2012-04-11 11:13     ` Santi Béjar
  2012-04-11 17:58     ` Andrew Sayers
  1 sibling, 0 replies; 5+ messages in thread
From: Santi Béjar @ 2012-04-11 11:13 UTC (permalink / raw)
  To: Nick Douma; +Cc: Andrew Sayers, git

On Wed, Apr 11, 2012 at 9:24 AM, Nick Douma <n.douma@nekoconeko.nl> wrote:
> Hi Andrew,
>
> On 11-04-12 00:57, Andrew Sayers wrote:
>> Would I be right in thinking that a commit like "Merged r100,r101,r102
>> from trunk" will create three grafts?  If so, that might be the problem.
>
> One SVN 'merge' commit will generate one graft. The graft itself will
> contain all revisions mentioned in the merge commit, and whatever the
> original git parent was before the graft.

This graft does not represent the SVN 'merge'. There are two types of
SVN 'merges', and you are describing the cherry-pick kind, so you have
to represent them as such: as cherry-picks. A cherry-pick in git is
just a normal commit that happens to be applied to a different branch
(commit). It normally has the same commit message, except that some
people prefer to add "(cherry picked from commit ...)". I think this is
just what you are looking for.

So I would recommend simply rewriting the SVN revision ids in the commit
messages as the corresponding git commit SHA-1s. Only where you have a
real merge (rev XXX: Merge all history up to rXXX-1 from trunk) should
you use the graft approach.
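
A minimal Python sketch of such a message rewrite, assuming revisions appear as `r` followed by digits and that a mapping from SVN revision number to git hash is already available. The names `rewrite_revisions` and `sha_by_rev` are hypothetical.

```python
import re

# Assumed revision syntax: "r" followed by digits, e.g. "r100".
REV_RE = re.compile(r"\br(\d+)\b")


def rewrite_revisions(message, sha_by_rev):
    """Replace each known "rNNN" in the message with its git hash.

    Revisions missing from sha_by_rev are left untouched.
    """
    def replace(match):
        sha = sha_by_rev.get(int(match.group(1)))
        return sha if sha is not None else match.group(0)

    return REV_RE.sub(replace, message)
```

To apply this to existing history, the function would have to be run from a history-rewriting tool, for example as the message filter of `git filter-branch --msg-filter`. Since rewriting changes commit hashes, the mapping would need to refer to the final hashes.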

>
>> Git differentiates between "merges" (which include every commit up to
>> and including the specified one) and "cherry-picks" (which just include
>> the specified commit), whereas SVN calls both of these "merges".  Grafts
>> are a way of creating "merges" rather than "cherry-picks" (which git
>> doesn't have any metadata for), and it's not at all easy to get "merge"
>> data out of SVN in the general case.  Having said that, it's often a
>> good enough heuristic to pick the highest revision number mentioned in
>> the commit message and pretend it's a merge.
>
> The way we 'merged' in SVN was indeed more like cherry-picking, but I'm
> looking to display this information as a merge in Git. I also would like
> to include all revisions if possible.

But git needs a meaningful history!! If you use merges to represent
cherry-picks it will confuse some of the git tools.

If the question is to "display this information" in a meaningful way
just use the approach outlined above.

>
> The real problem I seem to be having is not completely understanding how
> Git grafts work, because I think I'm hitting some kind of limitation or
> bug, or just not using it right.

If `gitk --all` freezes, there is definitely a bug somewhere. I don't
really know, but maybe the problem is that some parents in your grafts
are descendants of others.
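
One way to test that guess, sketched in Python over an in-memory parent map rather than a real repository. The `parents_of` dictionary (commit to list of its parents, taken from the history without grafts) and both function names are assumptions for illustration.

```python
def ancestors(commit, parents_of):
    """Return every commit reachable from `commit` through parent links."""
    seen = set()
    stack = list(parents_of.get(commit, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents_of.get(current, ()))
    return seen


def redundant_parents(graft_parents, parents_of):
    """Return the graft parents that are ancestors of another graft parent."""
    redundant = []
    for candidate in graft_parents:
        for other in graft_parents:
            if other != candidate and candidate in ancestors(other, parents_of):
                redundant.append(candidate)
                break
    return redundant
```

Against a real repository, newer Git versions can answer the same ancestry question with `git merge-base --is-ancestor <A> <B>`.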

Anyway, we need a minimal method to reproduce it.

HTH,
Santi


* Re: Migrating SVN to Git, and preserve merge information
  2012-04-11  7:24   ` Nick Douma
  2012-04-11 11:13     ` Santi Béjar
@ 2012-04-11 17:58     ` Andrew Sayers
  1 sibling, 0 replies; 5+ messages in thread
From: Andrew Sayers @ 2012-04-11 17:58 UTC (permalink / raw)
  To: Nick Douma; +Cc: git

Thanks Nick for the outline - the main thing I take from this is that
"unmerge" is used in real world revision messages, so it's important for
me to test that any regexp does the right thing there.

It sounds like grafts aren't the right solution for your problem.  I
can't really suggest much without a better understanding of what you're
trying to achieve, but you might be interested in `git cherry`, which
tries to detect cherry-picked commits by comparing the diffs.
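
For reference, `git cherry <upstream> <head>` prints one line per commit on `<head>`: `-` followed by the hash if an equivalent change is already in `<upstream>`, `+` otherwise (with `-v`, the subject follows the hash). A small Python sketch for sorting that output into two lists; the function name `parse_git_cherry` is hypothetical.

```python
def parse_git_cherry(output):
    """Split `git cherry` output into (already_upstream, not_upstream) hashes."""
    already_upstream, not_upstream = [], []
    for line in output.splitlines():
        # At most three fields: sign, hash, and (with -v) the subject.
        parts = line.split(maxsplit=2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        sign, sha = parts[0], parts[1]
        if sign == "-":
            already_upstream.append(sha)
        elif sign == "+":
            not_upstream.append(sha)
    return already_upstream, not_upstream
```

The text passed in would be the captured standard output of the command, for instance with trunk as `<upstream>` and the release branch as `<head>`.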

	- Andrew
