GSOC remote-svn

* GSOC remote-svn
@ 2012-07-22 21:03 Florian Achleitner
  2012-07-22 21:43 ` Jonathan Nieder
                   ` (2 more replies)
  0 siblings, 3 replies; 21+ messages in thread
From: Florian Achleitner @ 2012-07-22 21:03 UTC (permalink / raw)
  To: jrnieder, davidbarr; +Cc: git

Hi!

Referring to Jonathan's concerns in Saturday night's IRC log:

> [22:59:34] <jrnieder> barrbrain, flyingflo: I'm worried about the remote 
helper project
> [23:00:05] <jrnieder> someone needs to review remote-svn.c to catch things 
like that refspec issue which should be straightforward to an experienced eye

Let me explain the refspec issue:

In the existing code in contrib/svn-fe, commits are always imported to 
refs/heads/master; that was hardcoded. So I thought that couldn't be it.
I made the name of the branch to import variable, depending on the name of 
the remote.
But my remote-helper didn't advertise the refspec capability, so 
transport-helper assumed it imports to refs/heads/master, which is the default 
import refspec. The subsequent update of references in store_updated_refs led 
to wrong values after the fetch, which I considered a bug and tried to fix.

In fact I didn't realize that the actual updating of references is not done by 
the remote-helper. I thought the remote-helper would have to evaluate the 
fetch refspec and tell fast-import the correct target branch.
Furthermore I confused 'private namespace' with refs/remotes/<remote's name>/, 
which I considered somehow private too.

After several rounds on the mailing list that showed me I was wrong, I found 
the actual design: the remote helper writes references only to a really 
private namespace under refs/<remote name>/ and doesn't touch anything else. 
By advertising the 'refspec' capability, it tells git-fetch where the private 
refs are, and git-fetch then updates the non-private references according to 
the fetch refspec in some post-processing in store_updated_refs. (Ok, you will 
say "of course!", but I didn't know that I was wrong and it's hidden in some 
1000 lines of code).
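
To make the two steps concrete, here is a rough Python sketch of the idea. It 
is not the code in remote-svn.c; the function names and the remote name 
"svnrepo" are made up for illustration, and the private layout 
refs/<remote name>/ is simply the one described above. The only protocol 
facts it relies on are that a helper answers the 'capabilities' command with 
one capability per line followed by a blank line, and that 'refspec' is one 
of those capabilities.

```python
def capabilities(remote_name):
    """Reply lines for the 'capabilities' command, ending with a blank line."""
    # Assumed layout from this mail: private refs live in refs/<remote name>/.
    private = "refs/" + remote_name + "/*"
    return ["import", "refspec refs/heads/*:" + private, ""]


def map_ref(refspec, ref):
    """Map ref through a 'src:dst' refspec; return None if it does not match."""
    src, dst = refspec.lstrip("+").split(":", 1)
    if src.endswith("*") and dst.endswith("*"):
        prefix = src[:-1]
        if ref.startswith(prefix):
            return dst[:-1] + ref[len(prefix):]
        return None
    return dst if ref == src else None


# Step 1: the helper's advertised refspec decides where fast-import writes.
helper_refspec = capabilities("svnrepo")[1].split(" ", 1)[1]
private_ref = map_ref(helper_refspec, "refs/heads/master")

# Step 2: git-fetch applies the user's fetch refspec on its own afterwards.
fetch_refspec = "+refs/heads/*:refs/remotes/svnrepo/*"
tracking_ref = map_ref(fetch_refspec, "refs/heads/master")

print(private_ref)
print(tracking_ref)
```

The point of the sketch is only that the helper never computes the second 
mapping itself; that part belongs to git-fetch.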

For me that was not very easy to figure out, and it took a lot of time, but I 
think now remote-svn does it right.

> [23:00:38] <jrnieder> (also, remote-svn.c should be at the toplevel so it 
can be tested more easily with tests in t/
> [23:01:10] <jrnieder> and it should not be named remote-svn, since we 
haven't pinned down details about the svn:: conversion yet.  That's why 
Dmitry's was called git-remote-svn-alpha)

Ok. Why is that important? I think if it's not called remote-svn git doesn't 
find it as a helper for the 'svn' protocol. Actually in my local git tree, I 
have a symlink in the toplevel (to simplify PATH).
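
As far as I understand the lookup, for a URL of the form 
<transport>::<address> git runs an executable called git-remote-<transport>. 
A small Python sketch of that rule follows; the function name is 
hypothetical and the character check is only an approximation of what git 
accepts in a transport name.

```python
import re

# Approximation: a transport name made of letters, digits, '+', '.' and '-',
# followed by '::' and a non-empty address.
_TRANSPORT = re.compile(r"^([A-Za-z0-9][A-Za-z0-9+.-]*)::(.+)$")


def helper_for_url(url):
    """Return the helper executable name for '<transport>::<address>' URLs."""
    m = _TRANSPORT.match(url)
    if m is None:
        return None  # not in the '<transport>::<address>' form
    return "git-remote-" + m.group(1)


print(helper_for_url("svn::http://example.com/repo"))
print(helper_for_url("svn-alpha::file:///tmp/repo"))
```

So a helper installed as git-remote-svn-alpha would be reached with 
svn-alpha:: URLs, while svn:: URLs need git-remote-svn.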

> [23:01:45] <jrnieder> I'm happy to review patches but I don't have a lot of 
time for it, which has been a problem:
> [23:02:11] <jrnieder>  * I think I wasn't cc-ed on earlier discussion so 
they seem to come out of the blue.  That's fine, but
> [23:03:05] <jrnieder>  * I really rely on patches that do one logical thing 
with a commit message describing the context and what the patch is trying to 
accomplish.  That makes review way, way easier when it is happening.

Probably I should stop sending proposals or incomplete stuff to the list/you.
The current state is probably easier to view in my GitHub repo.

I think for creating patches that are acceptable I will need to squash and 
split a lot of my development commits after the code is somehow finished and 
no longer experimental.

> [23:04:42] <jrnieder> Also it seems very chaotic: there are basic things 
about remote-svn.c that need fixing, and then patches for other things are 
appearing on top of that.
> [23:04:49] <jrnieder> Help?
> [23:05:26] <jrnieder> thanks, and hope that helps

About the current state:

Tester:
I wrote a small simulation script in Python that mimics svnrdump's behaviour by 
replaying an existing svn dump file from a start rev up to an end rev to test 
incremental imports. I use it together with a little testrepo shell script.
Will need to bring that into t/ later, after figuring out how the test 
framework works. As it's not finished it's not published.
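
Since the script itself is not published, here is only a minimal sketch of 
the replay idea, not the actual tester. It assumes the usual dump layout 
(header lines, a blank line, then Content-length bytes of body) and that the 
consumer tolerates normalized blank lines between records. The names 
read_record and replay are made up for this sketch.

```python
import io


def read_record(src):
    """Return (header_lines, body) for the next dump record, or None at EOF."""
    line = src.readline()
    while line == b"\n":              # skip blank lines between records
        line = src.readline()
    if not line:
        return None
    headers = []
    while line not in (b"", b"\n"):   # header block ends at a blank line
        headers.append(line)
        line = src.readline()
    length = 0
    for header in headers:
        key, _, value = header.partition(b":")
        if key.lower() == b"content-length":
            length = int(value.decode("ascii"))
    return headers, src.read(length)


def replay(src, dst, start_rev, end_rev):
    """Copy the preamble and revisions start_rev..end_rev from src to dst."""
    emit = True                       # format version and UUID are always copied
    while True:
        record = read_record(src)
        if record is None:
            break
        headers, body = record
        if headers[0].startswith(b"Revision-number:"):
            rev = int(headers[0].split(b":", 1)[1].decode("ascii"))
            if rev > end_rev:
                break
            emit = rev >= start_rev   # node records follow their revision
        if emit:
            dst.writelines(headers)
            dst.write(b"\n" + body + b"\n")
```

Usage would be something like replay(open("repo.dump", "rb"), 
sys.stdout.buffer, 5, 10), with the file name being just an example. Reading 
bodies by Content-length means file content that happens to look like a 
header is not mistaken for a new revision.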

Incremental import:
By reading the latest svn revision number from a note attached to the private 
master ref, it starts future imports from the next svn revision. That 
basically works well.
It doesn't reuse mark files. What's the point of reusing them? Dmitry's 
svn-alpha did that.
All I need to know is the revision to start from and the branch I want to add 
commits to, right? It now simply reads that from the note.
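
The lookup can be sketched in a few lines of Python. The note format 
"Revision-number: <n>" and the function name are assumptions for this 
illustration, as is starting from revision 0 when no note exists yet. The 
note text itself could be obtained with git notes show.

```python
import re


def next_start_revision(note_text):
    """Return the svn revision the next incremental import should start from."""
    # Assumed note format: a line of the form "Revision-number: <n>".
    match = re.search(r"^Revision-number: (\d+)$", note_text, re.MULTILINE)
    if match is None:
        return 0  # no usable note: assume a full import from revision 0
    return int(match.group(1)) + 1


print(next_start_revision("Revision-number: 41\n"))
```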

This got stuck on another problem:
Incremental update of the note tree doesn't work. fast-import refuses to 
update the notes tree: '<newsha1> doesn't contain <oldsha1>'.
I don't yet know what's the reason for this.
I'm digging into the internals of notes to find out why.
(There is no such problem with the file tree.)
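
One thing to check, stated as a possibility and not as a diagnosis: 
fast-import refuses a non-forced ref update when the new commit does not have 
the ref's current tip among its ancestors, and a commit command without a 
'from' line on a branch that is new to the fast-import session gets no 
parent. The Python sketch below only builds the text of such a notes commit, 
with and without 'from'. The notes ref name, the committer identity and the 
function names are made up.

```python
def data(payload):
    """Format a fast-import 'data' command for the given bytes."""
    return b"data %d\n" % len(payload) + payload + b"\n"


def notes_commit(notes_ref, commit_ish, note, ident, timestamp, with_from=True):
    """Build a fast-import commit that attaches a note to commit_ish."""
    ref = notes_ref.encode()
    out = b"commit " + ref + b"\n"
    out += b"committer " + ident.encode() + b" %d +0000\n" % timestamp
    out += data(b"Update svn revision note\n")
    if with_from:
        # Make the current tip of the notes ref the parent of this commit.
        out += b"from " + ref + b"^0\n"
    out += b"N inline " + commit_ish.encode() + b"\n"
    out += data(note.encode())
    return out + b"\n"


stream = notes_commit("refs/notes/svn/revs", ":1", "Revision-number: 42\n",
                      "Example <example@example.com>", 1342990980)
print(stream.decode())
```

The with_from=False variant is what the very first import would have to 
produce, since the notes ref does not exist yet at that point.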

This state hasn't hit the list, of course, as it's neither useful nor 
complete.

I often get caught in the triangle of those three processes (git 
transport-helper, fast-import, remote-svn), needing to understand a lot about 
the two existing ones to understand why things don't work and why they need to 
work the way they do.

--
Florian
