All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7] Foreign VCS support
@ 2009-05-27 18:15 Daniel Barkalow
  2009-06-15 16:26 ` Pete Wyckoff
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Barkalow @ 2009-05-27 18:15 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Sverre Rabbelier, Johannes Schindelin

This is a newly-generated and reorganized replacement for db/foreign-scm. 
It applies to recent next.

I'd be particularly interested in feedback from people working on hg 
import/export programs, since that's probably a better user for this code 
than my p4 helper, and also doesn't rely on closed-source code.

The overall goal is to be able to configure a remote in some way and then 
use "git fetch/pull" and "git push" to interact with other version control 
systems, instead of having interface scripts with their own command lines, 
additional instructions, etc.

The main changes are:

 (1) The remote does not need to use a url for the location
 (2) The helper is rearranged to be able to do multiple operations in a 
     single execution, allowing it to keep a single open connection for 
     listing the refs and then fetching them.
 (3) There is some preliminary support for exporting data to the foreign 
     system.

Also, the p4 example helper is significantly more capable and efficient 
than the previously-posted one.

Daniel Barkalow (7):
  Add specification of git-vcs-* helper programs
  Use a function to determine whether a remote is valid
  Allow fetch to modify refs
  Allow programs to not depend on remotes having urls
  Add a config option for remotes to specify a foreign vcs
  Add a transport implementation using git-vcs-* helpers
  Implement git-vcs-p4

 Documentation/config.txt     |    4 +
 Documentation/git-vcs-p4.txt |   47 ++
 Documentation/git-vcs.txt    |   77 +++
 Makefile                     |   24 +
 builtin-clone.c              |    6 +-
 builtin-fetch.c              |   19 +-
 builtin-ls-remote.c          |    4 +-
 builtin-push.c               |   54 ++-
 remote.c                     |   15 +-
 remote.h                     |    2 +
 transport-foreign.c          |  200 +++++++
 transport.c                  |   25 +-
 transport.h                  |   44 ++-
 vcs-p4/p4client-api.cc       |  455 +++++++++++++++
 vcs-p4/p4client.c            |  158 ++++++
 vcs-p4/p4client.h            |   74 +++
 vcs-p4/vcs-p4.c              | 1246 ++++++++++++++++++++++++++++++++++++++++++
 vcs-p4/vcs-p4.h              |  135 +++++
 18 files changed, 2545 insertions(+), 44 deletions(-)
 create mode 100644 Documentation/git-vcs-p4.txt
 create mode 100644 Documentation/git-vcs.txt
 create mode 100644 transport-foreign.c
 create mode 100644 vcs-p4/p4client-api.cc
 create mode 100644 vcs-p4/p4client.c
 create mode 100644 vcs-p4/p4client.h
 create mode 100644 vcs-p4/vcs-p4.c
 create mode 100644 vcs-p4/vcs-p4.h

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/7] Foreign VCS support
  2009-05-27 18:15 [PATCH 0/7] Foreign VCS support Daniel Barkalow
@ 2009-06-15 16:26 ` Pete Wyckoff
  2009-06-15 18:50   ` Daniel Barkalow
  0 siblings, 1 reply; 7+ messages in thread
From: Pete Wyckoff @ 2009-06-15 16:26 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: git, Junio C Hamano, Sverre Rabbelier, Johannes Schindelin

barkalow@iabervon.org wrote on Wed, 27 May 2009 14:15 -0400:
> This is a newly-generated and reorganized replacement for db/foreign-scm. 
> It applies to recent next.

I finally got a chance to look at this new version.  If you have
updates somewhere, let me know.  I had to fix up some merge issues
to put it on top of 1.6.3.2.

First, I'd prefer to use the non-P4-API version, but it doesn't
link:

vcs-p4/vcs-p4.o: In function `p4_integrate':
/u/petew/src/git/vcs-p4/vcs-p4.c:405: undefined reference to `_p4_call_unknown'
vcs-p4/vcs-p4.o: In function `p4_resolve':
/u/petew/src/git/vcs-p4/vcs-p4.c:412: undefined reference to `_p4_call_unknown'
vcs-p4/vcs-p4.o: In function `p4_edit':
/u/petew/src/git/vcs-p4/vcs-p4.c:423: undefined reference to `_p4_call_unknown'
vcs-p4/vcs-p4.o: In function `p4_submit':
/u/petew/src/git/vcs-p4/vcs-p4.c:479: undefined reference to `_p4_call_form_io'
vcs-p4/vcs-p4.o: In function `export_p4':
/u/petew/src/git/vcs-p4/vcs-p4.c:1008: undefined reference to `p4_write_tree'
/u/petew/src/git/vcs-p4/vcs-p4.c:1031: undefined reference to `p4_release_tree'

Next, it took me a while to discover the format of the remote entry.
Can you think of a way that "git remote add ..." could just work?
How to tell it the "vcs = p4" setting at add time, for instance?

[remote "upstream"]
        url = foo
        codeline = //depot/path/to/project
        fetch = +refs/p4/*:refs/remotes/upstream/*
        port = my-p4-server:1666
        vcs = p4

The command "git remote show upstream" complains about "foo" in the url.
Looking at your patches, the only place that sets remote->foreign_vcs is
remote's handle_config().  This isn't going to work for "git remote show
upstream" which calls transport_get() with a NULL remote.

Maybe use urls of the form "vcs-p4://depot/path/to/project".  Might
solve both issues.

It does work, and I have added two unrelated parts of the p4 repo in
the same git.  I'm going to try some much larger scale tests soon.

		-- Pete

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/7] Foreign VCS support
  2009-06-15 16:26 ` Pete Wyckoff
@ 2009-06-15 18:50   ` Daniel Barkalow
  2009-06-16 12:37     ` Pete Wyckoff
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Barkalow @ 2009-06-15 18:50 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Junio C Hamano, Sverre Rabbelier, Johannes Schindelin

On Mon, 15 Jun 2009, Pete Wyckoff wrote:

> barkalow@iabervon.org wrote on Wed, 27 May 2009 14:15 -0400:
> > This is a newly-generated and reorganized replacement for db/foreign-scm. 
> > It applies to recent next.
> 
> I finally got a chance to look at this new version.  If you have
> updates somewhere, let me know.  I had to fix up some merge issues
> to put it on top of 1.6.3.2.

Yeah, I haven't really looked at it since I posted it.

> First, I'd prefer to use the non-P4-API version, but it doesn't
> link:
> 
> vcs-p4/vcs-p4.o: In function `p4_integrate':
> /u/petew/src/git/vcs-p4/vcs-p4.c:405: undefined reference to `_p4_call_unknown'
> vcs-p4/vcs-p4.o: In function `p4_resolve':
> /u/petew/src/git/vcs-p4/vcs-p4.c:412: undefined reference to `_p4_call_unknown'
> vcs-p4/vcs-p4.o: In function `p4_edit':
> /u/petew/src/git/vcs-p4/vcs-p4.c:423: undefined reference to `_p4_call_unknown'
> vcs-p4/vcs-p4.o: In function `p4_submit':
> /u/petew/src/git/vcs-p4/vcs-p4.c:479: undefined reference to `_p4_call_form_io'
> vcs-p4/vcs-p4.o: In function `export_p4':
> /u/petew/src/git/vcs-p4/vcs-p4.c:1008: undefined reference to `p4_write_tree'
> /u/petew/src/git/vcs-p4/vcs-p4.c:1031: undefined reference to `p4_release_tree'

Sorry about that; I've been using the api version (because calling the 
command-line client with the frequency necessary to fetch stuff upsets our 
P4 admins). I keep meaning to write those methods, but I forgot. It 
shouldn't be too hard to write them if you want to use them. The 
_p4_call_* ones should call p4 and parse the output; the other two can be 
empty because the tree gets written in the filesystem in the call that is 
implemented. Note that these methods are only used in export (at least 
currently).

> Next, it took me a while to discover the format of the remote entry.
> Can you think of a way that "git remote add ..." could just work?
> How to tell it the "vcs = p4" setting at add time, for instance?

Setting the vcs in add is pretty simple, but setting the necessary other 
options would be tricky.

> [remote "upstream"]
>         url = foo
>         codeline = //depot/path/to/project
>         fetch = +refs/p4/*:refs/remotes/upstream/*
>         port = my-p4-server:1666
>         vcs = p4
> 
> The command "git remote show upstream" complains about "foo" in the url.
> Looking at your patches, the only place that sets remote->foreign_vcs is
> remote's handle_config().  This isn't going to work for "git remote show
> upstream" which calls transport_get() with a NULL remote.

I'm not sure why it does that. It actually has the remote, and it uses the 
urls from the remote, but it fails to pass the remote to transport_get(). 
I think it should (with the other changes in this series) be:

transport = transport_get(states->remote, NULL);

where transport_get() gets the remote and figures out the url, rather than 
having different code for trying to determine it.

> Maybe use urls of the form "vcs-p4://depot/path/to/project".  Might
> solve both issues.

For a proper URL, it should be vcs-p4:///depot/path/to/project; but the 
semantics of the "url" attribute aren't right for the codeline here, where 
the same project can consist of content from different locations added 
together as different branches. That's why I went with a vcs-p4-specific 
option instead of using the url at all.

Also, SVN will want to have some part specify the vcs-helper, and still 
have different URL methods for the location(s) to interact with.

> It does work, and I have added two unrelated parts of the p4 repo in
> the same git.  I'm going to try some much larger scale tests soon.

Great. Does you p4 depot have integrates? Does it have tricky integrates? 
I've got some support for it, but it's not nearly as elaborate as what the 
perl people wrote to convert their depot. I've tested it pretty thoroughly 
with tidy history, but I know there are problems with the case where 
someone integrates a file into the project from their sandbox.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/7] Foreign VCS support
  2009-06-15 18:50   ` Daniel Barkalow
@ 2009-06-16 12:37     ` Pete Wyckoff
  2009-06-16 17:29       ` Daniel Barkalow
  0 siblings, 1 reply; 7+ messages in thread
From: Pete Wyckoff @ 2009-06-16 12:37 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: git, Junio C Hamano, Sverre Rabbelier, Johannes Schindelin

barkalow@iabervon.org wrote on Mon, 15 Jun 2009 14:50 -0400:
> On Mon, 15 Jun 2009, Pete Wyckoff wrote:
[..]
> Sorry about that; I've been using the api version (because calling the 
> command-line client with the frequency necessary to fetch stuff upsets our 
> P4 admins). I keep meaning to write those methods, but I forgot. It 
> shouldn't be too hard to write them if you want to use them. The 
> _p4_call_* ones should call p4 and parse the output; the other two can be 
> empty because the tree gets written in the filesystem in the call that is 
> implemented. Note that these methods are only used in export (at least 
> currently).

I shall concentrate on the API version for now too, lest my admins
get testy.

> > Next, it took me a while to discover the format of the remote entry.
> > Can you think of a way that "git remote add ..." could just work?
> > How to tell it the "vcs = p4" setting at add time, for instance?
> 
> Setting the vcs in add is pretty simple, but setting the necessary other 
> options would be tricky.

I'll try to help with a bit of documentation at least once this
appears to work.

> > [remote "upstream"]
> >         url = foo
> >         codeline = //depot/path/to/project
> >         fetch = +refs/p4/*:refs/remotes/upstream/*
> >         port = my-p4-server:1666
> >         vcs = p4
> > 
> > The command "git remote show upstream" complains about "foo" in the url.
> > Looking at your patches, the only place that sets remote->foreign_vcs is
> > remote's handle_config().  This isn't going to work for "git remote show
> > upstream" which calls transport_get() with a NULL remote.
> 
> I'm not sure why it does that. It actually has the remote, and it uses the 
> urls from the remote, but it fails to pass the remote to transport_get(). 
> I think it should (with the other changes in this series) be:
> 
> transport = transport_get(states->remote, NULL);
> 
> where transport_get() gets the remote and figures out the url, rather than 
> having different code for trying to determine it.

Agreed, looks harmless.

> Great. Does you p4 depot have integrates? Does it have tricky integrates? 
> I've got some support for it, but it's not nearly as elaborate as what the 
> perl people wrote to convert their depot. I've tested it pretty thoroughly 
> with tidy history, but I know there are problems with the case where 
> someone integrates a file into the project from their sandbox.

Oh does it have integrates.  :)

Running the p4 filelog command on its own takes 15 min.
Each file has at least one "branch from" and a bunch of "branch
into" lines.  Older files in the repo have some "copy into" lines
too.

And there are plenty with the two-rev "branch from" case that
provoked a comment in your code.  I don't understand it either.
Here's a snippet:

//depot/path/to/project/dir/foo.txt
... #1 change 555 branch on 2001/02/03 by x@y (ktext) 'Branch oldproj @444'
... ... branch into //depot/path/to/branch1/dir/foo.txt#1
... ... branch into //depot/path/to/branch2/dir/foo.txt#1
... ... branch into //depot/path/to/branch3/dir/foo.txt#1
... ... branch from //depot/oldproj/dir/main/foo.txt#1,#28

The name and the "into" branches are in current use.  The last
"from" branch is the historic location from where everything was
moved recently.  Note the extra "main/" directory in there.  This
pattern is common throughout the historic depot format, but it
confuses get_related_file(), which complains and returns NULL.

I don't particularly want to maintain this historic codeline
information, even if vcs-p4 could figure it out.  Is there a way
just to import the history of the file without having it be part of
a discovered codeline?  Perhaps always call get_codeline() and let
me use ignore_codelines to skip the oldproj ones by hand?  I do want
git to know about the "branch1" and similar, though, eventually.

I do set findbranches = false to avoid the big lookup in the list
operation, for now at least.  

You have some extra lines in "list" that duplicate codelines[0]:

-                       git_config(handle_config, remote);
-
-                       ALLOC_GROW(env, env_nr + 1, env_alloc);
-                       env[env_nr++] = NULL;
-

		-- Pete

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/7] Foreign VCS support
  2009-06-16 12:37     ` Pete Wyckoff
@ 2009-06-16 17:29       ` Daniel Barkalow
  2009-06-18 11:53         ` Pete Wyckoff
  0 siblings, 1 reply; 7+ messages in thread
From: Daniel Barkalow @ 2009-06-16 17:29 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Junio C Hamano, Sverre Rabbelier, Johannes Schindelin

On Tue, 16 Jun 2009, Pete Wyckoff wrote:

> barkalow@iabervon.org wrote on Mon, 15 Jun 2009 14:50 -0400:
> > On Mon, 15 Jun 2009, Pete Wyckoff wrote:
> [..]
> > Sorry about that; I've been using the api version (because calling the 
> > command-line client with the frequency necessary to fetch stuff upsets our 
> > P4 admins). I keep meaning to write those methods, but I forgot. It 
> > shouldn't be too hard to write them if you want to use them. The 
> > _p4_call_* ones should call p4 and parse the output; the other two can be 
> > empty because the tree gets written in the filesystem in the call that is 
> > implemented. Note that these methods are only used in export (at least 
> > currently).
> 
> I shall concentrate on the API version for now too, lest my admins
> get testy.

That's probably wise. The non-API version is fine if connections are 
cheap, and just connecting to the server is cheap, but tunneling each 
connection over ssh is not cheap.

> > > Next, it took me a while to discover the format of the remote entry.
> > > Can you think of a way that "git remote add ..." could just work?
> > > How to tell it the "vcs = p4" setting at add time, for instance?
> > 
> > Setting the vcs in add is pretty simple, but setting the necessary other 
> > options would be tricky.
> 
> I'll try to help with a bit of documentation at least once this
> appears to work.

What the options are should be accurately documented already in 
Documentation/git-vcs-p4.txt (unless I forgot stuff). It's just getting 
things set reasonably through a program that's tough. Also note that you 
can set a bunch of stuff globally, most likely, and just a few things 
per-remote (or per-repo).

> > Great. Does you p4 depot have integrates? Does it have tricky integrates? 
> > I've got some support for it, but it's not nearly as elaborate as what the 
> > perl people wrote to convert their depot. I've tested it pretty thoroughly 
> > with tidy history, but I know there are problems with the case where 
> > someone integrates a file into the project from their sandbox.
> 
> Oh does it have integrates.  :)
> 
> Running the p4 filelog command on its own takes 15 min.
> Each file has at least one "branch from" and a bunch of "branch
> into" lines.  Older files in the repo have some "copy into" lines
> too.
> 
> And there are plenty with the two-rev "branch from" case that
> provoked a comment in your code.  I don't understand it either.
> Here's a snippet:
> 
> //depot/path/to/project/dir/foo.txt
> ... #1 change 555 branch on 2001/02/03 by x@y (ktext) 'Branch oldproj @444'
> ... ... branch into //depot/path/to/branch1/dir/foo.txt#1
> ... ... branch into //depot/path/to/branch2/dir/foo.txt#1
> ... ... branch into //depot/path/to/branch3/dir/foo.txt#1
> ... ... branch from //depot/oldproj/dir/main/foo.txt#1,#28
>
> The name and the "into" branches are in current use.  The last
> "from" branch is the historic location from where everything was
> moved recently.  Note the extra "main/" directory in there.  This
> pattern is common throughout the historic depot format, but it
> confuses get_related_file(), which complains and returns NULL.

So you've got a per-project mainline, which is integrated into branches. 
My depot used to have mainlines, but eventually ditched them in favor of 
only having branches, where 1.2 integrates into 1.3, and 1.3 integrates 
into 1.4, and so on.

I'm surprised that the "main" is confusing get_related_file(); I'd think 
that it would decide that //depot/oldproj/dir/main/foo.txt is foo.txt in 
the codeline //depot/oldproj/dir/main/.

> I don't particularly want to maintain this historic codeline
> information, even if vcs-p4 could figure it out.  Is there a way
> just to import the history of the file without having it be part of
> a discovered codeline?  Perhaps always call get_codeline() and let
> me use ignore_codelines to skip the oldproj ones by hand?  I do want
> git to know about the "branch1" and similar, though, eventually.

In general, I just write my fetch rules so that the old codelines get 
discovered and imported (so that the history is there), but then they 
aren't fetched. That is, they're junk in refs/p4/* (which lets git keep 
track of the fact that it imported them), but they don't end up in 
refs/remotes/origin/*.

> I do set findbranches = false to avoid the big lookup in the list
> operation, for now at least.  
> 
> You have some extra lines in "list" that duplicate codelines[0]:
> 
> -                       git_config(handle_config, remote);
> -
> -                       ALLOC_GROW(env, env_nr + 1, env_alloc);
> -                       env[env_nr++] = NULL;
> -

Oops, good catch; I refactored that stuff but forgot to remove some of the 
duplication.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/7] Foreign VCS support
  2009-06-16 17:29       ` Daniel Barkalow
@ 2009-06-18 11:53         ` Pete Wyckoff
  2009-06-18 18:07           ` Daniel Barkalow
  0 siblings, 1 reply; 7+ messages in thread
From: Pete Wyckoff @ 2009-06-18 11:53 UTC (permalink / raw)
  To: Daniel Barkalow
  Cc: git, Junio C Hamano, Sverre Rabbelier, Johannes Schindelin

barkalow@iabervon.org wrote on Tue, 16 Jun 2009 13:29 -0400:
> On Tue, 16 Jun 2009, Pete Wyckoff wrote:
> > //depot/path/to/project/dir/foo.txt
> > ... #1 change 555 branch on 2001/02/03 by x@y (ktext) 'Branch oldproj @444'
> > ... ... branch into //depot/path/to/branch1/dir/foo.txt#1
> > ... ... branch into //depot/path/to/branch2/dir/foo.txt#1
> > ... ... branch into //depot/path/to/branch3/dir/foo.txt#1
> > ... ... branch from //depot/oldproj/dir/main/foo.txt#1,#28
> >
> > The name and the "into" branches are in current use.  The last
> > "from" branch is the historic location from where everything was
> > moved recently.  Note the extra "main/" directory in there.  This
> > pattern is common throughout the historic depot format, but it
> > confuses get_related_file(), which complains and returns NULL.
> 
> So you've got a per-project mainline, which is integrated into branches. 
> My depot used to have mainlines, but eventually ditched them in favor of 
> only having branches, where 1.2 integrates into 1.3, and 1.3 integrates 
> into 1.4, and so on.
> 
> I'm surprised that the "main" is confusing get_related_file(); I'd think 
> that it would decide that //depot/oldproj/dir/main/foo.txt is foo.txt in 
> the codeline //depot/oldproj/dir/main/.

Because I'm trying to get it to fetch //depot/path/to/project
so the base->name is "/dir/foo.txt" and the path is
"//depot/oldproj/dir/main/foo.txt".

I hacked the code to look for the basename as well.  It worked
further, but added a codeline for every directory that had ever
existed.  There's way too many of these.  And took forever, and
stumbled on some odd codeline names that don't appear to be in the
depot.  These triggered a coredump that I also worked around.

> > I don't particularly want to maintain this historic codeline
> > information, even if vcs-p4 could figure it out.  Is there a way
> > just to import the history of the file without having it be part of
> > a discovered codeline?  Perhaps always call get_codeline() and let
> > me use ignore_codelines to skip the oldproj ones by hand?  I do want
> > git to know about the "branch1" and similar, though, eventually.
> 
> In general, I just write my fetch rules so that the old codelines get 
> discovered and imported (so that the history is there), but then they 
> aren't fetched. That is, they're junk in refs/p4/* (which lets git keep 
> track of the fact that it imported them), but they don't end up in 
> refs/remotes/origin/*.

I ended up using a codelineformat regex to pick out just the recent
branches, ignoring all the history.  This did mostly work, like how
the git-p4 script would do things.  That's okay for me, as I didn't
need the gobs of history anyway.  I do need to handle cross-codeline
integrates that happen in the future nicely, though.  Maybe now I
can relax the regexp and have it discover new codelines only if
they appear in future integrates.

Here's the combined patch that I ended up with.  Don't bother
preserving this, just roll the good bits into your patches in the
future.

If you make any nice changes, please send out the patches again.  I
would like to give it another spin once I have some recent
integrates to play with.

		-- Pete


diff --git a/vcs-p4/vcs-p4.c b/vcs-p4/vcs-p4.c
index b1cff2f..dcbda78 100644
--- a/vcs-p4/vcs-p4.c
+++ b/vcs-p4/vcs-p4.c
@@ -96,6 +96,7 @@ static struct p4_codeline *get_codeline(struct p4_depot *depot, const char *path
 	struct p4_codeline **posn, *codeline;
 	int i;
 	unsigned char sha1[20];
+	char *refname;
 
 	if (codeline_regex && regexec(codeline_regex, path, 0, NULL, 0))
 		return NULL;
@@ -103,6 +104,12 @@ static struct p4_codeline *get_codeline(struct p4_depot *depot, const char *path
 	for (posn = &depot->codelines; *posn; posn = &(*posn)->next)
 		if (!strcmp(path, (*posn)->path))
 			return *posn;
+
+	refname = codeline_to_refname(path);
+	if (refname == NULL) {
+		fprintf(stderr, "%s: invalid codeline %s\n", __func__, path);
+		return NULL;
+	}
 	codeline = xcalloc(1, sizeof(*codeline));
 	codeline->depot = depot;
 	codeline->path = xstrdup(path);
@@ -111,7 +118,7 @@ static struct p4_codeline *get_codeline(struct p4_depot *depot, const char *path
 		if (!strcmp(path, ignore_codelines[i]))
 			codeline->ignore = 1;
 
-	codeline->refname = codeline_to_refname(path);
+	codeline->refname = refname;
 	if (!get_sha1(codeline->refname, sha1)) {
 		struct commit *commit = lookup_commit(sha1);
 		char *field;
@@ -129,6 +136,7 @@ static struct p4_codeline *get_codeline(struct p4_depot *depot, const char *path
 		}
 	}
 	*posn = codeline;
+	fprintf(stderr, "added codeline %s\n", codeline->path);
 	return codeline;
 }
 
@@ -210,27 +218,51 @@ static struct p4_file *get_file_by_full(struct p4_codeline *codeline,
 	return *posn;
 }
 
+#define dfprintf(fp, ...)
+/* #define dfprintf fprintf */
+
 static struct p4_file *get_related_file(struct p4_file *base, const char *path)
 {
-	int basenamelen = strlen(base->name);
-	int reldirlen = strlen(path) - basenamelen;
+	int basenamelen, reldirlen;
 	struct p4_codeline *codeline;
-	if (reldirlen > 0 && !strcmp(path + reldirlen, base->name)) {
+	const char *cp;
+
+	cp = base->name;
+strip:
+	basenamelen = strlen(cp);
+	reldirlen = strlen(path) - basenamelen;
+	dfprintf(stderr, "consider base %s in path %s\n", cp, path);
+	if (reldirlen > 0 && !strcmp(path + reldirlen, cp)) {
 		/* File with the same name in another codeline */
-		char *other = xstrndup(path, reldirlen);
-		//printf("# find %s in %s\n", path, other);
+		char *other;
+
+		/* else later filelog will add another / and fail */
+		if (path[reldirlen-1] == '/')
+			--reldirlen;
+		other = xstrndup(path, reldirlen);
+		dfprintf(stderr, "look for %s in %s\n", path, other);
 		codeline = get_codeline(base->codeline->depot, other);
-		free(other);
-		if (codeline)
+		if (codeline) {
+			dfprintf(stderr, "found %s in %s\n", path, other);
+			free(other);
 			return get_file_by_full(codeline, path);
-		return NULL;
+		} else {
+			free(other);
+		}
+	}
+	/* strip just the basename and try again to find it */
+	cp = strrchr(cp, '/');
+	if (cp) {
+		++cp;
+		goto strip;
 	}
+	dfprintf(stderr, "look for %s directly\n", path);
 	codeline = find_codeline(base->codeline->depot, path);
 	if (codeline) {
 		/* File with a different name in some known codeline */
 		return get_file_by_full(codeline, path);
 	}
-	fprintf(stderr, "Trying to identify %s\n", path);
+	dfprintf(stderr, "failed to find codeline for %s\n", path);
 	/* Not in any known codeline; need to recheck this after
 	 * discovering codelines completes.
 	 */
@@ -615,7 +647,7 @@ static void p4_filelog_cb(struct p4_filelog_data *data,
 			rev = strtoul(posn, &posn, 10);
 			if (*posn != ',')
 				break;
-			posn += 2;
+			posn += 2;  /* skip ,# */
 		} while (1);
 		if (from) {
 			struct p4_file *rel_file =
@@ -1168,11 +1200,6 @@ int main(int argc, const char **argv)
 		} else if (!strcmp(buf.buf, "list")) {
 			int i;
 
-			git_config(handle_config, remote);
-
-			ALLOC_GROW(env, env_nr + 1, env_alloc);
-			env[env_nr++] = NULL;
-
 			if (find_new_codelines) {
 				struct p4_codeline *codeline;
 				save_commit_buffer = 1;

^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH 0/7] Foreign VCS support
  2009-06-18 11:53         ` Pete Wyckoff
@ 2009-06-18 18:07           ` Daniel Barkalow
  0 siblings, 0 replies; 7+ messages in thread
From: Daniel Barkalow @ 2009-06-18 18:07 UTC (permalink / raw)
  To: Pete Wyckoff; +Cc: git, Junio C Hamano, Sverre Rabbelier, Johannes Schindelin

On Thu, 18 Jun 2009, Pete Wyckoff wrote:

> barkalow@iabervon.org wrote on Tue, 16 Jun 2009 13:29 -0400:
> > On Tue, 16 Jun 2009, Pete Wyckoff wrote:
> > > //depot/path/to/project/dir/foo.txt
> > > ... #1 change 555 branch on 2001/02/03 by x@y (ktext) 'Branch oldproj @444'
> > > ... ... branch into //depot/path/to/branch1/dir/foo.txt#1
> > > ... ... branch into //depot/path/to/branch2/dir/foo.txt#1
> > > ... ... branch into //depot/path/to/branch3/dir/foo.txt#1
> > > ... ... branch from //depot/oldproj/dir/main/foo.txt#1,#28
> > >
> > > The name and the "into" branches are in current use.  The last
> > > "from" branch is the historic location from where everything was
> > > moved recently.  Note the extra "main/" directory in there.  This
> > > pattern is common throughout the historic depot format, but it
> > > confuses get_related_file(), which complains and returns NULL.
> > 
> > So you've got a per-project mainline, which is integrated into branches. 
> > My depot used to have mainlines, but eventually ditched them in favor of 
> > only having branches, where 1.2 integrates into 1.3, and 1.3 integrates 
> > into 1.4, and so on.
> > 
> > I'm surprised that the "main" is confusing get_related_file(); I'd think 
> > that it would decide that //depot/oldproj/dir/main/foo.txt is foo.txt in 
> > the codeline //depot/oldproj/dir/main/.
> 
> Because I'm trying to get it to fetch //depot/path/to/project
> so the base->name is "/dir/foo.txt" and the path is
> "//depot/oldproj/dir/main/foo.txt".
> 
> I hacked the code to look for the basename as well.  It worked
> further, but added a codeline for every directory that had ever
> existed.

Yeah, that's going to be a problem. If everything in a project could
systematically move, there's no way to determine that the codeline used 
to be //depot/oldproj/ as opposed to anything else. But I suppose it could 
notice that /dir/foo.txt came from (...)/dir/(...)/foo.txt. That is, the 
base name matches, and there's a unique occurence of the name of the 
subdirectory of the root for the file.

> > > I don't particularly want to maintain this historic codeline
> > > information, even if vcs-p4 could figure it out.  Is there a way
> > > just to import the history of the file without having it be part of
> > > a discovered codeline?  Perhaps always call get_codeline() and let
> > > me use ignore_codelines to skip the oldproj ones by hand?  I do want
> > > git to know about the "branch1" and similar, though, eventually.
> > 
> > In general, I just write my fetch rules so that the old codelines get 
> > discovered and imported (so that the history is there), but then they 
> > aren't fetched. That is, they're junk in refs/p4/* (which lets git keep 
> > track of the fact that it imported them), but they don't end up in 
> > refs/remotes/origin/*.
> 
> I ended up using a codelineformat regex to pick out just the recent
> branches, ignoring all the history.  This did mostly work, like how
> the git-p4 script would do things.  That's okay for me, as I didn't
> need the gobs of history anyway.  I do need to handle cross-codeline
> integrates that happen in the future nicely, though.  Maybe now I
> can relax the regexp and have it discover new codelines only if
> they appear in future integrates.

You might be able to set up a codelineformat regex to match any codeline 
in the current depot layout, and then use the findBranches setting.

> Here's the combined patch that I ended up with.  Don't bother
> preserving this, just roll the good bits into your patches in the
> future.

Thanks!

> If you make any nice changes, please send out the patches again.  I
> would like to give it another spin once I have some recent
> integrates to play with.
> 
> 		-- Pete
> 
> 
> diff --git a/vcs-p4/vcs-p4.c b/vcs-p4/vcs-p4.c
> index b1cff2f..dcbda78 100644
> --- a/vcs-p4/vcs-p4.c
> +++ b/vcs-p4/vcs-p4.c
> @@ -96,6 +96,7 @@ static struct p4_codeline *get_codeline(struct p4_depot *depot, const char *path
>  	struct p4_codeline **posn, *codeline;
>  	int i;
>  	unsigned char sha1[20];
> +	char *refname;
>  
>  	if (codeline_regex && regexec(codeline_regex, path, 0, NULL, 0))
>  		return NULL;
> @@ -103,6 +104,12 @@ static struct p4_codeline *get_codeline(struct p4_depot *depot, const char *path
>  	for (posn = &depot->codelines; *posn; posn = &(*posn)->next)
>  		if (!strcmp(path, (*posn)->path))
>  			return *posn;
> +
> +	refname = codeline_to_refname(path);
> +	if (refname == NULL) {
> +		fprintf(stderr, "%s: invalid codeline %s\n", __func__, path);
> +		return NULL;
> +	}

What sorts of invalid codelines was this turning up?

>  	codeline = xcalloc(1, sizeof(*codeline));
>  	codeline->depot = depot;
>  	codeline->path = xstrdup(path);
> @@ -111,7 +118,7 @@ static struct p4_codeline *get_codeline(struct p4_depot *depot, const char *path
>  		if (!strcmp(path, ignore_codelines[i]))
>  			codeline->ignore = 1;
>  
> -	codeline->refname = codeline_to_refname(path);
> +	codeline->refname = refname;
>  	if (!get_sha1(codeline->refname, sha1)) {
>  		struct commit *commit = lookup_commit(sha1);
>  		char *field;
> @@ -129,6 +136,7 @@ static struct p4_codeline *get_codeline(struct p4_depot *depot, const char *path
>  		}
>  	}
>  	*posn = codeline;
> +	fprintf(stderr, "added codeline %s\n", codeline->path);
>  	return codeline;
>  }
>  
> @@ -210,27 +218,51 @@ static struct p4_file *get_file_by_full(struct p4_codeline *codeline,
>  	return *posn;
>  }
>  
> +#define dfprintf(fp, ...)
> +/* #define dfprintf fprintf */
> +
>  static struct p4_file *get_related_file(struct p4_file *base, const char *path)
>  {
> -	int basenamelen = strlen(base->name);
> -	int reldirlen = strlen(path) - basenamelen;
> +	int basenamelen, reldirlen;
>  	struct p4_codeline *codeline;
> -	if (reldirlen > 0 && !strcmp(path + reldirlen, base->name)) {
> +	const char *cp;
> +
> +	cp = base->name;
> +strip:
> +	basenamelen = strlen(cp);
> +	reldirlen = strlen(path) - basenamelen;
> +	dfprintf(stderr, "consider base %s in path %s\n", cp, path);
> +	if (reldirlen > 0 && !strcmp(path + reldirlen, cp)) {
>  		/* File with the same name in another codeline */
> -		char *other = xstrndup(path, reldirlen);
> -		//printf("# find %s in %s\n", path, other);
> +		char *other;
> +
> +		/* else later filelog will add another / and fail */
> +		if (path[reldirlen-1] == '/')
> +			--reldirlen;
> +		other = xstrndup(path, reldirlen);
> +		dfprintf(stderr, "look for %s in %s\n", path, other);
>  		codeline = get_codeline(base->codeline->depot, other);
> -		free(other);
> -		if (codeline)
> +		if (codeline) {
> +			dfprintf(stderr, "found %s in %s\n", path, other);
> +			free(other);
>  			return get_file_by_full(codeline, path);
> -		return NULL;
> +		} else {
> +			free(other);
> +		}
> +	}
> +	/* strip just the basename and try again to find it */
> +	cp = strrchr(cp, '/');
> +	if (cp) {
> +		++cp;
> +		goto strip;
>  	}
> +	dfprintf(stderr, "look for %s directly\n", path);
>  	codeline = find_codeline(base->codeline->depot, path);
>  	if (codeline) {
>  		/* File with a different name in some known codeline */
>  		return get_file_by_full(codeline, path);
>  	}
> -	fprintf(stderr, "Trying to identify %s\n", path);
> +	dfprintf(stderr, "failed to find codeline for %s\n", path);
>  	/* Not in any known codeline; need to recheck this after
>  	 * discovering codelines completes.
>  	 */
> @@ -615,7 +647,7 @@ static void p4_filelog_cb(struct p4_filelog_data *data,
>  			rev = strtoul(posn, &posn, 10);
>  			if (*posn != ',')
>  				break;
> -			posn += 2;
> +			posn += 2;  /* skip ,# */
>  		} while (1);
>  		if (from) {
>  			struct p4_file *rel_file =
> @@ -1168,11 +1200,6 @@ int main(int argc, const char **argv)
>  		} else if (!strcmp(buf.buf, "list")) {
>  			int i;
>  
> -			git_config(handle_config, remote);
> -
> -			ALLOC_GROW(env, env_nr + 1, env_alloc);
> -			env[env_nr++] = NULL;
> -
>  			if (find_new_codelines) {
>  				struct p4_codeline *codeline;
>  				save_commit_buffer = 1;
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-06-18 18:07 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-27 18:15 [PATCH 0/7] Foreign VCS support Daniel Barkalow
2009-06-15 16:26 ` Pete Wyckoff
2009-06-15 18:50   ` Daniel Barkalow
2009-06-16 12:37     ` Pete Wyckoff
2009-06-16 17:29       ` Daniel Barkalow
2009-06-18 11:53         ` Pete Wyckoff
2009-06-18 18:07           ` Daniel Barkalow

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.