git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/19] Reroll of the remote-vcs-helper series
@ 2009-10-29 18:01 Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 01/19] Use a function to determine whether a remote is valid Sverre Rabbelier
                   ` (18 more replies)
  0 siblings, 19 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland

Reroll of the entire series that is currently in pu, rebased on
current master.

I did not squash the cvs makefile fixes since my Makefile foo is not
that great.

Daniel, please have a close look at the UNSTABLE patch, as I am not
sure at all that I rebased it properly.

I did not have time to work on my git-remote-hg patches yet, but I
might send those two out later today before I get on my flight back
home. If not it'll be tomorrow somewhere :).

Daniel Barkalow (7):
      Use a function to determine whether a remote is valid
      Allow fetch to modify refs
      Allow programs to not depend on remotes having urls
      Add a config option for remotes to specify a foreign vcs
      Add support for "import" helper command
      Allow helpers to report in "list" command that the ref is unchanged
      Fix memory leak in helper method for disconnect

Johan Herland (8):
      Allow helpers to request marks for fast-import
      Basic build infrastructure for Python scripts
      1/2: Add Python support library for CVS remote helper
      2/2: Add Python support library for CVS remote helper
      git-remote-cvs: Remote helper program for CVS repositories
      Add simple selftests of git-remote-cvs functionality
      Fix the Makefile-generated path to the git_remote_cvs package in git-remote-cvs
      More fixes to the git-remote-cvs installation procedure

Johannes Schindelin (1):
      Finally make remote helper support useful

Sverre Rabbelier (3):
      Factor ref updating out of fetch_with_import
      Refactor git_remote_cvs to a more generic git_remote_helpers
      .gitignore: add git-remote-cvs

 .gitignore                              |    1 +
 Documentation/config.txt                |    4 +
 Documentation/git-remote-cvs.txt        |   85 +++
 Documentation/git-remote-helpers.txt    |   22 +-
 Makefile                                |   53 ++
 builtin-clone.c                         |   12 +-
 builtin-fetch.c                         |   10 +-
 builtin-ls-remote.c                     |    4 +-
 builtin-push.c                          |   68 ++-
 configure.ac                            |    3 +
 git-remote-cvs.py                       |  683 +++++++++++++++++++++
 git_remote_helpers/.gitignore           |    2 +
 git_remote_helpers/Makefile             |   35 ++
 git_remote_helpers/__init__.py          |   27 +
 git_remote_helpers/cvs/changeset.py     |  126 ++++
 git_remote_helpers/cvs/commit_states.py |   62 ++
 git_remote_helpers/cvs/cvs.py           |  998 +++++++++++++++++++++++++++++++
 git_remote_helpers/cvs/revision_map.py  |  418 +++++++++++++
 git_remote_helpers/cvs/symbol_cache.py  |  313 ++++++++++
 git_remote_helpers/git/git.py           |  680 +++++++++++++++++++++
 git_remote_helpers/setup.py             |   17 +
 git_remote_helpers/util.py              |  194 ++++++
 remote.c                                |   15 +-
 remote.h                                |    2 +
 t/t9800-remote-cvs-basic.sh             |  524 ++++++++++++++++
 t/t9801-remote-cvs-fetch.sh             |  291 +++++++++
 t/test-lib.sh                           |   10 +
 transport-helper.c                      |  117 ++++-
 transport.c                             |   37 +-
 transport.h                             |   45 ++-
 30 files changed, 4810 insertions(+), 48 deletions(-)

^ permalink raw reply	[flat|nested] 43+ messages in thread

* [PATCH 01/19] Use a function to determine whether a remote is valid
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 02/19] Allow fetch to modify refs Sverre Rabbelier
                   ` (17 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Daniel Barkalow, Sverre Rabbelier

From: Daniel Barkalow <barkalow@iabervon.org>

Currently, it only checks url, but it will allow other things in the future.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	Signed off by me because I fixed the conflict with alias_url.

 remote.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/remote.c b/remote.c
index 73d33f2..15c9cec 100644
--- a/remote.c
+++ b/remote.c
@@ -52,6 +52,11 @@ static struct rewrites rewrites_push;
 #define BUF_SIZE (2048)
 static char buffer[BUF_SIZE];
 
+static int valid_remote(const struct remote *remote)
+{
+	return !!remote->url;
+}
+
 static const char *alias_url(const char *url, struct rewrites *r)
 {
 	int i, j;
@@ -688,14 +693,14 @@ struct remote *remote_get(const char *name)
 
 	ret = make_remote(name, 0);
 	if (valid_remote_nick(name)) {
-		if (!ret->url)
+		if (!valid_remote(ret))
 			read_remotes_file(ret);
-		if (!ret->url)
+		if (!valid_remote(ret))
 			read_branches_file(ret);
 	}
-	if (name_given && !ret->url)
+	if (name_given && !valid_remote(ret))
 		add_url_alias(ret, name);
-	if (!ret->url)
+	if (!valid_remote(ret))
 		return NULL;
 	ret->fetch = parse_fetch_refspec(ret->fetch_refspec_nr, ret->fetch_refspec);
 	ret->push = parse_push_refspec(ret->push_refspec_nr, ret->push_refspec);
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 02/19] Allow fetch to modify refs
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 01/19] Use a function to determine whether a remote is valid Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-30  5:56   ` Daniel Barkalow
  2009-10-29 18:01 ` [UNSTABLE PATCH 03/19] Allow programs to not depend on remotes having urls Sverre Rabbelier
                   ` (16 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Daniel Barkalow, Sverre Rabbelier

From: Daniel Barkalow <barkalow@iabervon.org>

This allows the transport to use the null sha1 for a ref reported to
be present in the remote repository to indicate that a ref exists but
its actual value is presently unknown and will be set if the objects
are fetched.

Also adds documentation to the API to specify exactly what the methods
should do and how they should interpret arguments.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	Signed off by me because I rebased it on master and fixed the
	conflicts with nico's patch.

 builtin-clone.c    |    5 +++--
 transport-helper.c |    4 ++--
 transport.c        |   13 +++++++------
 transport.h        |   41 +++++++++++++++++++++++++++++++++++++++--
 4 files changed, 51 insertions(+), 12 deletions(-)

diff --git a/builtin-clone.c b/builtin-clone.c
index 5762a6f..0042bee 100644
--- a/builtin-clone.c
+++ b/builtin-clone.c
@@ -526,8 +526,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 
 		refs = transport_get_remote_refs(transport);
 		if (refs) {
-			mapped_refs = wanted_peer_refs(refs, refspec);
-			transport_fetch_refs(transport, mapped_refs);
+			struct ref *ref_cpy = wanted_peer_refs(refs, refspec);
+			mapped_refs = ref_cpy;
+			transport_fetch_refs(transport, ref_cpy);
 		}
 	}
 
diff --git a/transport-helper.c b/transport-helper.c
index f57e84c..a0e4219 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -68,7 +68,7 @@ static int disconnect_helper(struct transport *transport)
 }
 
 static int fetch_with_fetch(struct transport *transport,
-			    int nr_heads, const struct ref **to_fetch)
+			    int nr_heads, struct ref **to_fetch)
 {
 	struct child_process *helper = get_helper(transport);
 	FILE *file = xfdopen(helper->out, "r");
@@ -92,7 +92,7 @@ static int fetch_with_fetch(struct transport *transport,
 }
 
 static int fetch(struct transport *transport,
-		 int nr_heads, const struct ref **to_fetch)
+		 int nr_heads, struct ref **to_fetch)
 {
 	struct helper_data *data = transport->data;
 	int i, count;
diff --git a/transport.c b/transport.c
index 644a30a..3822fc7 100644
--- a/transport.c
+++ b/transport.c
@@ -204,7 +204,7 @@ static struct ref *get_refs_via_rsync(struct transport *transport, int for_push)
 }
 
 static int fetch_objs_via_rsync(struct transport *transport,
-				int nr_objs, const struct ref **to_fetch)
+				int nr_objs, struct ref **to_fetch)
 {
 	struct strbuf buf = STRBUF_INIT;
 	struct child_process rsync;
@@ -408,7 +408,7 @@ static struct ref *get_refs_from_bundle(struct transport *transport, int for_pus
 }
 
 static int fetch_refs_from_bundle(struct transport *transport,
-			       int nr_heads, const struct ref **to_fetch)
+			       int nr_heads, struct ref **to_fetch)
 {
 	struct bundle_transport_data *data = transport->data;
 	return unbundle(&data->header, data->fd);
@@ -486,7 +486,7 @@ static struct ref *get_refs_via_connect(struct transport *transport, int for_pus
 }
 
 static int fetch_refs_via_pack(struct transport *transport,
-			       int nr_heads, const struct ref **to_fetch)
+			       int nr_heads, struct ref **to_fetch)
 {
 	struct git_transport_data *data = transport->data;
 	char **heads = xmalloc(nr_heads * sizeof(*heads));
@@ -923,16 +923,17 @@ const struct ref *transport_get_remote_refs(struct transport *transport)
 	return transport->remote_refs;
 }
 
-int transport_fetch_refs(struct transport *transport, const struct ref *refs)
+int transport_fetch_refs(struct transport *transport, struct ref *refs)
 {
 	int rc;
 	int nr_heads = 0, nr_alloc = 0, nr_refs = 0;
-	const struct ref **heads = NULL;
-	const struct ref *rm;
+	struct ref **heads = NULL;
+	struct ref *rm;
 
 	for (rm = refs; rm; rm = rm->next) {
 		nr_refs++;
 		if (rm->peer_ref &&
+		    !is_null_sha1(rm->old_sha1) &&
 		    !hashcmp(rm->peer_ref->old_sha1, rm->old_sha1))
 			continue;
 		ALLOC_GROW(heads, nr_heads + 1, nr_alloc);
diff --git a/transport.h b/transport.h
index c14da6f..503db11 100644
--- a/transport.h
+++ b/transport.h
@@ -18,11 +18,48 @@ struct transport {
 	int (*set_option)(struct transport *connection, const char *name,
 			  const char *value);
 
+	/**
+	 * Returns a list of the remote side's refs. In order to allow
+	 * the transport to try to share connections, for_push is a
+	 * hint as to whether the ultimate operation is a push or a fetch.
+	 *
+	 * If the transport is able to determine the remote hash for
+	 * the ref without a huge amount of effort, it should store it
+	 * in the ref's old_sha1 field; otherwise it should be all 0.
+	 **/
 	struct ref *(*get_refs_list)(struct transport *transport, int for_push);
-	int (*fetch)(struct transport *transport, int refs_nr, const struct ref **refs);
+
+	/**
+	 * Fetch the objects for the given refs. Note that this gets
+	 * an array, and should ignore the list structure.
+	 *
+	 * If the transport did not get hashes for refs in
+	 * get_refs_list(), it should set the old_sha1 fields in the
+	 * provided refs now.
+	 **/
+	int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs);
+
+	/**
+	 * Push the objects and refs. Send the necessary objects, and
+	 * then, for any refs where peer_ref is set and
+	 * peer_ref->new_sha1 is different from old_sha1, tell the
+	 * remote side to update each ref in the list from old_sha1 to
+	 * peer_ref->new_sha1.
+	 *
+	 * Where possible, set the status for each ref appropriately.
+	 *
+	 * The transport must modify new_sha1 in the ref to the new
+	 * value if the remote accepted the change. Note that this
+	 * could be a different value from peer_ref->new_sha1 if the
+	 * process involved generating new commits.
+	 **/
 	int (*push_refs)(struct transport *transport, struct ref *refs, int flags);
 	int (*push)(struct transport *connection, int refspec_nr, const char **refspec, int flags);
 
+	/** get_refs_list(), fetch(), and push_refs() can keep
+	 * resources (such as a connection) reserved for futher
+	 * use. disconnect() releases these resources.
+	 **/
 	int (*disconnect)(struct transport *connection);
 	char *pack_lockfile;
 	signed verbose : 2;
@@ -74,7 +111,7 @@ int transport_push(struct transport *connection,
 
 const struct ref *transport_get_remote_refs(struct transport *transport);
 
-int transport_fetch_refs(struct transport *transport, const struct ref *refs);
+int transport_fetch_refs(struct transport *transport, struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
 int transport_disconnect(struct transport *transport);
 char *transport_anonymize_url(const char *url);
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [UNSTABLE PATCH 03/19] Allow programs to not depend on remotes having urls
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 01/19] Use a function to determine whether a remote is valid Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 02/19] Allow fetch to modify refs Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-30  6:02   ` Daniel Barkalow
  2009-10-29 18:01 ` [PATCH 04/19] Add a config option for remotes to specify a foreign vcs Sverre Rabbelier
                   ` (15 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Daniel Barkalow, Sverre Rabbelier

From: Daniel Barkalow <barkalow@iabervon.org>

For fetch and ls-remote, which use the first url of a remote, have
transport_get() determine this by passing a remote and passing NULL
for the url. For push, which uses every url of a remote, use each url
in turn if there are any, and use NULL if there are none.

This will allow the transport code to do something different if the
location is not specified with a url.

Also, have the message for a fetch say "foreign" if there is no url.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	I rebased this on master and had major conflicts with the
	recent 'advice' series, Daniel, please have a look at this to
	see whether it is sane at all.

 builtin-fetch.c     |    7 +++-
 builtin-ls-remote.c |    4 +-
 builtin-push.c      |   68 +++++++++++++++++++++++++++++++-------------------
 transport.c         |    3 ++
 4 files changed, 52 insertions(+), 30 deletions(-)

diff --git a/builtin-fetch.c b/builtin-fetch.c
index a35a6f8..013a6ba 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -309,7 +309,10 @@ static int store_updated_refs(const char *raw_url, const char *remote_name,
 	if (!fp)
 		return error("cannot open %s: %s\n", filename, strerror(errno));
 
-	url = transport_anonymize_url(raw_url);
+	if (raw_url)
+		url = transport_anonymize_url(raw_url);
+	else
+		url = xstrdup("foreign");
 	for (rm = ref_map; rm; rm = rm->next) {
 		struct ref *ref = NULL;
 
@@ -704,7 +707,7 @@ int cmd_fetch(int argc, const char **argv, const char *prefix)
 	if (!remote)
 		die("Where do you want to fetch from today?");
 
-	transport = transport_get(remote, remote->url[0]);
+	transport = transport_get(remote, NULL);
 	if (verbosity >= 2)
 		transport->verbose = 1;
 	if (verbosity < 0)
diff --git a/builtin-ls-remote.c b/builtin-ls-remote.c
index 78a88f7..4c6fc58 100644
--- a/builtin-ls-remote.c
+++ b/builtin-ls-remote.c
@@ -87,9 +87,9 @@ int cmd_ls_remote(int argc, const char **argv, const char *prefix)
 		}
 	}
 	remote = nongit ? NULL : remote_get(dest);
-	if (remote && !remote->url_nr)
+	if (!nongit && !remote)
 		die("remote %s has no configured URL", dest);
-	transport = transport_get(remote, remote ? remote->url[0] : dest);
+	transport = transport_get(remote, remote ? NULL : dest);
 	if (uploadpack != NULL)
 		transport_set_option(transport, TRANS_OPT_UPLOADPACK, uploadpack);
 
diff --git a/builtin-push.c b/builtin-push.c
index b5cd2cd..d86b93d 100644
--- a/builtin-push.c
+++ b/builtin-push.c
@@ -88,6 +88,36 @@ static void setup_default_push_refspecs(void)
 	}
 }
 
+static int push_with_options(struct transport *transport, int flags)
+{
+	int err;
+	int nonfastforward;
+	if (receivepack)
+		transport_set_option(transport,
+				     TRANS_OPT_RECEIVEPACK, receivepack);
+	if (thin)
+		transport_set_option(transport, TRANS_OPT_THIN, "yes");
+
+	if (flags & TRANSPORT_PUSH_VERBOSE)
+		fprintf(stderr, "Pushing to %s\n", transport->url);
+	err = transport_push(transport, refspec_nr, refspec, flags,
+			     &nonfastforward);
+	err |= transport_disconnect(transport);
+
+	if (!err)
+		return 0;
+
+	error("failed to push some refs to '%s'", transport->url);
+
+	if (nonfastforward && advice_push_nonfastforward) {
+		printf("To prevent you from losing history, non-fast-forward updates were rejected\n"
+		       "Merge the remote changes before pushing again.  See the 'non-fast forward'\n"
+		       "section of 'git push --help' for details.\n");
+	}
+
+	return 1;
+}
+
 static int do_push(const char *repo, int flags)
 {
 	int i, errs;
@@ -136,33 +166,19 @@ static int do_push(const char *repo, int flags)
 		url = remote->url;
 		url_nr = remote->url_nr;
 	}
-	for (i = 0; i < url_nr; i++) {
-		struct transport *transport =
-			transport_get(remote, url[i]);
-		int err;
-		int nonfastforward;
-		if (receivepack)
-			transport_set_option(transport,
-					     TRANS_OPT_RECEIVEPACK, receivepack);
-		if (thin)
-			transport_set_option(transport, TRANS_OPT_THIN, "yes");
-
-		if (flags & TRANSPORT_PUSH_VERBOSE)
-			fprintf(stderr, "Pushing to %s\n", url[i]);
-		err = transport_push(transport, refspec_nr, refspec, flags,
-				     &nonfastforward);
-		err |= transport_disconnect(transport);
-
-		if (!err)
-			continue;
-
-		error("failed to push some refs to '%s'", url[i]);
-		if (nonfastforward && advice_push_nonfastforward) {
-			printf("To prevent you from losing history, non-fast-forward updates were rejected\n"
-			       "Merge the remote changes before pushing again.  See the 'non-fast forward'\n"
-			       "section of 'git push --help' for details.\n");
+	if (url_nr) {
+		for (i = 0; i < url_nr; i++) {
+			struct transport *transport =
+				transport_get(remote, url[i]);
+			if (push_with_options(transport, flags))
+				errs++;
 		}
-		errs++;
+	} else {
+		struct transport *transport =
+			transport_get(remote, NULL);
+
+		if (push_with_options(transport, flags))
+			errs++;
 	}
 	return !!errs;
 }
diff --git a/transport.c b/transport.c
index 3822fc7..5ae8db6 100644
--- a/transport.c
+++ b/transport.c
@@ -813,6 +813,9 @@ struct transport *transport_get(struct remote *remote, const char *url)
 	struct transport *ret = xcalloc(1, sizeof(*ret));
 
 	ret->remote = remote;
+
+	if (!url && remote && remote->url)
+		url = remote->url[0];
 	ret->url = url;
 
 	if (!prefixcmp(url, "rsync:")) {
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 04/19] Add a config option for remotes to specify a foreign vcs
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (2 preceding siblings ...)
  2009-10-29 18:01 ` [UNSTABLE PATCH 03/19] Allow programs to not depend on remotes having urls Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 05/19] Add support for "import" helper command Sverre Rabbelier
                   ` (14 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Daniel Barkalow, Sverre Rabbelier

From: Daniel Barkalow <barkalow@iabervon.org>

If this is set, the url is not required, and the transport always uses
a helper named "git-remote-<value>".

It is a separate configuration option in order to allow a sensible
configuration for foreign systems which either have no meaningful urls
for repositories or which require urls that do not specify the system
used by the repository at that location. However, this only affects
how the name of the helper is determined, not anything about the
interaction with the helper, and the contruction is such that, if the
foreign scm does happen to use a co-named url method, a url with that
method may be used directly.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	This has my patch to fix valid_remote to allow for both url
	vcs squashed in.

 Documentation/config.txt |    4 ++++
 remote.c                 |    4 +++-
 remote.h                 |    2 ++
 transport.c              |    5 +++++
 4 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index d1e2120..0d9d369 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1408,6 +1408,10 @@ remote.<name>.tagopt::
 	Setting this value to \--no-tags disables automatic tag following when
 	fetching from remote <name>
 
+remote.<name>.vcs::
+	Setting this to a value <vcs> will cause git to interact with
+	the remote with the git-remote-<vcs> helper.
+
 remotes.<group>::
 	The list of remotes which are fetched by "git remote update
 	<group>".  See linkgit:git-remote[1].
diff --git a/remote.c b/remote.c
index 15c9cec..09bb79c 100644
--- a/remote.c
+++ b/remote.c
@@ -54,7 +54,7 @@ static char buffer[BUF_SIZE];
 
 static int valid_remote(const struct remote *remote)
 {
-	return !!remote->url;
+	return (!!remote->url) || (!!remote->foreign_vcs);
 }
 
 static const char *alias_url(const char *url, struct rewrites *r)
@@ -444,6 +444,8 @@ static int handle_config(const char *key, const char *value, void *cb)
 	} else if (!strcmp(subkey, ".proxy")) {
 		return git_config_string((const char **)&remote->http_proxy,
 					 key, value);
+	} else if (!strcmp(subkey, ".vcs")) {
+		return git_config_string(&remote->foreign_vcs, key, value);
 	}
 	return 0;
 }
diff --git a/remote.h b/remote.h
index 5db8420..ac0ce2f 100644
--- a/remote.h
+++ b/remote.h
@@ -11,6 +11,8 @@ struct remote {
 	const char *name;
 	int origin;
 
+	const char *foreign_vcs;
+
 	const char **url;
 	int url_nr;
 	int url_alloc;
diff --git a/transport.c b/transport.c
index 5ae8db6..13bab4e 100644
--- a/transport.c
+++ b/transport.c
@@ -818,6 +818,11 @@ struct transport *transport_get(struct remote *remote, const char *url)
 		url = remote->url[0];
 	ret->url = url;
 
+	if (remote && remote->foreign_vcs) {
+		transport_helper_init(ret, remote->foreign_vcs);
+		return ret;
+	}
+
 	if (!prefixcmp(url, "rsync:")) {
 		ret->get_refs_list = get_refs_via_rsync;
 		ret->fetch = fetch_objs_via_rsync;
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 05/19] Add support for "import" helper command
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (3 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 04/19] Add a config option for remotes to specify a foreign vcs Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [RFC PATCH 06/19] Factor ref updating out of fetch_with_import Sverre Rabbelier
                   ` (13 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Daniel Barkalow, Sverre Rabbelier

From: Daniel Barkalow <barkalow@iabervon.org>

This command, supported if the "import" capability is advertized,
allows a helper to support fetching by outputting a git-fast-import
stream.

If both "fetch" and "import" are advertized, git itself will use
"fetch" (although other users may use "import" in this case).

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	This has my symref patch squashed in.

 Documentation/git-remote-helpers.txt |   10 ++++++
 transport-helper.c                   |   52 ++++++++++++++++++++++++++++++++++
 2 files changed, 62 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-remote-helpers.txt b/Documentation/git-remote-helpers.txt
index 173ee23..e9aa67e 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -43,6 +43,13 @@ Commands are given by the caller on the helper's standard input, one per line.
 +
 Supported if the helper has the "fetch" capability.
 
+'import' <name>::
+	Produces a fast-import stream which imports the current value
+	of the named ref. It may additionally import other refs as
+	needed to construct the history efficiently.
++
+Supported if the helper has the "import" capability.
+
 If a fatal error occurs, the program writes the error message to
 stderr and exits. The caller should expect that a suitable error
 message has been printed if the child closes the connection without
@@ -57,6 +64,9 @@ CAPABILITIES
 'fetch'::
 	This helper supports the 'fetch' command.
 
+'import'::
+	This helper supports the 'import' command.
+
 REF LIST ATTRIBUTES
 -------------------
 
diff --git a/transport-helper.c b/transport-helper.c
index a0e4219..f840842 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -11,6 +11,7 @@ struct helper_data
 	const char *name;
 	struct child_process *helper;
 	unsigned fetch : 1;
+	unsigned import : 1;
 };
 
 static struct child_process *get_helper(struct transport *transport)
@@ -48,6 +49,8 @@ static struct child_process *get_helper(struct transport *transport)
 			break;
 		if (!strcmp(buf.buf, "fetch"))
 			data->fetch = 1;
+		if (!strcmp(buf.buf, "import"))
+			data->import = 1;
 	}
 	return data->helper;
 }
@@ -91,6 +94,52 @@ static int fetch_with_fetch(struct transport *transport,
 	return 0;
 }
 
+static int get_importer(struct transport *transport, struct child_process *fastimport)
+{
+	struct child_process *helper = get_helper(transport);
+	memset(fastimport, 0, sizeof(*fastimport));
+	fastimport->in = helper->out;
+	fastimport->argv = xcalloc(5, sizeof(*fastimport->argv));
+	fastimport->argv[0] = "fast-import";
+	fastimport->argv[1] = "--quiet";
+
+	fastimport->git_cmd = 1;
+	return start_command(fastimport);
+}
+
+static int fetch_with_import(struct transport *transport,
+			     int nr_heads, struct ref **to_fetch)
+{
+	struct child_process fastimport;
+	struct child_process *helper = get_helper(transport);
+	int i;
+	struct ref *posn;
+	struct strbuf buf = STRBUF_INIT;
+
+	if (get_importer(transport, &fastimport))
+		die("Couldn't run fast-import");
+
+	for (i = 0; i < nr_heads; i++) {
+		posn = to_fetch[i];
+		if (posn->status & REF_STATUS_UPTODATE)
+			continue;
+
+		strbuf_addf(&buf, "import %s\n", posn->name);
+		write_in_full(helper->in, buf.buf, buf.len);
+		strbuf_reset(&buf);
+	}
+	disconnect_helper(transport);
+	finish_command(&fastimport);
+
+	for (i = 0; i < nr_heads; i++) {
+		posn = to_fetch[i];
+		if (posn->status & REF_STATUS_UPTODATE)
+			continue;
+		read_ref(posn->symref ? posn->symref : posn->name, posn->old_sha1);
+	}
+	return 0;
+}
+
 static int fetch(struct transport *transport,
 		 int nr_heads, struct ref **to_fetch)
 {
@@ -108,6 +157,9 @@ static int fetch(struct transport *transport,
 	if (data->fetch)
 		return fetch_with_fetch(transport, nr_heads, to_fetch);
 
+	if (data->import)
+		return fetch_with_import(transport, nr_heads, to_fetch);
+
 	return -1;
 }
 
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [RFC PATCH 06/19] Factor ref updating out of fetch_with_import
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (4 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 05/19] Add support for "import" helper command Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-30  7:10   ` Daniel Barkalow
  2009-10-29 18:01 ` [PATCH 07/19] Allow helpers to report in "list" command that the ref is unchanged Sverre Rabbelier
                   ` (12 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Sverre Rabbelier

Also allow the new update_refs to actually update the refs set, this
way the remote helper can set the value of previously unknown refs.

Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	Daniel, if we can get wanted_peer_refs to keep HEAD as a
	wanted ref somehow this patch could be a lot simpler.

 builtin-clone.c    |    7 +++++++
 builtin-fetch.c    |    3 +++
 transport-helper.c |   15 +++++++++++++++
 transport.c        |    6 ++++++
 transport.h        |    8 ++++++--
 5 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/builtin-clone.c b/builtin-clone.c
index 0042bee..7c90ce2 100644
--- a/builtin-clone.c
+++ b/builtin-clone.c
@@ -529,6 +529,13 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
 			struct ref *ref_cpy = wanted_peer_refs(refs, refspec);
 			mapped_refs = ref_cpy;
 			transport_fetch_refs(transport, ref_cpy);
+			if (transport->update_refs)
+			{
+				ref_cpy = copy_ref_list(refs);
+				transport_update_refs(transport, ref_cpy);
+				refs = ref_cpy;
+				mapped_refs = wanted_peer_refs(refs, refspec);
+			}
 		}
 	}
 
diff --git a/builtin-fetch.c b/builtin-fetch.c
index 013a6ba..c35188b 100644
--- a/builtin-fetch.c
+++ b/builtin-fetch.c
@@ -479,7 +479,10 @@ static int fetch_refs(struct transport *transport, struct ref *ref_map)
 {
 	int ret = quickfetch(ref_map);
 	if (ret)
+	{
 		ret = transport_fetch_refs(transport, ref_map);
+		transport_update_refs(transport, ref_map);
+	}
 	if (!ret)
 		ret |= store_updated_refs(transport->url,
 				transport->remote->name,
diff --git a/transport-helper.c b/transport-helper.c
index f840842..ab40a9a 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -207,6 +207,20 @@ static struct ref *get_refs_list(struct transport *transport, int for_push)
 	return ret;
 }
 
+static int update_refs(struct transport *transport, struct ref *refs)
+{
+	struct ref *ref = refs;
+
+	while (ref) {
+		if (ref->status & REF_STATUS_UPTODATE)
+			continue;
+		read_ref(ref->symref ? ref->symref : ref->name, ref->old_sha1);
+		ref = ref->next;
+	}
+
+	return 0;
+}
+
 int transport_helper_init(struct transport *transport, const char *name)
 {
 	struct helper_data *data = xcalloc(sizeof(*data), 1);
@@ -215,6 +229,7 @@ int transport_helper_init(struct transport *transport, const char *name)
 	transport->data = data;
 	transport->get_refs_list = get_refs_list;
 	transport->fetch = fetch;
+	transport->update_refs = update_refs;
 	transport->disconnect = disconnect_helper;
 	return 0;
 }
diff --git a/transport.c b/transport.c
index 13bab4e..741a3a7 100644
--- a/transport.c
+++ b/transport.c
@@ -966,6 +966,12 @@ int transport_fetch_refs(struct transport *transport, struct ref *refs)
 	return rc;
 }
 
+void transport_update_refs(struct transport *transport, struct ref *refs)
+{
+	if (transport->update_refs)
+		transport->update_refs(transport, refs);
+}
+
 void transport_unlock_pack(struct transport *transport)
 {
 	if (transport->pack_lockfile) {
diff --git a/transport.h b/transport.h
index 503db11..1aba16c 100644
--- a/transport.h
+++ b/transport.h
@@ -32,12 +32,15 @@ struct transport {
 	/**
 	 * Fetch the objects for the given refs. Note that this gets
 	 * an array, and should ignore the list structure.
-	 *
+	 **/
+	int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs);
+
+	/**
 	 * If the transport did not get hashes for refs in
 	 * get_refs_list(), it should set the old_sha1 fields in the
 	 * provided refs now.
 	 **/
-	int (*fetch)(struct transport *transport, int refs_nr, struct ref **refs);
+	int (*update_refs)(struct transport *transport, struct ref *refs);
 
 	/**
 	 * Push the objects and refs. Send the necessary objects, and
@@ -112,6 +115,7 @@ int transport_push(struct transport *connection,
 const struct ref *transport_get_remote_refs(struct transport *transport);
 
 int transport_fetch_refs(struct transport *transport, struct ref *refs);
+void transport_update_refs(struct transport *transport, struct ref *refs);
 void transport_unlock_pack(struct transport *transport);
 int transport_disconnect(struct transport *transport);
 char *transport_anonymize_url(const char *url);
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 07/19] Allow helpers to report in "list" command that the ref is unchanged
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (5 preceding siblings ...)
  2009-10-29 18:01 ` [RFC PATCH 06/19] Factor ref updating out of fetch_with_import Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 08/19] Fix memory leak in helper method for disconnect Sverre Rabbelier
                   ` (11 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Daniel Barkalow

From: Daniel Barkalow <barkalow@iabervon.org>

Helpers may use a line like "? name unchanged" to specify that there
is nothing new at that name, without any git-specific code to
determine the correct response.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
---

	Unchanged.

 Documentation/git-remote-helpers.txt |    4 +++-
 transport-helper.c                   |   22 ++++++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletions(-)

diff --git a/Documentation/git-remote-helpers.txt b/Documentation/git-remote-helpers.txt
index e9aa67e..2c5130f 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -70,7 +70,9 @@ CAPABILITIES
 REF LIST ATTRIBUTES
 -------------------
 
-None are defined yet, but the caller must accept any which are supplied.
+'unchanged'::
+	This ref is unchanged since the last import or fetch, although
+	the helper cannot necessarily determine what value that produced.
 
 Documentation
 -------------
diff --git a/transport-helper.c b/transport-helper.c
index ab40a9a..e093d05 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -163,6 +163,22 @@ static int fetch(struct transport *transport,
 	return -1;
 }
 
+static int has_attribute(const char *attrs, const char *attr) {
+	int len;
+	if (!attrs)
+		return 0;
+
+	len = strlen(attr);
+	for (;;) {
+		const char *space = strchrnul(attrs, ' ');
+		if (len == space - attrs && !strncmp(attrs, attr, len))
+			return 1;
+		if (!*space)
+			return 0;
+		attrs = space + 1;
+	}
+}
+
 static struct ref *get_refs_list(struct transport *transport, int for_push)
 {
 	struct child_process *helper;
@@ -197,6 +213,12 @@ static struct ref *get_refs_list(struct transport *transport, int for_push)
 			(*tail)->symref = xstrdup(buf.buf + 1);
 		else if (buf.buf[0] != '?')
 			get_sha1_hex(buf.buf, (*tail)->old_sha1);
+		if (eon) {
+			if (has_attribute(eon + 1, "unchanged")) {
+				(*tail)->status |= REF_STATUS_UPTODATE;
+				read_ref((*tail)->name, (*tail)->old_sha1);
+			}
+		}
 		tail = &((*tail)->next);
 	}
 	strbuf_release(&buf);
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 08/19] Fix memory leak in helper method for disconnect
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (6 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 07/19] Allow helpers to report in "list" command that the ref is unchanged Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 09/19] Finally make remote helper support useful Sverre Rabbelier
                   ` (10 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Daniel Barkalow

From: Daniel Barkalow <barkalow@iabervon.org>

Since some cases may need to disconnect from the helper and reconnect,
wrap the function that just disconnects in a function that also frees
transport->data.

Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
---

	Unchanged.

 transport-helper.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/transport-helper.c b/transport-helper.c
index e093d05..36a265d 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -70,6 +70,13 @@ static int disconnect_helper(struct transport *transport)
 	return 0;
 }
 
+static int release_helper(struct transport *transport)
+{
+	disconnect_helper(transport);
+	free(transport->data);
+	return 0;
+}
+
 static int fetch_with_fetch(struct transport *transport,
 			    int nr_heads, struct ref **to_fetch)
 {
@@ -252,6 +259,6 @@ int transport_helper_init(struct transport *transport, const char *name)
 	transport->get_refs_list = get_refs_list;
 	transport->fetch = fetch;
 	transport->update_refs = update_refs;
-	transport->disconnect = disconnect_helper;
+	transport->disconnect = release_helper;
 	return 0;
 }
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 09/19] Finally make remote helper support useful
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (7 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 08/19] Fix memory leak in helper method for disconnect Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 10/19] Allow helpers to request marks for fast-import Sverre Rabbelier
                   ` (9 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johannes Schindelin, Sverre Rabbelier

From: Johannes Schindelin <johannes.schindelin@gmx.de>

The common case for remote helpers will be to import some repository which
can be specified by a single URL.  Rather than supporting those who say
that Git is really complicated, let's be nice to users so that they will
be able to say:

	git clone hg::https://soc.googlecode.com/hg/ soc

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	I decided to not allow 'hg+https' format, since clone does
	not deal well with that syntax (it checks to see whether
	there is a ':' in the url or not).

 transport.c |   10 ++++++++++
 1 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/transport.c b/transport.c
index 741a3a7..e6a00b2 100644
--- a/transport.c
+++ b/transport.c
@@ -818,6 +818,16 @@ struct transport *transport_get(struct remote *remote, const char *url)
 		url = remote->url[0];
 	ret->url = url;
 
+	/* maybe it is a foreign URL? */
+	if (url) {
+		const char *p = url;
+
+		while (isalnum(*p))
+			p++;
+		if (!prefixcmp(p, "::"))
+			remote->foreign_vcs = xstrndup(url, p - url);
+	}
+
 	if (remote && remote->foreign_vcs) {
 		transport_helper_init(ret, remote->foreign_vcs);
 		return ret;
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 10/19] Allow helpers to request marks for fast-import
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (8 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 09/19] Finally make remote helper support useful Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-30  8:21   ` Johan Herland
  2009-10-29 18:01 ` [PATCH 11/19] Basic build infrastructure for Python scripts Sverre Rabbelier
                   ` (8 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johan Herland, Daniel Barkalow

From: Johan Herland <johan@herland.net>

The 'marks' capability is reported by the remote helper if it requires
the fast-import marks database to loaded/saved by any git-fast-import
process that is provided by the transport machinery. The feature is
advertised along with exactly one argument: the location of the file
containing the marks database.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
---

	Unchanged.

 Documentation/git-remote-helpers.txt |    8 ++++++++
 transport-helper.c                   |   15 +++++++++++++++
 2 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/Documentation/git-remote-helpers.txt b/Documentation/git-remote-helpers.txt
index 2c5130f..9a3c5bc 100644
--- a/Documentation/git-remote-helpers.txt
+++ b/Documentation/git-remote-helpers.txt
@@ -67,6 +67,14 @@ CAPABILITIES
 'import'::
 	This helper supports the 'import' command.
 
+'marks' filename::
+	Helper requires the marks from a git-fast-import run to be
+	loaded from, and saved to, the given filename. When this
+	"feature" is advertised, each git-fast-import run must load
+	and save the internal marks database (see the --import-marks
+	and --export-marks option to git-fast-import for more details)
+	located at the given filename.
+
 REF LIST ATTRIBUTES
 -------------------
 
diff --git a/transport-helper.c b/transport-helper.c
index 36a265d..cd3f0df 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -10,6 +10,7 @@ struct helper_data
 {
 	const char *name;
 	struct child_process *helper;
+	char *marks_filename;
 	unsigned fetch : 1;
 	unsigned import : 1;
 };
@@ -51,6 +52,8 @@ static struct child_process *get_helper(struct transport *transport)
 			data->fetch = 1;
 		if (!strcmp(buf.buf, "import"))
 			data->import = 1;
+		if (!prefixcmp(buf.buf, "marks "))
+			data->marks_filename = xstrdup(buf.buf + 6);
 	}
 	return data->helper;
 }
@@ -104,11 +107,21 @@ static int fetch_with_fetch(struct transport *transport,
 static int get_importer(struct transport *transport, struct child_process *fastimport)
 {
 	struct child_process *helper = get_helper(transport);
+	struct helper_data *data = transport->data;
 	memset(fastimport, 0, sizeof(*fastimport));
 	fastimport->in = helper->out;
 	fastimport->argv = xcalloc(5, sizeof(*fastimport->argv));
 	fastimport->argv[0] = "fast-import";
 	fastimport->argv[1] = "--quiet";
+	if (data->marks_filename) {
+		struct strbuf buf = STRBUF_INIT;
+		strbuf_addf(&buf, "--export-marks=%s", data->marks_filename);
+		fastimport->argv[2] = strbuf_detach(&buf, 0);
+		if (!access(data->marks_filename, R_OK)) {
+			strbuf_addf(&buf, "--import-marks=%s", data->marks_filename);
+			fastimport->argv[3] = strbuf_detach(&buf, 0);
+		}
+	}
 
 	fastimport->git_cmd = 1;
 	return start_command(fastimport);
@@ -137,6 +150,8 @@ static int fetch_with_import(struct transport *transport,
 	}
 	disconnect_helper(transport);
 	finish_command(&fastimport);
+	free((char *) fastimport.argv[2]);
+	free((char *) fastimport.argv[3]);
 
 	for (i = 0; i < nr_heads; i++) {
 		posn = to_fetch[i];
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 11/19] Basic build infrastructure for Python scripts
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (9 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 10/19] Allow helpers to request marks for fast-import Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 12/19] 1/2: Add Python support library for CVS remote helper Sverre Rabbelier
                   ` (7 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johan Herland

From: Johan Herland <johan@herland.net>

This patch adds basic boilerplate support (based on corresponding Perl
sections) for enabling the building and installation Python scripts.

There are currently no Python scripts being built, and when Python
scripts are added in future patches, their building and installation
can be disabled by defining NO_PYTHON.

Signed-off-by: Johan Herland <johan@herland.net>
---

	Unchanged.

 Makefile      |   13 +++++++++++++
 configure.ac  |    3 +++
 t/test-lib.sh |    1 +
 3 files changed, 17 insertions(+), 0 deletions(-)

diff --git a/Makefile b/Makefile
index 268aede..b27a7d6 100644
--- a/Makefile
+++ b/Makefile
@@ -164,6 +164,8 @@ all::
 #
 # Define NO_PERL if you do not want Perl scripts or libraries at all.
 #
+# Define NO_PYTHON if you do not want Python scripts or libraries at all.
+#
 # Define NO_TCLTK if you do not want Tcl/Tk GUI.
 #
 # The TCL_PATH variable governs the location of the Tcl interpreter
@@ -308,6 +310,7 @@ LIB_H =
 LIB_OBJS =
 PROGRAMS =
 SCRIPT_PERL =
+SCRIPT_PYTHON =
 SCRIPT_SH =
 TEST_PROGRAMS =
 
@@ -345,6 +348,7 @@ SCRIPT_PERL += git-svn.perl
 
 SCRIPTS = $(patsubst %.sh,%,$(SCRIPT_SH)) \
 	  $(patsubst %.perl,%,$(SCRIPT_PERL)) \
+	  $(patsubst %.py,%,$(SCRIPT_PYTHON)) \
 	  git-instaweb
 
 # Empty...
@@ -398,8 +402,12 @@ endif
 ifndef PERL_PATH
 	PERL_PATH = /usr/bin/perl
 endif
+ifndef PYTHON_PATH
+	PYTHON_PATH = /usr/bin/python
+endif
 
 export PERL_PATH
+export PYTHON_PATH
 
 LIB_FILE=libgit.a
 XDIFF_LIB=xdiff/lib.a
@@ -1308,6 +1316,10 @@ ifeq ($(PERL_PATH),)
 NO_PERL=NoThanks
 endif
 
+ifeq ($(PYTHON_PATH),)
+NO_PYTHON=NoThanks
+endif
+
 QUIET_SUBDIR0  = +$(MAKE) -C # space to separate -C and subdir
 QUIET_SUBDIR1  =
 
@@ -1355,6 +1367,7 @@ prefix_SQ = $(subst ','\'',$(prefix))
 
 SHELL_PATH_SQ = $(subst ','\'',$(SHELL_PATH))
 PERL_PATH_SQ = $(subst ','\'',$(PERL_PATH))
+PYTHON_PATH_SQ = $(subst ','\'',$(PYTHON_PATH))
 TCLTK_PATH_SQ = $(subst ','\'',$(TCLTK_PATH))
 
 LIBS = $(GITLIBS) $(EXTLIBS)
diff --git a/configure.ac b/configure.ac
index b09b8e4..84b6cf4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -233,6 +233,9 @@ GIT_ARG_SET_PATH(shell)
 # Define PERL_PATH to provide path to Perl.
 GIT_ARG_SET_PATH(perl)
 #
+# Define PYTHON_PATH to provide path to Python.
+GIT_ARG_SET_PATH(python)
+#
 # Define ZLIB_PATH to provide path to zlib.
 GIT_ARG_SET_PATH(zlib)
 #
diff --git a/t/test-lib.sh b/t/test-lib.sh
index f2ca536..0b991db 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -729,6 +729,7 @@ case $(uname -s) in
 esac
 
 test -z "$NO_PERL" && test_set_prereq PERL
+test -z "$NO_PYTHON" && test_set_prereq PYTHON
 
 # test whether the filesystem supports symbolic links
 ln -s x y 2>/dev/null && test -h y 2>/dev/null && test_set_prereq SYMLINKS
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 12/19] 1/2: Add Python support library for CVS remote helper
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (10 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 11/19] Basic build infrastructure for Python scripts Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-30  8:33   ` Johan Herland
  2009-10-29 18:01 ` [PATCH 13/19] 2/2: " Sverre Rabbelier
                   ` (6 subsequent siblings)
  18 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johan Herland, David Aguilar, Sverre Rabbelier

From: Johan Herland <johan@herland.net>

This patch introduces parts of a Python package called "git_remote_cvs"
containing the building blocks of the CVS remote helper.
The CVS remote helper itself is NOT part of this patch.

This patch has been improved by the following contributions:
- David Aguilar: Lots of Python coding style fixes

Cc: David Aguilar <davvid@gmail.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	This has my patch to util.py squashed in.

 git_remote_cvs/changeset.py        |  126 +++++
 git_remote_cvs/cvs.py              |  998 ++++++++++++++++++++++++++++++++++++
 git_remote_cvs/cvs_symbol_cache.py |  313 +++++++++++
 git_remote_cvs/util.py             |  194 +++++++
 4 files changed, 1631 insertions(+), 0 deletions(-)
 create mode 100644 git_remote_cvs/changeset.py
 create mode 100644 git_remote_cvs/cvs.py
 create mode 100644 git_remote_cvs/cvs_symbol_cache.py
 create mode 100644 git_remote_cvs/util.py

diff --git a/git_remote_cvs/changeset.py b/git_remote_cvs/changeset.py
new file mode 100644
index 0000000..9eea9d2
--- /dev/null
+++ b/git_remote_cvs/changeset.py
@@ -0,0 +1,126 @@
+#!/usr/bin/env python
+
+"""Code for collecting individual CVS revisions into "changesets"
+
+A changeset is a collection of CVSRev objects that belong together in
+the same "commit".  This is a somewhat artificial construct on top of
+CVS, which only stores changes at the per-file level.  Normally, CVS
+users create several CVS revisions simultaneously by applying the
+"cvs commit" command to several files with related changes.  This
+module tries to reconstruct this notion of related revisions.
+
+"""
+
+from git_remote_cvs.util import debug, error, die
+
+
+class Changeset(object):
+
+    """Encapsulate a single changeset/commit."""
+
+    __slots__ = ('revs', 'date', 'author', 'message')
+
+    # The maximum time between the changeset's date, and the date of a
+    # rev to included in that changeset.
+    MaxSecondsBetweenRevs = 8 * 60 * 60  # 8 hours
+
+    @classmethod
+    def from_rev (cls, rev):
+        """Return a Changeset based on the given CVSRev object."""
+        c = cls(rev.date, rev.author, rev.message)
+        result = c.add(rev)
+        assert result
+        return c
+
+    def __init__ (self, date, author, message):
+        """Create a new Changeset with the given metadata."""
+        self.revs = {}  # dict: path -> CVSRev object
+        self.date = date  # CVSDate object
+        self.author = author
+        self.message = message  # Lines of commit message
+
+    def __str__ (self):
+        """Stringify this Changeset object."""
+        msg = self.message[0]  # First line only
+        # Limit message to 25 chars
+        if len(msg) > 25:
+            msg = msg[:22] + "..."
+        return ("<Changeset @(%s) by %s (%s) updating %i files>" %
+                (self.date, self.author, msg, len(self.revs)))
+
+    def __iter__ (self):
+        """Return iterator traversing the CVSRevs in this Changeset."""
+        return self.revs.itervalues()
+
+    def __getitem__ (self, path):
+        """Look up a specific CVSRev in this Changeset."""
+        return self.revs[path]
+
+    def within_time_window (self, rev):
+        """Return True iff the rev is within the time window of self."""
+        return abs(rev.date.diff(self.date)) <= self.MaxSecondsBetweenRevs
+
+    def add (self, rev):
+        """Add the given CVSRev to this Changeset.
+
+        The addition will only succeed if the following holds:
+          - rev.author == self.author
+          - rev.message == self.message
+          - rev.path is not in self.revs
+          - rev.date is within MaxSecondsBetweenRevs of self.date
+        If the addition succeeds, True is returned; otherwise False.
+
+        """
+        if rev.author != self.author or \
+           rev.message != self.message or \
+           rev.path in self.revs or \
+           not self.within_time_window(rev):
+            return False
+
+        self.revs[rev.path] = rev
+        return True
+
+
+def build_changesets_from_revs (cvs_revs):
+    """Organize CVSRev objects into a chronological list of Changesets."""
+    # Construct chronological list of CVSRev objects
+    chron_revs = []
+    for path, d in cvs_revs.iteritems():
+        i = 0  # Current index into chronRevs
+        for revnum, cvsrev in sorted(d.iteritems()):
+            assert path == cvsrev.path
+            assert revnum == cvsrev.num
+            while i < len(chron_revs) and cvsrev.date > chron_revs[i].date:
+                i += 1
+            # Insert cvsRev at position i in chronRevs
+            chron_revs.insert(i, cvsrev)
+            i += 1
+
+    changesets = []  # Chronological list of Changeset objects
+    while len(chron_revs):
+        # There are still more revs to be added to Changesets
+        # Create Changeset based on the first rev in chronRevs
+        changeset = Changeset.from_rev(chron_revs.pop(0))
+        # Keep adding revs chronologically until MaxSecondsBetweenRevs
+        rejects = []  # Revs that cannot be added to this changeset
+        while len(chron_revs):
+            rev = chron_revs.pop(0)
+            reject = False
+            # First, if we have one of rev's parents in rejects, we
+            # must also reject rev
+            for r in rejects:
+                if r.path == rev.path:
+                    reject = True
+                    break
+            # Next, add rev to changeset, reject if add fails
+            if not reject:
+                reject = not changeset.add(rev)
+            if reject:
+                rejects.append(rev)
+                # stop trying when rev is too far in the future
+                if not changeset.within_time_window(rev):
+                    break
+        chron_revs = rejects + chron_revs  # Reconstruct remaining revs
+        changesets.append(changeset)
+
+    return changesets
diff --git a/git_remote_cvs/cvs.py b/git_remote_cvs/cvs.py
new file mode 100644
index 0000000..f870ae0
--- /dev/null
+++ b/git_remote_cvs/cvs.py
@@ -0,0 +1,998 @@
+#!/usr/bin/env python
+
+"""Functionality for interacting with CVS repositories.
+
+This module provides classes for interrogating a CVS repository via a
+CVS working directory (aka. checkout), or via direct queries using the
+"cvs rlog" command.
+
+Also, classes for encapsulating fundamental CVS concepts (like CVS
+revision/branch numbers) are provided.
+"""
+
+import sys
+import os
+import shutil
+import time
+from calendar import timegm
+import unittest
+
+from git_remote_cvs.util import (debug, error, die, ProgressIndicator,
+                                 start_command, run_command,
+                                 file_reader_method, file_writer_method)
+
+
+class CVSNum(object):
+
+    """Encapsulate a single CVS revision/branch number.
+
+    Provides functionality for common operations on CVS numbers.
+
+    A CVS number consists of a list of components separated by periods
+    ('.'), where each component is a decimal number.  Inspecting the
+    components from left to right, the odd-numbered (1st, 3rd, 5th,
+    etc.) components represent branches in the CVS history tree, while
+    the even-numbered (2nd, 4th, 6th, etc.) components represent
+    revisions on the branch specified in the previous position.
+    Thus "1.2" denotes the second revision on the first branch
+    (aka. trunk), while "1.2.4.6" denotes the sixth revision of the
+    fourth branch started from revision "1.2".
+
+    Therefore, in general, a CVS number with an even number of
+    components denotes a revision (we call this a "revision number"),
+    while an odd number of components denotes a branch (called a
+    "branch number").
+
+    There are a few complicating peculiarities: If there is an even
+    number of components, and the second-last component is 0, the
+    number is not a revision number, but is rather equivalent to the
+    branch number we get by removing the 0-component.  I.e. "1.2.0.4"
+    is equivalent to "1.2.4".
+
+    A branch number (except the trunk: "1") always has a "branch point"
+    revision, i.e. the revision from which the branch was started.
+    This revision is found by removing the last component of the branch
+    number.  For example the branch point of "1.2.4" is "1.2".
+
+    Conversely, all revision numbers belong to a corresponding branch,
+    whose branch number is found by removing the last component.
+    Examples: The "1.2.4.6" revision belong to the "1.2.4" branch,
+    while the "1.2" revision belongs to the "1" branch (the "trunk").
+
+    From this we can programatically determine the ancestry of any
+    revision number, by decrementing the last revision component until
+    it equals 1, and then trim off the last two components to get to
+    the branch point, and repeat the process from there until we reach
+    the initial revision (typically "1.1").  For example, recursively
+    enumerating the parent revisions of "1.2.4.6" yields the following
+    revisions:
+    "1.2.4.5", "1.2.4.4", "1.2.4.3", "1.2.4.2", "1.2.4.1", "1.2", "1.1"
+
+    """
+
+    __slots__ = ('c',)
+
+    @staticmethod
+    def decompose (cvsnum):
+        """Split the given CVS number into a list of int components.
+
+        Branch numbers are normalized to the odd-numbered components
+        form (i.e. removing the second last '0' component)
+
+        Examples:
+          '1.2.4.8' -> [1, 2, 4, 8]
+          '1.2.3'   -> [1, 2, 3]
+          '1.2.0.5' -> [1, 2, 5]
+
+        """
+        if cvsnum:
+            r = map(int, cvsnum.split('.'))
+        else:
+            r = []
+        if len(r) >= 2 and r[-2] == 0:
+            del r[-2]
+        if r[-1] == 0:
+            raise ValueError(cvsnum)
+        return tuple(r)
+
+    @staticmethod
+    def compose (c):
+        """Join the given list of integer components into a CVS number.
+
+        E.g.: (1, 2, 4, 8) -> '1.2.4.8'
+
+        """
+        if c[-1] == 0:
+            raise ValueError(str(c))
+        return ".".join(map(str, c))
+
+    @classmethod
+    def from_components (cls, args):
+        """Create a CVSNum from the given list of numerical components."""
+        return cls(cls.compose(args))
+
+    @classmethod
+    def disjoint (cls, a, b):
+        """Return True iff the CVS numbers are historically disjoint.
+
+        Two CVS numbers are disjoint if they do not share the same
+        historical line back to the initial revision.  In other words:
+        the two numbers are disjoint if the history (i.e. set of parent
+        revisions all the way back to the intial (1.1) revision) of
+        neither number is a superset of the other's history.
+        See test_disjoint() for practical examples:
+
+        """
+        if a.is_branch():
+            a = cls.from_components(a.c + (1,))
+        if b.is_branch():
+            b = cls.from_components(b.c + (1,))
+        if len(a.c) > len(b.c):
+            a, b = b, a  # a is now shortest
+        pairs = zip(a.c, b.c)
+        for pa, pb in pairs[:-1]:
+            if pa != pb:
+                return True
+        if len(a) == len(b):
+            return False
+        common_len = len(a)
+        if a.c[common_len - 1] <= b.c[common_len - 1]:
+            return False
+        return True
+
+
+    def __init__ (self, cvsnum):
+        """Create a CVSNum object from the given CVS number string."""
+        self.c = self.decompose(str(cvsnum))
+
+    def __repr__ (self):
+        """Return a string representation of this object."""
+        return self.compose(self.c)
+
+    def __str__ (self):
+        """Return a string representation of this object."""
+        return repr(self)
+
+    def __hash__ (self):
+        """Create a hash value for this CVS number."""
+        return hash(repr(self))
+
+    def __len__ (self):
+        """Return number of components in this CVS number."""
+        return len(self.c)
+
+    def __cmp__ (self, other):
+        """Comparison method for CVS numbers."""
+        try:
+            return cmp(self.c, other.c)
+        except AttributeError:
+            return 1
+
+    def __getitem__ (self, key):
+        """Return the Xth component of this CVS number."""
+        return self.c[key]
+
+    def is_rev (self):
+        """Return True iff this number is a CVS revision number."""
+        return len(self.c) % 2 == 0 and len(self.c) >= 2 and self.c[-2] != 0
+
+    def is_branch (self):
+        """Return True iff this number is a CVS branch number."""
+        return len(self.c) % 2 != 0 or (len(self.c) >= 2 and self.c[-2] == 0)
+
+    def components (self):
+        """Return a list of integer components in this CVS number."""
+        return list(self.c)
+
+    def branch (self):
+        """Return the branch on which the given number lives.
+
+        Revisions: chop the last component to find the branch, e.g.:
+            1.2.4.6 -> 1.2.4
+            1.1 -> 1
+        Branches: unchanged
+
+        """
+        if self.is_rev():
+            return self.from_components(self.c[:-1])
+        return self
+
+    def parent (self):
+        """Return the parent/previous revision number to this number.
+
+        For revisions, this is the previous revision, e.g.:
+            1.2.4.6 -> 1.2.4.5
+            1.2.4.1 -> 1.2
+            1.1 -> None
+            2.1 -> None
+        For branches, this is the branch point, e.g.:
+            1.2.4 -> 1.2
+            1 -> None
+            2 -> None
+
+        """
+        if len(self.c) < 2:
+            return None
+        elif len(self.c) % 2:  # Branch number
+            return self.from_components(self.c[:-1])
+        else:  # Revision number
+            assert self.c[-1] > 0
+            result = self.components()
+            result[-1] -= 1  # Decrement final component
+            if result[-1] == 0:  # We're at the start of the branch
+                del result[-2:]  # Make into branch point
+                if not result:
+                    return None
+            return self.from_components(result)
+
+    def follows (self, other):
+        """Return True iff self historically follows the given rev.
+
+        This iterates through the parents of self, and returns True iff
+        any of them equals the given rev.  Otherwise, it returns False.
+
+        """
+        assert other.is_rev()
+        cur = self
+        while cur:
+            if cur == other:
+                return True
+            cur = cur.parent()
+        return False
+
+    def on_branch (self, branch):
+        """Return True iff this rev is on the given branch.
+
+        The revs considered to be "on" a branch X also includes the
+        branch point of branch X.
+
+        """
+        return branch == self.branch() or branch.parent() == self
+
+
+class TestCVSNum(unittest.TestCase):
+
+    """CVSNum selftests."""
+
+    def test_basic (self):
+        """CVSNum basic selftests."""
+        self.assertEqual(CVSNum("1.2.4"), CVSNum("1.2.0.4"))
+        self.assert_(CVSNum("1.2.4").is_branch())
+        self.assert_(CVSNum("1.2").is_rev())
+        self.assert_(CVSNum("1").is_branch())
+        self.assert_(CVSNum("1.2.4.6").is_rev())
+        self.assertEqual(CVSNum("1.2.4.6").components(), [1, 2, 4, 6])
+        self.assertEqual(CVSNum.from_components([1, 2, 4, 6]),
+                         CVSNum("1.2.4.6"))
+        self.assertEqual(str(CVSNum.from_components([1, 2, 4, 6])), "1.2.4.6")
+        self.assertEqual(len(CVSNum("1.2.4.6")), 4)
+        self.assertEqual(CVSNum("1.2.4.6").branch(), CVSNum("1.2.4"))
+        self.assertEqual(CVSNum("1.2.4").branch(), CVSNum("1.2.4"))
+        self.assertEqual(CVSNum("1.1").branch(), CVSNum("1"))
+        self.assertEqual(CVSNum("1").branch(), CVSNum("1"))
+        self.assertEqual(CVSNum("1.2.4.6").parent(), CVSNum("1.2.4.5"))
+        self.assertEqual(CVSNum("1.2.4.1").parent(), CVSNum("1.2"))
+        self.assertEqual(CVSNum("1.2").parent(), CVSNum("1.1"))
+        self.assert_(CVSNum("1.1").parent() is None)
+        self.assert_(CVSNum("2.1").parent() is None)
+        self.assertEqual(CVSNum("1.2.4").parent(), CVSNum("1.2"))
+        self.assert_(CVSNum("1").parent() is None)
+        self.assert_(CVSNum("2").parent() is None)
+
+    def test_follows (self):
+        """CVSNum.follows() selftests."""
+        self.assert_(CVSNum("1.2.4.6").follows(CVSNum("1.1")))
+        self.assert_(CVSNum("1.2.4.6").follows(CVSNum("1.2")))
+        self.assert_(CVSNum("1.2.4.6").follows(CVSNum("1.2.4.1")))
+        self.assert_(CVSNum("1.2.4.6").follows(CVSNum("1.2.4.2")))
+        self.assert_(CVSNum("1.2.4.6").follows(CVSNum("1.2.4.3")))
+        self.assert_(CVSNum("1.2.4.6").follows(CVSNum("1.2.4.4")))
+        self.assert_(CVSNum("1.2.4.6").follows(CVSNum("1.2.4.5")))
+        self.assert_(CVSNum("1.2.4.6").follows(CVSNum("1.2.4.6")))
+        self.assertFalse(CVSNum("1.2.4.6").follows(CVSNum("1.2.4.7")))
+        self.assertFalse(CVSNum("1.2.4.6").follows(CVSNum("1.3")))
+        self.assertFalse(CVSNum("1.1").follows(CVSNum("1.2.4.6")))
+
+    def test_disjoint (self):
+        """CVSNum.disjoint() selftests."""
+        tests = [
+            ("1.2", "1.1", False),
+            ("1.2", "1.2", False),
+            ("1.2", "1.3", False),
+            ("1.2", "1.1.2", True),
+            ("1.2", "1.1.2.3", True),
+            ("1.2.4", "1.1", False),
+            ("1.2.4", "1.2", False),
+            ("1.2.4", "1.3", True),
+            ("1.2.4", "1.2.2", True),
+            ("1.2.4", "1.2.4", False),
+            ("1.2.4", "1.2.6", True),
+            ("1.2.4", "1.2.2.4", True),
+            ("1.2.4", "1.2.4.4", False),
+            ("1.2.4", "1.2.6.4", True),
+            ("1.2.4.6", "1.1", False),
+            ("1.2.4.6", "1.2", False),
+            ("1.2.4.6", "1.3", True),
+            ("1.2.4.6", "1.2.2", True),
+            ("1.2.4.6", "1.2.2.1", True),
+            ("1.2.4.6", "1.2.4", False),
+            ("1.2.4.6", "1.2.4.5", False),
+            ("1.2.4.6", "1.2.4.6", False),
+            ("1.2.4.6", "1.2.4.7", False),
+            ("1.2.4.6.8.10", "1.2.4.5", False),
+            ("1.2.4.6.8.10", "1.2.4.6", False),
+            ("1.2.4.6.8.10", "1.2.4.7", True),
+        ]
+        for a, b, result in tests:
+            self.assertEqual(CVSNum.disjoint(CVSNum(a), CVSNum(b)), result)
+            self.assertEqual(CVSNum.disjoint(CVSNum(b), CVSNum(a)), result)
+
+
+class CVSState(object):
+
+    """Encapsulate a historical state in CVS (a set of paths and nums).
+
+    This class is a container of CVS pathnames and associated CVSNum
+    objects.
+
+    No communication with a CVS working directory or repository is done
+    in this class, hence only basic sanity checks are performed:
+      - A path may only appear once in a CVSState.
+      - When adding a path:num pair, path may not already exist in self
+      - When replacing a path:num pair, path must already exist in self
+      - When removing a path:num pair, both path and num must be given
+
+    IMPORTANT: Objects of this class are hash()able (to support being
+    used as keys in a dict), but they are also mutable.  It is
+    therefore up to the caller to make sure that the object is not
+    changed after being stored in a data structure indexed by its hash
+    value.
+
+    """
+
+    __slots__ = ('revs', '_hash')
+
+    def __init__ (self):
+        """Create a new, empty CVSState."""
+        self.revs = {}  # dict: path -> CVSNum object
+        self._hash = None
+
+    def __iter__ (self):
+        """Return iterator traversing the (path, CVSNum)s in this CVSState."""
+        return self.revs.iteritems()
+
+    def __cmp__ (self, other):
+        """Comparison method for CVSState objects."""
+        return cmp(self.revs, other.revs)
+
+    def __str__ (self):
+        """Stringify this CVSState by listing the contained revisions."""
+        return "".join(["%s:%s\n" % (p, n) for p, n in sorted(self)])
+
+    def __hash__ (self):
+        """Create a hash value for this CVSState."""
+        if self._hash is None:
+            self._hash = hash(str(self))
+        return self._hash
+
+    def __getitem__ (self, path):
+        """Return the CVSNum associated with the given path in self."""
+        return self.revs[path]
+
+    def get (self, path, default = None):
+        """Return the CVSNum associated with the given path in self."""
+        return self.revs.get(path, default)
+
+    def paths (self):
+        """Return the path names contained within this CVSState."""
+        return self.revs.iterkeys()
+
+    def add (self, path, revnum):
+        """Add the given path:revnum to this CVSState."""
+        assert path not in self.revs
+        self._hash = None
+        self.revs[path] = revnum
+
+    def replace (self, path, revnum):
+        """Replace the revnum associated with the given path."""
+        assert path in self.revs
+        self._hash = None
+        self.revs[path] = revnum
+
+    def remove (self, path, revnum):
+        """Remove the given path:revnum association from this CVSState."""
+        assert path in self.revs and self.revs[path] == revnum
+        self._hash = None
+        del self.revs[path]
+
+    def copy (self):
+        """Create and return a copy of this object."""
+        ret = CVSState()
+        ret.revs = self.revs.copy()
+        ret._hash = self._hash
+        return ret
+
+    def load_data (self, note_data):
+        """Load note data as formatted by self.__str__()."""
+        for line in note_data.split("\n"):
+            line = line.strip()
+            if not line:
+                continue
+            path, num = line.rsplit(':', 1)
+            self.add(path, CVSNum(num))
+        self._hash = hash(note_data)
+
+    def print_members (self, f = sys.stdout, prefix = ""):
+        """Write the members of this CVSState to the given file object."""
+        for path, num in sorted(self):
+            print >> f, "%s%s:%s" % (prefix, path, num)
+
+    @file_reader_method(missing_ok = True)
+    def load (self, f):
+        """Load CVS state from the given file name/object."""
+        if f:
+            self.load_data(f.read())
+
+    @file_writer_method
+    def save (self, f):
+        """Save CVS state to the given file name/object."""
+        assert f
+        print >> f, str(self),
+
+
+class CVSDate(object):
+
+    """Encapsulate a timestamp, as reported by CVS.
+
+    The internal representation of a timestamp is two integers, the
+    first representing the timestamp as #seconds since epoch (UTC),
+    and the second representing the timezone as #minutes offset from
+    UTC.
+
+    Example: "2007-09-05 17:26:28 -0200" is converted to
+             (1189013188, -120)
+
+    """
+
+    __slots__ = ('ts', 'tz')
+
+    def __init__ (self, date_str = None, in_utc = False):
+        """Convert CVS date string into a CVSDate object.
+
+        A CVS timestamp string has one of the following forms:
+          - "YYYY-MM-DD hh:mm:ss SZZZZ"
+          - "YYYY/MM/DD hh:mm:ss" (with timezone assumed to be UTC)
+        The in_utc parameter determines whether the timestamp part of
+        the given string (the "YYYY-MM-DD hh:mm:ss" part) is given in
+        local time or UTC (normally CVS dates are given in local time.
+        If given in local time, the timezone offset is subtracted from
+        the timestamp in order to make the time in UTC format.
+
+        """
+        if date_str is None:
+            self.ts, self.tz = 0, 0
+            return
+        if date_str == "now":
+            self.ts, self.tz = time.time(), 0
+            return
+        date_str = date_str.strip()
+        # Set up self.ts and self.tz
+        if date_str.count(" ") == 2:
+            # Assume format "YYYY-MM-DD hh:mm:ss SZZZZ"
+            t, z = date_str.rsplit(" ", 1)
+            # Convert timestamp to #secs since epoch (UTC)
+            self.ts = timegm(time.strptime(t, "%Y-%m-%d %H:%M:%S"))
+            # Convert timezone into #mins offset from UTC
+            self.tz = int(z[1:3]) * 60 + int(z[3:5])
+            # Incorporate timezone sign
+            if z[0] == '-':
+                self.tz *= -1
+        else:
+            assert date_str.count(" ") == 1
+            # Assume format "YYYY/MM/DD hh:mm:ss"
+            self.ts = timegm(time.strptime(date_str, "%Y/%m/%d %H:%M:%S"))
+            self.tz = 0
+        # Adjust timestamp if not already in UTC
+        if not in_utc:
+            self.ts -= self.tz * 60
+
+    def tz_str (self):
+        """Return timezone part of self in string format."""
+        sign = '+'
+        if self.tz < 0:
+            sign = '-'
+        hours, minutes = divmod(abs(self.tz), 60)
+        return "%s%02d%02d" % (sign, hours, minutes)
+
+    def __str__ (self):
+        """Reconstruct date string from members."""
+        s = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(self.ts))
+        return "%s %s" % (s, self.tz_str())
+
+    def __repr__ (self):
+        """Create a string representation of self."""
+        return "CVSDate('%s')" % (str(self))
+
+    def __hash__ (self):
+        """Create a hash value from self."""
+        return hash((self.ts, self.tz))
+
+    def __nonzero__ (self):
+        """Provide interpretation of self in a boolean context."""
+        return bool(self.ts or self.tz)
+
+    def __cmp__ (self, other):
+        """Comparison method for CVSDate objects."""
+        return cmp(self.ts, other.ts) or cmp(self.tz, other.tz)
+
+    def __eq__ (self, other):
+        """Return True iff self and other is considered equal."""
+        return self.ts == other.ts and self.tz == other.tz
+
+    def diff (self, other):
+        """Return difference between self and other in #seconds.
+
+        Invariant: self == other.add(self.diff(other))
+
+        """
+        return self.ts - other.ts
+
+
+class TestCVSDate(unittest.TestCase):
+
+    """CVSDate selftests."""
+
+    def test_basic (self):
+        """CVSDate basic selftests."""
+        a = CVSDate("2009-05-10 14:34:56 +0200")
+        b = CVSDate("2009/05/10 12:34:56")
+        self.assert_(a)
+        self.assert_(b)
+        self.assertEqual(str(a), "2009-05-10 12:34:56 +0200", str(a))
+        self.assertEqual(str(b), "2009-05-10 12:34:56 +0000", str(b))
+        self.assertNotEqual(a, b)
+        self.assertEqual(a.diff(b), 0)
+        c = CVSDate("2009-05-10 16:34:56 +0200")
+        self.assert_(c)
+        self.assertEqual(str(c), "2009-05-10 14:34:56 +0200", str(c))
+        self.assertNotEqual(c, a)
+        self.assertEqual(c.diff(a), 2 * 60 * 60)
+        self.assertEqual(a.diff(c), -2 * 60 * 60)
+
+
+class CVSRev(object):
+
+    """Encapsulate metadata on a CVS revision."""
+
+    __slots__ = ('path', 'num', 'date', 'author', 'deleted', 'message')
+
+    def __init__ (self, path, num):
+        """Create a CVSRev object for the given path:num revision."""
+        self.path = path
+        self.num = num
+        self.date = None  # CVSDate object
+        self.author = ""
+        self.deleted = None  # True or False
+        self.message = []  # Lines of commit message
+
+    def __str__ (self):
+        """Return a string listing the metadata in this CVS revision."""
+        return ("<%s:%s on %s by %s%s>" %
+                (self.path, self.num, self.date, self.author,
+                 self.deleted and ", deleted" or ""))
+
+    def __cmp__ (self, other):
+        """Comparison method for CVSRev objects."""
+        return cmp(self.path, other.path) or cmp(self.num, other.num)
+
+
+class CVSWorkDir(object):
+
+    """Encapsulate a CVS working directory.
+
+    This class auto-creates a CVS workdir/checkout in the directory
+    given to the constructor, and provides various methods for
+    interacting with this workdir.
+
+    """
+
+    def __init__ (self, workdir, cvs_repo):
+        """Create a new CVSWorkDir.
+
+        The cvs_repo argument must be a (cvs_root, cvs_module) tuple
+
+        """
+        self.d = workdir
+        self.cvs_root, self.cvs_module = cvs_repo
+        parent_dir = os.path.dirname(self.d)
+        if not os.path.isdir(parent_dir):
+            os.makedirs(parent_dir)
+        self._valid = None
+
+    def makepath(self, *args):
+        """Create path relative to working directory."""
+        return os.path.join(self.d, *args)
+
+    def valid (self):
+        """Return True iff this workdir is present and valid."""
+        if self._valid is not None:
+            return self._valid
+        try:
+            f = open(self.makepath("CVS", "Root"), 'r')
+            assert f.read().strip() == self.cvs_root
+            f.close()
+            f = open(self.makepath("CVS", "Repository"), 'r')
+            assert f.read().strip() == self.cvs_module
+            f.close()
+            self._valid = True
+        except (IOError, AssertionError):
+            self._valid = False
+        return self._valid
+
+    def remove (self):
+        """Remove this checkout."""
+        shutil.rmtree(self.d, True)
+        assert not os.path.exists(self.d)
+        self._valid = False
+
+    def checkout (self, revision = "HEAD"):
+        """Create a checkout of the given revision."""
+        self.remove()
+        parent_dir, co_dir = os.path.split(self.d)
+        args = ["cvs", "-f", "-Q", "-d", self.cvs_root, "checkout"]
+        if str(revision) != "HEAD":
+            args.extend(["-r", str(revision)])
+        args.extend(["-d", co_dir, self.cvs_module])
+        exit_code, output, errors = run_command(args, cwd = parent_dir)
+        if exit_code:
+            die("Failed to checkout CVS working directory")
+        assert not errors
+        assert not output, "output = '%s'" % (output)
+        self._valid = None
+        assert self.valid()
+
+    def update (self, revision = "HEAD", paths = None):
+        """Update the given paths to the given revision."""
+        if not self.valid():
+            self.checkout()
+        args = ["cvs", "-f", "-Q", "update", "-kk"]
+        if str(revision) == "HEAD":
+            args.append("-A")
+        else:
+            args.extend(["-r", str(revision)])
+        if paths is not None:
+            args.extend(paths)
+        exit_code, output, errors = run_command(args, cwd = self.d)
+        if exit_code:
+            die("Failed to checkout CVS working directory")
+        assert not errors
+        assert not output, "output = '%s'" % (output)
+
+    def get_revision_data (self, path, revision):
+        """Return the contents of the given CVS path:revision."""
+        if not self.valid():
+            self.checkout()
+        args = ["cvs", "-f", "-Q", "update", "-p", "-kk"]
+        if str(revision) == "HEAD":
+            args.append("-A")
+        else:
+            args.extend(["-r", str(revision)])
+        args.append(path)
+        exit_code, output, errors = run_command(args, cwd = self.d)
+        if exit_code:
+            die("Failed to checkout CVS working directory")
+        assert not errors
+        return output
+
+    def get_modeinfo (self, paths = None):
+        """Return mode information for the given paths.
+
+        Returns a dict of path -> mode number mappings.  If paths are
+        not specified, mode information for all files in the current
+        checkout will be returned.  No checkout/update will be done.
+
+        """
+        result = {}
+        if paths is not None:
+            for path in paths:
+                fullpath = os.path.join(self.d, path)
+                mode = 644
+                if os.access(fullpath, os.X_OK):
+                    mode = 755
+                assert path not in result
+                result[path] = mode
+        else:  # Return mode information for all paths
+            for dirpath, dirnames, filenames in os.walk(self.d):
+                # Don't descend into CVS subdirs
+                try:
+                    dirnames.remove('CVS')
+                except ValueError:
+                    pass
+                assert dirpath.startswith(self.d)
+                directory = dirpath[len(self.d):].lstrip("/")
+                for fname in filenames:
+                    path = os.path.join(directory, fname)
+                    fullpath = os.path.join(dirpath, fname)
+                    mode = 644
+                    if os.access(fullpath, os.X_OK):
+                        mode = 755
+                    assert path not in result
+                    result[path] = mode
+        return result
+
+    @classmethod
+    def parse_entries (cls, entries, prefix, directory = ""):
+        """Recursively parse CVS/Entries files.
+
+        Return a dict of CVS paths found by parsing the CVS/Entries
+        files rooted at the given directory.
+
+        See http://ximbiot.com/cvs/manual/feature/cvs_2.html#SEC19 for
+        information on the format of the CVS/Entries file.
+
+        """
+        fname = os.path.join(prefix, directory, "CVS", "Entries")
+        subdirs = []
+        f = open(fname, 'r')
+        for line in f:
+            line = line.strip()
+            if line == "D":
+                continue  # There are no subdirectories
+            t, path, revnum, date, options, tag = line.split("/")
+            if t == "D":
+                subdirs.append(path)
+                continue
+            assert line.startswith("/")
+            path = os.path.join(directory, path)
+            revnum = CVSNum(revnum)
+            assert path not in entries
+            entries[path] = (revnum, date, options, tag)
+        f.close()
+        for d in subdirs:
+            d = os.path.join(directory, d)
+            cls.parse_entries(entries, prefix, d)
+
+    def get_state (self):
+        """Return CVSState reflecting current state of this checkout.
+
+        Note that the resulting CVSState will never contain any
+        deleted/dead files.  Other CVSStates to be compared to the one
+        returned from here should remove deleted/dead entries first.
+
+        """
+        assert self.valid()
+        entries = {}
+        result = CVSState()
+        self.parse_entries(entries, self.d)
+        for path, info in entries.iteritems():
+            result.add(path, info[0])
+        return result
+
+
+class CVSLogParser(object):
+
+    """Encapsulate the execution of a "cvs rlog" command."""
+
+    def __init__ (self, cvs_repo):
+        """Create a new CVSLogParser.
+
+        The cvs_repo argument must be a (cvs_root, cvs_module) tuple
+
+        """
+        self.cvs_root, self.cvs_module = cvs_repo
+
+    def cleanup_path (self, cvs_path):
+        """Utility method for parsing CVS paths from CVS log."""
+        cvsprefix = "/".join((self.cvs_root[self.cvs_root.index("/"):],
+                              self.cvs_module))
+        assert cvs_path.startswith(cvsprefix)
+        assert cvs_path.endswith(",v")
+        # Drop cvsprefix and ,v-extension
+        cvs_path = cvs_path[len(cvsprefix):-2]
+        # Split the remaining path into components
+        path_comps = filter(None, cvs_path.strip().split('/'))
+        # Remove 'Attic' from CVS paths
+        if len(path_comps) >= 2 and path_comps[-2] == "Attic":
+            del path_comps[-2]
+        # Reconstruct resulting "cleaned" path
+        return "/".join(path_comps)
+
+    def __call__ (self, line):
+        """Parse the given line from the CVS log.
+
+        Must be reimplemented by subclass
+
+        """
+        pass
+
+    def finish (self):
+        """This method is invoked after the last line has been parsed.
+
+        May be reimplemented by subclass
+
+        """
+        pass
+
+    def run (self, paths = None, no_symbols = False, revisions = None):
+        """Execute "cvs rlog" with the given arguments.
+
+        self.__call__() is invoked once for each line in the CVS log.
+        self.finish() is invoked exactly once after the CVS log.
+
+        """
+        args = ["cvs", "-f", "-q", "-d", self.cvs_root, "rlog"]
+        if no_symbols:
+            args.append("-N")
+        if revisions:
+            args.append("-r%s" % (revisions))
+        if paths is not None:
+            for p in paths:
+                args.append("%s/%s" % (self.cvs_module, p))
+        else:
+            args.append(self.cvs_module)
+        proc = start_command(args)
+        proc.stdin.close()
+        while True:
+            for line in proc.stdout:
+                self(line.rstrip())  # Call self's line parser
+            if proc.poll() is not None:
+                break
+        assert proc.stdout.read() == ""
+        self.finish()  # Notify subclass that parsing is finished
+        exit_code = proc.returncode
+        if exit_code:
+            error("'%s' returned exit code %i, and errors:\n---\n%s---",
+                  " ".join(args), exit_code, proc.stderr.read())
+        return exit_code
+
+
+class CVSRevLister(CVSLogParser):
+
+    """Extract CVSRev objects (with revision metadata) from a CVS log."""
+
+    def __init__ (self, cvs_repo, show_progress = False):
+        """Create a new CVSRevLister.
+
+        The cvs_repo argument must be a (cvs_root, cvs_module) tuple
+        show_progress determines whether progress indication is shown.
+
+        """
+        super(CVSRevLister, self).__init__(cvs_repo)
+        self.cur_file = None  # Current CVS file being processed
+        self.cur_file_numrevs = 0  # #revs in current CVS file
+        self.cur_rev = None  # Current CVSRev under construction
+        self.progress = None
+        if show_progress:
+            self.progress = ProgressIndicator("\t", sys.stderr)
+        # Store found revs in a two-level dict structure:
+        # filename -> revnum -> CVSRev
+        self.revs = {}
+        # Possible states:
+        # - BeforeRevs  - waiting for "total revisions:"
+        # - BetweenRevs - waiting for "----------------------------"
+        # - ReadingRev  - reading CVS revision details
+        self.state = 'BeforeRevs'
+
+    def __call__ (self, line):
+        """Line parser; this method is invoked for each line in the log."""
+        if self.state == 'BeforeRevs':
+            if line.startswith("RCS file: "):
+                self.cur_file = self.cleanup_path(line[10:])
+                assert self.cur_file not in self.revs
+                self.revs[self.cur_file] = {}
+            elif line.startswith("total revisions: "):
+                assert self.cur_file
+                totalrevs_unused, selectedrevs = line.split(";")
+                self.cur_file_numrevs = int(selectedrevs.split(":")[1].strip())
+                self.state = 'BetweenRevs'
+        elif self.state == 'BetweenRevs':
+            if (line == "----------------------------" or
+                line == "======================================"
+                        "======================================="):
+                if self.cur_rev:
+                    # Finished current revision
+                    f = self.revs[self.cur_file]
+                    assert self.cur_rev.num not in f
+                    f[self.cur_rev.num] = self.cur_rev
+                    self.cur_rev = None
+                    if self.progress:
+                        self.progress()
+                if line == "----------------------------":
+                    self.state = 'ReadingRev'
+                else:
+                    # Finalize current CVS file
+                    assert len(self.revs[self.cur_file]) == \
+                           self.cur_file_numrevs
+                    self.cur_file = None
+                    self.state = 'BeforeRevs'
+            elif self.cur_rev:
+                # Currently in the middle of a revision.
+                if line.startswith("branches:  %s" % (self.cur_rev.num)) and \
+                   line.endswith(";"):
+                    return  # Skip 'branches:' lines
+                # This line is part of the commit message.
+                self.cur_rev.message.append(line)
+        elif self.state == 'ReadingRev':
+            if line.startswith("revision "):
+                self.cur_rev = CVSRev(self.cur_file, CVSNum(line.split()[1]))
+            else:
+                date, author, state, dummy = line.split(";", 3)
+                assert date.startswith("date: ")
+                self.cur_rev.date = CVSDate(date[6:])
+                assert author.strip().startswith("author: ")
+                self.cur_rev.author = author.strip()[8:]
+                assert state.strip().startswith("state: ")
+                state = state.strip()[7:]
+                self.cur_rev.deleted = state == "dead"
+                self.state = 'BetweenRevs'
+
+    def finish (self):
+        """This method is invoked after the last line has been parsed."""
+        assert self.state == 'BeforeRevs'
+        if self.progress:
+            self.progress.finish("Parsed %i revs in %i files" %
+                                 (self.progress.n, len(self.revs)))
+
+
+def fetch_revs (path, from_rev, to_rev, symbol, cvs_repo):
+    """Fetch CVSRevs for each rev in <path:from_rev, path:symbol].
+
+    Return a dict of CVSRev objects (revnum -> CVSRev), where each
+    CVSRev encapsulates a CVS revision in the range from
+    path:from_rev to path:symbol (inclusive).  If symbol currently
+    refers to from_rev (i.e. nothing has happened since the last
+    import), the returned dict will have exactly one entry (from_rev).
+    If there is no valid revision range between from_rev and symbol,
+    the returned dict will be empty.  Situations in which an empty dict
+    is returned, include:
+    - symbol is no longer defined on this path
+    - symbol refers to a revision that is disjoint from from_rev
+
+    from_rev may be None, meaning that all revisions from the initial
+    version of path up to the revision currently referenced by symbol
+    should be fetched.
+
+    If the revision currently referenced by symbol is disjoint from
+    from_rev, the returned dict will be empty.
+
+    Note that there is lots of unexpected behaviour in the handling of
+    the 'cvs rlog -r' parameter: Say you have a branch, called
+    'my_branch', that points to branch number 1.1.2 of a file.  Say
+    there are 3 revisions on this branch: 1.1.2.1 -> 1.1.2.3 (in
+    additions to the branch point 1.1).  Now, observe the following
+    'cvs rlog' executions:
+    - cvs rlog -r0:my_branch ... returns 1.1, 1.1.2.1, 1.1.2.2, 1.1.2.3
+    - cvs rlog -r1.1:my_branch ... returns the same revs
+    - cvs rlog -rmy_branch ... returns 1.1.2.1, 1.1.2.2, 1.1.2.3
+    - cvs rlog -rmy_branch: ... returns the same revs
+    - cvs rlog -r:my_branch ... returns the same revs
+    - cvs rlog -r::my_branch ... returns the same revs
+    - cvs rlog -r1.1.2.1: ... returns the same revs
+    Here is where it gets really weird:
+    - cvs rlog -r1.1.2.1:my_branch ... returns 1.1.2.1 only
+    - cvs rlog -r1.1.2.2:my_branch ... returns 1.1.2.1, 1.1.2.2
+    - cvs rlog -r1.1.2.3:my_branch ... returns 1.1.2.1, 1.1.2.2, 1.1.2.3
+
+    In other words the 'cvs rlog -rfrom_rev:symbol' scheme that we
+    normally use will not work in the case where from_rev is _on_ the
+    branch pointed at by the symbol.
+
+    Therefore, we need an extra parameter, to_rev, which we can use to:
+    1. Detect when this situation is present.
+    2. Work around by using 'cvs rlog -rfrom_ref:to_rev' instead.
+
+    """
+    if from_rev is None:  # Initial import
+        from_rev = "0"  # "cvs rlog -r0:X" fetches from initial revision
+    elif to_rev and to_rev.branch() == from_rev.branch():
+        symbol = to_rev  # Use to_rev instead of given symbol
+    # Run 'cvs rlog' on range [from_rev, symbol] and parse CVSRev objects
+    parser = CVSRevLister(cvs_repo)
+    parser.run((path,), True, "%s:%s" % (from_rev, symbol))
+    assert len(parser.revs) == 1
+    assert path in parser.revs
+    return parser.revs[path]
+
+
+if __name__ == '__main__':
+    unittest.main()
diff --git a/git_remote_cvs/cvs_symbol_cache.py b/git_remote_cvs/cvs_symbol_cache.py
new file mode 100644
index 0000000..cc8d88b
--- /dev/null
+++ b/git_remote_cvs/cvs_symbol_cache.py
@@ -0,0 +1,313 @@
+#!/usr/bin/env python
+
+"""Implementation of a local CVS symbol cache.
+
+A CVS symbol cache stores a list of CVS symbols and the CVS state
+associated with each of those CVS symbols at some point in time.
+
+Keeping a local cache of CVS symbols is often needed because the
+design of CVS makes it potentially very expensive to query the CVS
+server directly for CVS symbols and associated states.
+
+In these cases, a local CVS symbol cache can provide equivalent
+(although possibly out-of-date) information immediatele.
+
+Synchronization with the current state on the CVS server can be
+done on a symbol-by-symbol basis (by checking out a given symbol
+and extracting the CVS state from the CVS work tree), or by
+synchronizing _all_ CVS symbols in one operation (by executing
+'cvs rlog' and parsing CVS states from its output).
+
+"""
+
+import sys
+import os
+
+from git_remote_cvs.util import debug, error, die, ProgressIndicator
+from git_remote_cvs.cvs import CVSNum, CVSState, CVSLogParser
+
+
+class CVSSymbolStateLister(CVSLogParser):
+
+    """Extract current CVSStates for all CVS symbols from a CVS log."""
+
+    def __init__ (self, cvs_repo, show_progress = False):
+        """Create a new CVSSymbolStateLister.
+
+        The cvs_repo argument must be a (cvs_root, cvs_module) tuple
+        show_progress determines whether a progress indicator should
+        be displayed.
+
+        """
+        super(CVSSymbolStateLister, self).__init__(cvs_repo)
+        self.symbols = {}  # CVS symbol name -> CVSState object
+        self.cur_file = None  # current CVS file being processed
+        self.cur_file_numrevs = 0  # #revs in current CVS file
+        self.cur_revnum = None  # current revision number
+        self.rev2syms = {}  # CVSNum -> [CVS symbol names]
+        self.cur_revs = {}  # CVSNum -> True/False (deleted)
+        self.head_num = None  # CVSNum of the HEAD rev or branch
+
+        # Possible states:
+        # - BeforeSymbols - waiting for "symbolic names:"
+        # - WithinSymbols - reading CVS symbol names
+        # - BeforeRevs  - waiting for "total revisions:"
+        # - BetweenRevs - waiting for "----------------------------"
+        # - ReadingRev  - reading CVS revision details
+        self.state = 'BeforeSymbols'
+
+        self.progress = None
+        if show_progress:
+            self.progress = ProgressIndicator("\t", sys.stderr)
+
+    def finalize_symbol_states (self):
+        """Adjust CVSStates in self.symbols based on revision data.
+
+        Based on the information found in self.rev2syms and
+        self.cur_revs, remove deleted revisions and turn branch numbers
+        into corresponding revisions in the CVSStates found in
+        self.symbols.
+
+        """
+        # Create a mapping from branch numbers to the last existing
+        # revision number on those branches
+        branch2lastrev = {}  # branch number -> revision number
+        for revnum in self.cur_revs.iterkeys():
+            branchnum = revnum.branch()
+            if (branchnum not in branch2lastrev) or \
+               (revnum > branch2lastrev[branchnum]):
+                branch2lastrev[branchnum] = revnum
+
+        for cvsnum, symbols in self.rev2syms.iteritems():
+            if cvsnum.is_branch():
+                # Turn into corresponding revision number
+                revnum = branch2lastrev.get(cvsnum, cvsnum.parent())
+                for s in symbols:
+                    state = self.symbols[s]
+                    assert state[self.cur_file] == cvsnum
+                    state.replace(self.cur_file, revnum)
+                cvsnum = revnum
+            assert cvsnum.is_rev()
+            assert cvsnum in self.cur_revs
+            if self.cur_revs[cvsnum]:  # cvsnum is a deleted rev
+                # Remove from CVSStates
+                for s in symbols:
+                    state = self.symbols[s]
+                    state.remove(self.cur_file, cvsnum)
+
+        self.rev2syms = {}
+        self.cur_revs = {}
+        self.cur_file = None
+
+    def __call__ (self, line):
+        """Line parser; this method is invoked for each line in the log."""
+        if self.state == 'BeforeSymbols':
+            if line.startswith("RCS file: "):
+                self.cur_file = self.cleanup_path(line[10:])
+                if self.progress:
+                    self.progress("%5i symbols found - Parsing CVS file #%i: "
+                                  "%s " % (len(self.symbols), self.progress.n,
+                                           self.cur_file,))
+            if line.startswith("head: "):
+                self.head_num = CVSNum(line[6:])
+            if line.startswith("branch: "):
+                self.head_num = CVSNum(line[8:])
+            elif line == "symbolic names:":
+                assert self.head_num
+                s = self.symbols.setdefault("HEAD", CVSState())
+                s.add(self.cur_file, self.head_num)
+                r = self.rev2syms.setdefault(self.head_num, [])
+                r.append("HEAD")
+                self.head_num = None
+                self.state = 'WithinSymbols'
+        elif self.state == 'WithinSymbols':
+            if line.startswith("\t"):
+                symbol, cvsnum = line.split(":", 1)
+                symbol = symbol.strip()
+                cvsnum = CVSNum(cvsnum)
+                s = self.symbols.setdefault(symbol, CVSState())
+                s.add(self.cur_file, cvsnum)
+                r = self.rev2syms.setdefault(cvsnum, [])
+                r.append(symbol)
+            else:
+                self.state = 'BeforeRevs'
+        elif self.state == 'BeforeRevs':
+            if line.startswith("total revisions: "):
+                assert self.cur_file
+                totalrevs_unused, selectedrevs = line.split(";")
+                self.cur_file_numrevs = int(selectedrevs.split(":")[1].strip())
+                self.state = 'BetweenRevs'
+        elif self.state == 'BetweenRevs':
+            if (line == "----------------------------" or
+                line == "======================================"
+                        "======================================="):
+                if self.cur_revnum:
+                    assert self.cur_revnum in self.cur_revs
+                    self.cur_revnum = None
+                if line == "----------------------------":
+                    self.state = 'ReadingRev'
+                else:
+                    # Finalize current CVS file
+                    assert len(self.cur_revs) == self.cur_file_numrevs
+                    self.finalize_symbol_states()
+                    self.state = 'BeforeSymbols'
+        elif self.state == 'ReadingRev':
+            if line.startswith("revision "):
+                self.cur_revnum = CVSNum(line.split()[1])
+            else:
+                date, author, state, dummy = line.split(";", 3)
+                assert date.startswith("date: ")
+                assert author.strip().startswith("author: ")
+                assert state.strip().startswith("state: ")
+                state = state.strip()[7:]
+                assert self.cur_revnum not in self.cur_revs
+                deleted = state == "dead"
+                self.cur_revs[self.cur_revnum] = deleted
+                self.state = 'BetweenRevs'
+
+    def finish (self):
+        """This method is invoked after the last line has been parsed."""
+        assert self.state == 'BeforeSymbols'
+        if self.progress:
+            self.progress.finish("Parsed %i symbols in %i files" %
+                                 (len(self.symbols), self.progress.n))
+
+
+class CVSSymbolCache(object):
+
+    """Local cache of the current CVSState of CVS symbols.
+
+    Simulates a dictionary of CVS symbol -> CVSState mappings.
+
+    """
+
+    def __init__ (self, symbols_dir):
+        """Create a new CVS symbol cache, located in the given directory."""
+        self.symbols_dir = symbols_dir
+        if not os.path.isdir(self.symbols_dir):
+            os.makedirs(self.symbols_dir)
+
+    def __len__ (self):
+        """Return the number of CVS symbols stored in this cache."""
+        return len(os.listdir(self.symbols_dir))
+
+    def __iter__ (self):
+        """Return an iterator traversing symbol names stored in this cache."""
+        for filename in os.listdir(self.symbols_dir):
+            yield filename
+
+    def __contains__ (self, symbol):
+        """Return True if the given symbol is present in this cache."""
+        return os.access(os.path.join(self.symbols_dir, symbol),
+                         os.F_OK | os.R_OK)
+
+    def __getitem__ (self, symbol):
+        """Return the cached CVSState of the given CVS symbol."""
+        try:
+            f = open(os.path.join(self.symbols_dir, symbol), 'r')
+        except IOError:
+            raise KeyError("'%s'" % (symbol))
+        ret = CVSState()
+        ret.load(f)
+        f.close()
+        return ret
+
+    def __setitem__ (self, symbol, cvs_state):
+        """Store the given CVS symbol and CVSState into the cache."""
+        cvs_state.save(os.path.join(self.symbols_dir, symbol))
+
+    def __delitem__ (self, symbol):
+        """Remove the the given CVS symbol from the cache."""
+        os.remove(os.path.join(self.symbols_dir, symbol))
+
+    def get (self, symbol, default = None):
+        """Return the cached CVSState of the given CVS symbol."""
+        try:
+            return self[symbol]
+        except KeyError:
+            return default
+
+    def items (self):
+        """Return list of (CVS symbol, CVSState) tuples saved in this cache."""
+        for filename in self:
+            yield (filename, self[filename])
+
+    def clear (self):
+        """Remove all entries from this CVS symbol cache."""
+        for filename in os.listdir(self.symbols_dir):
+            os.remove(os.path.join(self.symbols_dir, filename))
+
+    def sync_symbol (self, symbol, cvs, progress):
+        """Synchronize the given CVS symbol with the CVS server.
+
+        The given CVS workdir is used for the synchronization.
+        The retrieved CVSState is also returned
+
+        """
+        progress("Retrieving state of CVS symbol '%s'..." % (symbol))
+        cvs.update(symbol)
+        state = cvs.get_state()
+
+        progress("Saving state of '%s' to symbol cache..." % (symbol))
+        self[symbol] = state
+
+    def sync_all_symbols (self, cvs_repo, progress, symbol_filter = None):
+        """Synchronize this entire CVS symbol cache with the CVS server.
+
+        This may be very expensive if the CVS repository is large, or
+        has many symbols.  After this method returns, the symbol cache
+        will be in sync with the current state on the server.
+
+        This method returns a dict with the keys 'unchanged',
+        'changed', 'added', and 'deleted', where each map to a list of
+        CVS symbols.  Each CVS symbol appears in exactly one of these
+        lists.
+
+        If symbol_filter is given, it specifies functions that takes
+        one parameter - a CVS symbol name - and returns True if that
+        symbol should be synchronized, and False if that symbol should
+        be skipped.  Otherwise all CVS symbols are synchronized.
+
+        """
+        if symbol_filter is None:
+            symbol_filter = lambda symbol: True
+
+        # Run cvs rlog to fetch current CVSState for all CVS symbols
+        progress("Retrieving current state of all CVS symbols from CVS "
+                 "server...", lf = True)
+        parser = CVSSymbolStateLister(cvs_repo, True)
+        retcode = parser.run()
+        if retcode:
+            raise EnvironmentError(retcode, "cvs rlog exit code %i" % retcode)
+
+        # Update symbol cache with new states from the CVS server
+        progress("Updating symbol cache with current CVS state...")
+        results = {}
+        result_keys = ("unchanged", "changed", "added", "deleted")
+        for k in result_keys:
+            results[k] = []
+        # Classify existing symbols as unchanged, changed, or deleted
+        for symbol in filter(symbol_filter, self):
+            if symbol not in parser.symbols:  # Deleted
+                results["deleted"].append(symbol)
+                del self[symbol]
+            elif self[symbol] != parser.symbols[symbol]:  # Changed
+                results["changed"].append(symbol)
+                self[symbol] = parser.symbols[symbol]
+            else:  # Unchanged
+                results["unchanged"].append(symbol)
+            progress()
+        # Add symbols that are not in self
+        for symbol, state in parser.symbols.iteritems():
+            if not symbol_filter(symbol):
+                debug("Skipping CVS symbol '%s'...", symbol)
+            elif symbol in self:
+                assert state == self[symbol]
+            else:  # Added
+                results["added"].append(symbol)
+                self[symbol] = state
+            progress()
+        progress("Synchronized local symbol cache (%s)" %
+                 (", ".join(["%i %s" % (len(results[k]), k)
+                             for k in result_keys])), True)
+        return results
diff --git a/git_remote_cvs/util.py b/git_remote_cvs/util.py
new file mode 100644
index 0000000..d3ca487
--- /dev/null
+++ b/git_remote_cvs/util.py
@@ -0,0 +1,194 @@
+#!/usr/bin/env python
+
+"""Misc. useful functionality used by the rest of this package.
+
+This module provides common functionality used by the other modules in
+this package.
+
+"""
+
+import sys
+import os
+import subprocess
+
+
+# Whether or not to show debug messages
+DEBUG = False
+
+def notify(msg, *args):
+	"""Print a message to stderr."""
+	print >> sys.stderr, msg % args
+
+def debug (msg, *args):
+    """Print a debug message to stderr when DEBUG is enabled."""
+    if DEBUG:
+        print >> sys.stderr, msg % args
+
+def error (msg, *args):
+    """Print an error message to stderr."""
+    print >> sys.stderr, "ERROR:", msg % args
+
+def warn(msg, *args):
+	"""Print a warning message to stderr."""
+	print >> sys.stderr, "warning:", msg % args
+
+def die (msg, *args):
+    """Print as error message to stderr and exit the program."""
+    error(msg, *args)
+    sys.exit(1)
+
+
+class ProgressIndicator(object):
+
+    """Simple progress indicator.
+
+    Displayed as a spinning character by default, but can be customized
+    by passing custom messages that overrides the spinning character.
+
+    """
+
+    States = ("|", "/", "-", "\\")
+
+    def __init__ (self, prefix = "", f = sys.stdout):
+        """Create a new ProgressIndicator, bound to the given file object."""
+        self.n = 0  # Simple progress counter
+        self.f = f  # Progress is written to this file object
+        self.prev_len = 0  # Length of previous msg (to be overwritten)
+        self.prefix = prefix  # Prefix prepended to each progress message
+        self.prefix_lens = [] # Stack of prefix string lengths
+
+    def pushprefix (self, prefix):
+        """Append the given prefix onto the prefix stack."""
+        self.prefix_lens.append(len(self.prefix))
+        self.prefix += prefix
+
+    def popprefix (self):
+        """Remove the last prefix from the prefix stack."""
+        prev_len = self.prefix_lens.pop()
+        self.prefix = self.prefix[:prev_len]
+
+    def __call__ (self, msg = None, lf = False):
+        """Indicate progress, possibly with a custom message."""
+        if msg is None:
+            msg = self.States[self.n % len(self.States)]
+        msg = self.prefix + msg
+        print >> self.f, "\r%-*s" % (self.prev_len, msg),
+        self.prev_len = len(msg.expandtabs())
+        if lf:
+            print >> self.f
+            self.prev_len = 0
+        self.n += 1
+
+    def finish (self, msg = "done", noprefix = False):
+        """Finalize progress indication with the given message."""
+        if noprefix:
+            self.prefix = ""
+        self(msg, True)
+
+
+def start_command (args, cwd = None, shell = False, add_env = None,
+                   stdin = subprocess.PIPE, stdout = subprocess.PIPE,
+                   stderr = subprocess.PIPE):
+    """Start the given command, and return a subprocess object.
+
+    This provides a simpler interface to the subprocess module.
+
+    """
+    env = None
+    if add_env is not None:
+        env = os.environ.copy()
+        env.update(add_env)
+    return subprocess.Popen(args, bufsize = 1, stdin = stdin, stdout = stdout,
+                            stderr = stderr, cwd = cwd, shell = shell,
+                            env = env, universal_newlines = True)
+
+
+def run_command (args, cwd = None, shell = False, add_env = None,
+                 flag_error = True):
+    """Run the given command to completion, and return its results.
+
+    This provides a simpler interface to the subprocess module.
+
+    The results are formatted as a 3-tuple: (exit_code, output, errors)
+
+    If flag_error is enabled, Error messages will be produced if the
+    subprocess terminated with a non-zero exit code and/or stderr
+    output.
+
+    The other arguments are passed on to start_command().
+
+    """
+    process = start_command(args, cwd, shell, add_env)
+    (output, errors) = process.communicate()
+    exit_code = process.returncode
+    if flag_error and errors:
+        error("'%s' returned errors:\n---\n%s---", " ".join(args), errors)
+    if flag_error and exit_code:
+        error("'%s' returned exit code %i", " ".join(args), exit_code)
+    return (exit_code, output, errors)
+
+
+def file_reader_method (missing_ok = False):
+    """Decorator for simplifying reading of files.
+
+    If missing_ok is True, a failure to open a file for reading will
+    not raise the usual IOError, but instead the wrapped method will be
+    called with f == None.  The method must in this case properly
+    handle f == None.
+
+    """
+    def _wrap (method):
+        """Teach given method to handle both filenames and file objects.
+
+        The given method must take a file object as its second argument
+        (the first argument being 'self', of course).  This decorator
+        will take a filename given as the second argument and promote
+        it to a file object.
+
+        """
+        def _wrapped_method (self, filename, *args, **kwargs):
+            if isinstance(filename, file):
+                f = filename
+            else:
+                try:
+                    f = open(filename, 'r')
+                except IOError:
+                    if missing_ok:
+                        f = None
+                    else:
+                        raise
+            try:
+                return method(self, f, *args, **kwargs)
+            finally:
+                if not isinstance(filename, file) and f:
+                    f.close()
+        return _wrapped_method
+    return _wrap
+
+
+def file_writer_method (method):
+    """Decorator for simplifying writing of files.
+
+    Enables the given method to handle both filenames and file objects.
+
+    The given method must take a file object as its second argument
+    (the first argument being 'self', of course).  This decorator will
+    take a filename given as the second argument and promote it to a
+    file object.
+
+    """
+    def _new_method (self, filename, *args, **kwargs):
+        if isinstance(filename, file):
+            f = filename
+        else:
+            # Make sure the containing directory exists
+            parent_dir = os.path.dirname(filename)
+            if not os.path.isdir(parent_dir):
+                os.makedirs(parent_dir)
+            f = open(filename, 'w')
+        try:
+            return method(self, f, *args, **kwargs)
+        finally:
+            if not isinstance(filename, file):
+                f.close()
+    return _new_method
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 13/19] 2/2: Add Python support library for CVS remote helper
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (11 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 12/19] 1/2: Add Python support library for CVS remote helper Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 14/19] git-remote-cvs: Remote helper program for CVS repositories Sverre Rabbelier
                   ` (5 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johan Herland, David Aguilar

From: Johan Herland <johan@herland.net>

This patch introduces the rest of a Python package called "git_remote_cvs"
containing the building blocks of the CVS remote helper. The CVS remote
helper itself is NOT part of this patch.

The patch includes the necessary Makefile additions to build and install
the git_remote_cvs Python package along with the rest of Git.

This patch has been improved by the following contributions:
- David Aguilar: Lots of Python coding style fixes
- David Aguilar: DESTDIR support in Makefile

Cc: David Aguilar <davvid@gmail.com>
Signed-off-by: Johan Herland <johan@herland.net>
---

	Unchanged.

 Makefile                           |    9 +
 git_remote_cvs/.gitignore          |    2 +
 git_remote_cvs/Makefile            |   35 ++
 git_remote_cvs/__init__.py         |   27 ++
 git_remote_cvs/commit_states.py    |   62 ++++
 git_remote_cvs/cvs_revision_map.py |  418 ++++++++++++++++++++++
 git_remote_cvs/git.py              |  680 ++++++++++++++++++++++++++++++++++++
 git_remote_cvs/setup.py            |   17 +
 8 files changed, 1250 insertions(+), 0 deletions(-)
 create mode 100644 git_remote_cvs/.gitignore
 create mode 100644 git_remote_cvs/Makefile
 create mode 100644 git_remote_cvs/__init__.py
 create mode 100644 git_remote_cvs/commit_states.py
 create mode 100644 git_remote_cvs/cvs_revision_map.py
 create mode 100644 git_remote_cvs/git.py
 create mode 100644 git_remote_cvs/setup.py

diff --git a/Makefile b/Makefile
index b27a7d6..944f4d0 100644
--- a/Makefile
+++ b/Makefile
@@ -1399,6 +1399,9 @@ endif
 ifndef NO_PERL
 	$(QUIET_SUBDIR0)perl $(QUIET_SUBDIR1) PERL_PATH='$(PERL_PATH_SQ)' prefix='$(prefix_SQ)' all
 endif
+ifndef NO_PYTHON
+	$(QUIET_SUBDIR0)git_remote_cvs $(QUIET_SUBDIR1) PYTHON_PATH='$(PYTHON_PATH_SQ)' prefix='$(prefix_SQ)' all
+endif
 	$(QUIET_SUBDIR0)templates $(QUIET_SUBDIR1)
 
 please_set_SHELL_PATH_to_a_more_modern_shell:
@@ -1741,6 +1744,9 @@ install: all
 ifndef NO_PERL
 	$(MAKE) -C perl prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' install
 endif
+ifndef NO_PYTHON
+	$(MAKE) -C git_remote_cvs prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' install
+endif
 ifndef NO_TCLTK
 	$(MAKE) -C gitk-git install
 	$(MAKE) -C git-gui gitexecdir='$(gitexec_instdir_SQ)' install
@@ -1855,6 +1861,9 @@ ifndef NO_PERL
 	$(RM) gitweb/gitweb.cgi
 	$(MAKE) -C perl clean
 endif
+ifndef NO_PYTHON
+	$(MAKE) -C git_remote_cvs clean
+endif
 	$(MAKE) -C templates/ clean
 	$(MAKE) -C t/ clean
 ifndef NO_TCLTK
diff --git a/git_remote_cvs/.gitignore b/git_remote_cvs/.gitignore
new file mode 100644
index 0000000..2247d5f
--- /dev/null
+++ b/git_remote_cvs/.gitignore
@@ -0,0 +1,2 @@
+/build
+/dist
diff --git a/git_remote_cvs/Makefile b/git_remote_cvs/Makefile
new file mode 100644
index 0000000..061c247
--- /dev/null
+++ b/git_remote_cvs/Makefile
@@ -0,0 +1,35 @@
+#
+# Makefile for the git_remote_cvs python support modules
+#
+pysetupfile:=setup.py
+
+# Shell quote (do not use $(call) to accommodate ancient setups);
+DESTDIR_SQ = $(subst ','\'',$(DESTDIR))
+
+ifndef PYTHON_PATH
+	PYTHON_PATH = /usr/bin/python
+endif
+ifndef prefix
+	prefix = $(HOME)
+endif
+ifndef V
+	QUIET = @
+	QUIETSETUP = --quiet
+endif
+
+PYLIBDIR=$(shell $(PYTHON_PATH) -c \
+	 "import sys; \
+	 print 'lib/python%i.%i/site-packages' % sys.version_info[:2]")
+
+all: $(pysetupfile)
+	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) build
+
+install: $(pysetupfile)
+	$(PYTHON_PATH) $(pysetupfile) install --prefix $(DESTDIR_SQ)$(prefix)
+
+instlibdir: $(pysetupfile)
+	@echo "$(prefix)/$(PYLIBDIR)"
+
+clean:
+	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) clean -a
+	$(RM) *.pyo *.pyc
diff --git a/git_remote_cvs/__init__.py b/git_remote_cvs/__init__.py
new file mode 100644
index 0000000..4956d2d
--- /dev/null
+++ b/git_remote_cvs/__init__.py
@@ -0,0 +1,27 @@
+#!/usr/bin/env python
+
+"""Support library package for git-remote-cvs.
+
+git-remote-cvs is a Git remote helper command that interfaces with a
+CVS repository to provide automatic import of CVS history into a Git
+repository.
+
+This package provides the support library needed by git-remote-cvs.
+The following modules are included:
+
+- cvs - Interaction with CVS repositories
+
+- cvs_symbol_cache - Local CVS symbol cache
+
+- changeset - Collect individual CVS revisions into commits
+
+- git - Interaction with Git repositories
+
+- commit_states - Map Git commits to CVS states
+
+- cvs_revision_map - Map CVS revisions to various metainformation
+
+- util - General utility functionality use by the other modules in
+         this package, and also used directly by git-remote-cvs.
+
+"""
diff --git a/git_remote_cvs/commit_states.py b/git_remote_cvs/commit_states.py
new file mode 100644
index 0000000..2e0af6c
--- /dev/null
+++ b/git_remote_cvs/commit_states.py
@@ -0,0 +1,62 @@
+#!/usr/bin/env python
+
+"""Code for relating Git commits to corresponding CVSState objects."""
+
+from git_remote_cvs.util import debug, error, die
+from git_remote_cvs.cvs import CVSState
+
+
+class CommitStates(object):
+
+    """Provide a mapping from Git commits to CVSState objects.
+
+    Behaves like a dictionary of Git commit names -> CVSState mappings.
+
+    Every Git commit converted from CVS has a corresponding CVSState,
+    which describes exactly which CVS revisions are present in a
+    checkout of that commit.
+
+    This class provides the interface to map from a Git commit to its
+    corresponding CVSState.  The mapping uses GitNotes as a persistent
+    storage backend.
+
+    """
+
+    def __init__ (self, notes):
+        """Create a new Git commit -> CVSState map."""
+        self.notes = notes
+        self._cache = {}  # commitname -> CVSState
+
+    def _load (self, commit):
+        """Retrieve the CVSState associated with the given Git commit."""
+        if commit is None:
+            return None
+        if commit in self._cache:
+            return self._cache[commit]
+        notedata = self.notes.get(commit)
+        if notedata is None:  # Given commit has no associated note
+            return None
+        state = CVSState()
+        state.load_data(notedata)
+        self._cache[commit] = state
+        return state
+
+    def add (self, commit, state, gfi):
+        """Add the given Git commit -> CVSState mapping."""
+        assert commit not in self._cache
+        self._cache[commit] = state
+        self.notes.import_note(commit, str(state), gfi)
+
+    def __getitem__ (self, commit):
+        """Return the CVSState associated with the given commit."""
+        state = self._load(commit)
+        if state is None:
+            raise KeyError("Unknown commit '%s'" % (commit))
+        return state
+
+    def get (self, commit, default = None):
+        """Return the CVSState associated with the given commit."""
+        state = self._load(commit)
+        if state is None:
+            return default
+        return state
diff --git a/git_remote_cvs/cvs_revision_map.py b/git_remote_cvs/cvs_revision_map.py
new file mode 100644
index 0000000..0e65ba6
--- /dev/null
+++ b/git_remote_cvs/cvs_revision_map.py
@@ -0,0 +1,418 @@
+#!/usr/bin/env python
+
+"""Functionality for mapping CVS revisions to associated metainfo.
+
+This modules provides the following main classes:
+
+CVSRevisionMap - provides a mapping from CVS revisions to the
+                 the following associated metainfo:
+                 - Mode/permission of the associated CVS path
+                 - The Git blob holding the revision data
+                 - The Git commits that correspond to the CVS
+                   states in which this CVS revision is present
+
+CVSStateMap - provides a mapping from CVS states to corresponding
+              Git commits (i.e. Git commits, whose tree state is
+              identical to a given CVS state)
+"""
+
+import os
+
+from git_remote_cvs.util import debug, error, die, file_reader_method
+from git_remote_cvs.cvs import CVSNum, CVSDate
+from git_remote_cvs.git import GitFICommit
+
+
+class _CVSPathInfo(object):
+
+    """Information on a single CVS path."""
+
+    __slots__ = ('revs', 'mode')
+
+    def __init__ (self, mode = None):
+        self.revs = {}
+        self.mode = mode
+
+
+class _CVSRevInfo(object):
+
+    """Information on a single CVS revision."""
+
+    __slots__ = ('blob', 'commits')
+
+    def __init__ (self, blob):
+        self.blob = blob
+        self.commits = []
+
+
+class CVSRevisionMap(object):
+
+    """Encapsulate the mapping of CVS revisions to associated metainfo.
+
+    This container maps CVS revisions (a combination of a CVS path and
+    a CVS revision number) into Git blob names, Git commit names, and
+    CVS path information.  Git object (blob/commit) names are either
+    40-char SHA1 strings, or "git fast-import" mark numbers if the form
+    ":<num>".
+
+    The data structure is organized as follows:
+    - A CVSRevisionMap instance knows about a set of CVS paths.
+      For each CVS path, the following is known:
+      - The mode (permission bits) of that CVS path (644 or 755)
+      - The CVS revision numbers that exist for that CVS path.
+        For each revision number, the following is known:
+        - Exactly 1 blob name; the Git blob containing the contents of
+          the revision (the contents of the CVS path at that CVS
+          revision).
+        - One or more commit names; the Git commits which encapsulate a
+          CVS state in which this CVS revision
+
+    To store this data structure persistently, this class uses a Git
+    ref that points to a tree structure containing the above
+    information.  When changes to the structure are made, this class
+    will produce git-fast-import commands to update that tree structure
+    accordingly.
+
+    IMPORTANT: No changes to the CVSRevisionMap will be stored unless
+    self.commit_map() is called with a valid GitFastImport instance.
+
+    NOTES: Mark numbers are only transient references bound to the
+    current "git fast-import" process (assumed to be running alongside
+    this process).  Therefore, when the "git fast-import" process ends,
+    it must write out a mark number -> SHA1 mapping (see the
+    "--export-marks" argument to "git fast-import").  Subsequently,
+    this mapping must be parsed, and the mark numbers in this
+    CVSRevisionMap must be resolved into persistent SHA1 names.
+    In order to quickly find all the unresolved mark number entries in
+    the data structure, and index mapping mark numbers to revisions is
+    kept in a separate file in the tree structure.
+    See the loadMarksFile() method for more details.
+
+    """
+
+    MarkNumIndexFile = "CVS/marks"  # Invalid CVS path name
+
+    def __init__ (self, git_ref, obj_fetcher):
+        """Create a new CVS revision map bound to the given Git ref."""
+        self.git_ref = git_ref
+        self.obj_fetcher = obj_fetcher
+
+        # The following data structure is a cache of the persistent
+        # data structure found at self.git_ref.
+        # It is structured as follows:
+        # - self.d is a mapping from CVS paths to _CVSPathInfo objects.
+        #   _CVSPathInfo object have two fields: revs, mode:
+        #   - mode is either 644 (non-executable) or 755 (executable).
+        #   - revs is a dict, mapping CVS revision numbers (CVSNum
+        #     instances) to _CVSRevInfo objects.  _CVSRevInfo objects
+        #     have two fields: blob, commits:
+        #     - blob is the name of the Git blob object holding the
+        #       contents of that revision.
+        #     - commits is a collection of zero or more Git commit
+        #       names where the commit contains this revision.
+        self.d = {}  # dict: path -> _CVSPathInfo
+        self.mods = set()  # (path, revnum) pairs for all modified revs
+        self.marknum_idx = {}  # dict: mark num -> [(path, revnum), ...]
+        self._load_marknum_idx()
+
+    def __del__ (self):
+        """Verify that self.commit_map() was called before destruction."""
+        if self.mods:
+            error("Missing call to self.commit_map().")
+            error("%i revision changes are lost!", len(self.mods))
+
+    def __nonzero__ (self):
+        """Return True iff any information is currently stored here."""
+        return bool(self.d)
+
+    # Private methods:
+
+    def _add_to_marknum_idx(self, marknum, path, revnum):
+        """Add the given marknum -> (path, revnum) association."""
+        entry = self.marknum_idx.setdefault(marknum, [])
+        entry.append((path, revnum))
+
+    def _load_marknum_idx(self):
+        """Load contents of MarkNumIndexFile into self.marknum_idx."""
+        blobspec = "%s:%s" % (self.git_ref, self.MarkNumIndexFile)
+        try:
+            f = self.obj_fetcher.open_obj(blobspec)
+        except KeyError:
+            return  # Missing object; nothing to do
+
+        for line in f:
+            # Format of line is "<marknum> <path>:<revnum>"
+            mark, rest = line.strip().split(' ', 1)
+            path, rev = rest.rsplit(':', 1)
+            self._add_to_marknum_idx(int(mark), path, CVSNum(rev))
+        f.close()
+
+    def _save_marknum_idx(self):
+        """Prepare data for storage into MarkNumIndexFile.
+
+        The returned string contains the current contents of
+        self.marknum_idx, formatted to be stored verbatim in
+        self.MarkNumIndexFile.
+
+        """
+        lines = []
+        for marknum, revs in sorted(self.marknum_idx.iteritems()):
+            for path, revnum in revs:
+                # Format of line is "<marknum> <path>:<revnum>"
+                line = "%i %s:%s\n" % (marknum, path, revnum)
+                lines.append(line)
+        return "".join(lines)
+
+    def _save_rev(self, path, revnum):
+        """Return blob data for storing the given revision persistently.
+
+        Generate the blob contents that will reconstitute the same
+        revision entry when read back in with _fetch_path().
+
+        """
+        lines = []
+        rev_info = self.d[path].revs[revnum]
+        lines.append("blob %s\n" % (rev_info.blob))
+        for commit in rev_info.commits:
+            lines.append("commit %s\n" % (commit))
+        return "".join(lines)
+
+    @staticmethod
+    def _valid_objname (objname):
+        """Return the argument as a SHA1 (string) or mark num (int)."""
+        # Blob names are either 40-char SHA1 strings, or mark numbers
+        if isinstance(objname, int) or len(objname) != 40:  # Mark number
+            return int(objname)
+        return objname
+
+    def _load (self, path, mode, data):
+        """GitObjectFetcher.walk_tree() callback."""
+        assert mode in (644, 755)
+        cvs_path, revnum = os.path.split(path)
+        revnum = CVSNum(revnum)
+        if cvs_path in self.d:
+            assert mode == self.d[cvs_path].mode
+        else:
+            self.d[cvs_path] = _CVSPathInfo(mode)
+        assert revnum not in self.d[cvs_path].revs
+        rev_info = None
+        for line in data.split("\n"):
+            if not line:
+                continue
+            t, objname = line.split()
+            objname = self._valid_objname(objname)
+            if t == "blob":
+                assert rev_info is None
+                rev_info = _CVSRevInfo(objname)
+            elif t == "commit":
+                assert rev_info is not None
+                rev_info.commits.append(objname)
+            else:
+                assert False, "Unknown type '%s'" % (t)
+        assert rev_info.commits  # Each rev is in at least one commit
+        self.d[cvs_path].revs[revnum] = rev_info
+
+    def _fetch_path (self, path):
+        """If the given path exists, create a path entry in self.d."""
+        tree_spec = "%s:%s" % (self.git_ref, path)
+        self.obj_fetcher.walk_tree(tree_spec, self._load, path)
+        # TODO: Don't load entire tree at once?
+
+    # Public methods:
+
+    def has_path (self, path):
+        """Return True iff the given path is present."""
+        if path not in self.d:
+            self._fetch_path(path)
+        return path in self.d
+
+    def has_rev (self, path, revnum):
+        """Return True iff the given path:revnum is present."""
+        if path not in self.d:
+            self._fetch_path(path)
+        return path in self.d and revnum in self.d[path].revs
+
+    def get_mode (self, path):
+        """Return mode bits for the given path."""
+        if path not in self.d:
+            self._fetch_path(path)
+        return self.d[path].mode
+
+    def get_blob (self, path, revnum):
+        """Return the blob name for the given revision."""
+        if path not in self.d:
+            self._fetch_path(path)
+        return self.d[path].revs[revnum].blob
+
+    def get_commits (self, path, revnum):
+        """Return the commit names containing the given revision."""
+        if path not in self.d:
+            self._fetch_path(path)
+        return self.d[path].revs[revnum].commits
+
+    def has_unresolved_marks (self):
+        """Return True iff there are mark numbers in the data structure."""
+        return self.marknum_idx
+
+    # Public non-const methods
+
+    def add_path (self, path, mode):
+        """Add the given path and associated mode bits to this map."""
+        if path not in self.d:
+            self._fetch_path(path)
+        if path in self.d:
+            if self.d[path].mode:
+                assert mode == self.d[path].mode, \
+                    "The mode of %s has changed from %s " \
+                    "to %s since the last import.  This " \
+                    "is not supported." % (
+                        path, self.d[path].mode, mode)
+            else:
+                self.d[path].mode = mode
+        else:
+            self.d[path] = _CVSPathInfo(mode)
+        # Do not add to self.mods yet, as we expect revisions to be
+        # added before commit_map() is called
+
+    def add_blob (self, path, revnum, blobname):
+        """Add the given path:revnum -> blobname association."""
+        assert blobname
+        if path not in self.d:
+            self._fetch_path(path)
+        blobname = self._valid_objname(blobname)
+        if isinstance(blobname, int):  # Mark number
+            self._add_to_marknum_idx(blobname, path, revnum)
+        entry = self.d.setdefault(path, _CVSPathInfo())
+        assert revnum not in entry.revs
+        entry.revs[revnum] = _CVSRevInfo(blobname)
+        self.mods.add((path, revnum))
+
+    def add_commit (self, path, revnum, commitname):
+        """Add the given path:revnum -> commitname association."""
+        if path not in self.d:
+            self._fetch_path(path)
+        commitname = self._valid_objname(commitname)
+        if isinstance(commitname, int):  # Mark number
+            self._add_to_marknum_idx(commitname, path, revnum)
+        assert revnum in self.d[path].revs
+        assert commitname not in self.d[path].revs[revnum].commits
+        self.d[path].revs[revnum].commits.append(commitname)
+        self.mods.add((path, revnum))
+
+    @file_reader_method(missing_ok = True)
+    def load_marks_file (self, f):
+        """Load mark -> SHA1 mappings from git-fast-import marks file.
+
+        Replace all mark numbers with proper SHA1 names in this data
+        structure (using self.marknum_idx to find them quickly).
+
+        """
+        if not f:
+            return 0
+        marks = {}
+        last_mark = 0
+        for line in f:
+            (mark, sha1) = line.strip().split()
+            assert mark.startswith(":")
+            mark = int(mark[1:])
+            assert mark not in marks
+            marks[mark] = sha1
+            if mark > last_mark:
+                last_mark = mark
+        for marknum, revs in self.marknum_idx.iteritems():
+            sha1 = marks[marknum]
+            for path, revnum in revs:
+                if path not in self.d:
+                    self._fetch_path(path)
+                rev_info = self.d[path].revs[revnum]
+                if rev_info.blob == marknum:  # Replace blobname
+                    rev_info.blob = sha1
+                else:  # Replace commitname
+                    assert marknum in rev_info.commits
+                    i = rev_info.commits.index(marknum)
+                    rev_info.commits[i] = sha1
+                    assert marknum not in rev_info.commits
+                self.mods.add((path, revnum))
+        self.marknum_idx = {}  # Resolved all transient mark numbers
+        return last_mark
+
+    def sync_modeinfo_from_cvs (self, cvs):
+        """Update with mode information from current CVS checkout.
+
+        This method will retrieve mode information on all paths in the
+        current CVS checkout, and update this data structure
+        correspondingly.  In the case where mode information is already
+        present for a given CVS path, this method will verify that such
+        information is correct.
+
+        """
+        for path, mode in cvs.get_modeinfo().iteritems():
+            self.add_path(path, mode)
+
+    def commit_map (self, gfi, author, message):
+        """Produce git-fast-import commands for storing changes.
+
+        The given GitFastImport object is used to produce a commit
+        making the changes done to this data structure persistent.
+
+        """
+        now = CVSDate("now")
+        commitdata = GitFICommit(
+            author[0], author[1], now.ts, now.tz_str(), message)
+
+        # Add updated MarkNumIndexFile to commit
+        mark = gfi.blob(self._save_marknum_idx())
+        commitdata.modify(644, mark, self.MarkNumIndexFile)
+
+        for path, revnum in self.mods:
+            mark = gfi.blob(self._save_rev(path, revnum))
+            mode = self.d[path].mode
+            assert mode in (644, 755)
+            commitdata.modify(mode, mark, "%s/%s" % (path, revnum))
+
+        gfi.commit(self.git_ref, commitdata)
+        self.mods = set()
+
+
+class CVSStateMap(object):
+
+    """Map CVSState object to the commit names which produces that state."""
+
+    def __init__ (self, cvs_rev_map):
+        """Create a CVSStateMap object, bound to the given CVS revision map."""
+        self.cvs_rev_map = cvs_rev_map
+
+    def get_commits (self, state):
+        """Map the given CVSState to commits that contain this state.
+
+        Return all commits where the given state is a subset of the
+        state produced by that commit.
+
+        Returns a set of commit names.  The set may be empty.
+
+        """
+        candidate_commits = None
+        for path, revnum in state:
+            commits = self.cvs_rev_map.get_commits(path, revnum)
+            if candidate_commits is None:
+                candidate_commits = set(commits)
+            else:
+                candidate_commits.intersection_update(commits)
+        return candidate_commits
+
+    def get_exact_commit (self, state, commit_map):
+        """Map the given CVSState to the commit with this exact state.
+
+        The given commit_map must be a CommitStates object.
+
+        Return the only commit (if any) that produces the exact given
+        CVSState.
+
+        Returns a commit name, or None if no matching commit is found.
+
+        """
+        commits = self.get_commits(state)
+        for c in commits:
+            if state == commit_map.get(c):
+                return c
+        return None
diff --git a/git_remote_cvs/git.py b/git_remote_cvs/git.py
new file mode 100644
index 0000000..cf037e3
--- /dev/null
+++ b/git_remote_cvs/git.py
@@ -0,0 +1,680 @@
+#!/usr/bin/env python
+
+"""Functionality for interacting with Git repositories.
+
+This module provides classes for interfacing with a Git repository.
+
+"""
+
+import os
+import re
+from binascii import hexlify
+from cStringIO import StringIO
+import unittest
+
+from git_remote_cvs.util import debug, error, die, start_command, run_command
+from git_remote_cvs.cvs import CVSDate
+
+
+def get_git_dir ():
+    """Return the path to the GIT_DIR for this repo."""
+    args = ("git", "rev-parse", "--git-dir")
+    exit_code, output, errors = run_command(args)
+    if exit_code:
+        die("Failed to retrieve git dir")
+    assert not errors
+    return output.strip()
+
+
+def parse_git_config ():
+    """Return a dict containing the parsed version of 'git config -l'."""
+    exit_code, output, errors = run_command(("git", "config", "-z", "-l"))
+    if exit_code:
+        die("Failed to retrieve git configuration")
+    assert not errors
+    return dict([e.split('\n', 1) for e in output.split("\0") if e])
+
+
+def git_config_bool (value):
+    """Convert the given git config string value to True or False.
+
+    Raise ValueError if the given string was not recognized as a
+    boolean value.
+
+    """
+    norm_value = str(value).strip().lower()
+    if norm_value in ("true", "1", "yes", "on", ""):
+        return True
+    if norm_value in ("false", "0", "no", "off", "none"):
+        return False
+    raise ValueError("Failed to parse '%s' into a boolean value" % (value))
+
+
+def valid_git_ref (ref_name):
+    """Return True iff the given ref name is a valid git ref name."""
+    # The following is a reimplementation of the git check-ref-format
+    # command.  The rules were derived from the git check-ref-format(1)
+    # manual page.  This code should be replaced by a call to
+    # check_ref_format() in the git library, when such is available.
+    if ref_name.endswith('/') or \
+       ref_name.startswith('.') or \
+       ref_name.count('/.') or \
+       ref_name.count('..') or \
+       ref_name.endswith('.lock'):
+        return False
+    for c in ref_name:
+        if ord(c) < 0x20 or ord(c) == 0x7f or c in " ~^:?*[":
+            return False
+    return True
+
+
+class GitObjectFetcher(object):
+
+    """Provide parsed access to 'git cat-file --batch'.
+
+    This provides a read-only interface to the Git object database.
+
+    """
+
+    def __init__ (self):
+        """Initiate a 'git cat-file --batch' session."""
+        self.queue = []  # List of object names to be submitted
+        self.in_transit = None  # Object name currently in transit
+
+        # 'git cat-file --batch' produces binary output which is likely
+        # to be corrupted by the default "rU"-mode pipe opened by
+        # start_command.  (Mode == "rU" does universal new-line
+        # conversion, which mangles carriage returns.) Therefore, we
+        # open an explicitly binary-safe pipe for transferring the
+        # output from 'git cat-file --batch'.
+        pipe_r_fd, pipe_w_fd = os.pipe()
+        pipe_r = os.fdopen(pipe_r_fd, "rb")
+        pipe_w = os.fdopen(pipe_w_fd, "wb")
+        self.proc = start_command(("git", "cat-file", "--batch"),
+                                  stdout = pipe_w)
+        self.f = pipe_r
+
+    def __del__ (self):
+        """Verify completed communication with 'git cat-file --batch'."""
+        assert not self.queue
+        assert self.in_transit is None
+        self.proc.stdin.close()
+        assert self.proc.wait() == 0  # Zero exit code
+        assert self.f.read() == ""  # No remaining output
+
+    def _submit_next_object (self):
+        """Submit queue items to the 'git cat-file --batch' process.
+
+        If there are items in the queue, and there is currently no item
+        currently in 'transit', then pop the first item off the queue,
+        and submit it.
+
+        """
+        if self.queue and self.in_transit is None:
+            self.in_transit = self.queue.pop(0)
+            print >> self.proc.stdin, self.in_transit[0]
+
+    def push (self, obj, callback):
+        """Push the given object name onto the queue.
+
+        The given callback function will at some point in the future
+        be called exactly once with the following arguments:
+        - self - this GitObjectFetcher instance
+        - obj  - the object name provided to push()
+        - sha1 - the SHA1 of the object, if 'None' obj is missing
+        - t    - the type of the object (tag/commit/tree/blob)
+        - size - the size of the object in bytes
+        - data - the object contents
+
+        """
+        self.queue.append((obj, callback))
+        self._submit_next_object()  # (Re)start queue processing
+
+    def process_next_entry (self):
+        """Read the next entry off the queue and invoke callback."""
+        obj, cb = self.in_transit
+        self.in_transit = None
+        header = self.f.readline()
+        if header == "%s missing\n" % (obj):
+            cb(self, obj, None, None, None, None)
+            return
+        sha1, t, size = header.split(" ")
+        assert len(sha1) == 40
+        assert t in ("tag", "commit", "tree", "blob")
+        assert size.endswith("\n")
+        size = int(size.strip())
+        data = self.f.read(size)
+        assert self.f.read(1) == "\n"
+        cb(self, obj, sha1, t, size, data)
+        self._submit_next_object()
+
+    def process (self):
+        """Process the current queue until empty."""
+        while self.in_transit is not None:
+            self.process_next_entry()
+
+    # High-level convenience methods:
+
+    def get_sha1 (self, objspec):
+        """Return the SHA1 of the object specified by 'objspec'.
+
+        Return None if 'objspec' does not specify an existing object.
+
+        """
+        class _ObjHandler(object):
+            """Helper class for getting the returned SHA1."""
+            def __init__ (self, parser):
+                self.parser = parser
+                self.sha1 = None
+
+            def __call__ (self, parser, obj, sha1, t, size, data):
+                # FIXME: Many unused arguments. Could this be cheaper?
+                assert parser == self.parser
+                self.sha1 = sha1
+
+        handler = _ObjHandler(self)
+        self.push(objspec, handler)
+        self.process()
+        return handler.sha1
+
+    def open_obj (self, objspec):
+        """Return a file object wrapping the contents of a named object.
+
+        The caller is responsible for calling .close() on the returned
+        file object.
+
+        Raise KeyError if 'objspec' does not exist in the repo.
+
+        """
+        class _ObjHandler(object):
+            """Helper class for parsing the returned git object."""
+            def __init__ (self, parser):
+                """Set up helper."""
+                self.parser = parser
+                self.contents = StringIO()
+                self.err = None
+
+            def __call__ (self, parser, obj, sha1, t, size, data):
+                """Git object callback (see GitObjectFetcher documentation)."""
+                assert parser == self.parser
+                if not sha1:  # Missing object
+                    self.err = "Missing object '%s'" % obj
+                else:
+                    assert size == len(data)
+                    self.contents.write(data)
+
+        handler = _ObjHandler(self)
+        self.push(objspec, handler)
+        self.process()
+        if handler.err:
+            raise KeyError(handler.err)
+        handler.contents.seek(0)
+        return handler.contents
+
+    def walk_tree (self, tree_objspec, callback, prefix = ""):
+        """Recursively walk the given Git tree object.
+
+        Recursively walk all subtrees of the given tree object, and
+        invoke the given callback passing three arguments:
+        (path, mode, data) with the path, permission bits, and contents
+        of all the blobs found in the entire tree structure.
+
+        """
+        class _ObjHandler(object):
+            """Helper class for walking a git tree structure."""
+            def __init__ (self, parser, cb, path, mode = None):
+                """Set up helper."""
+                self.parser = parser
+                self.cb = cb
+                self.path = path
+                self.mode = mode
+                self.err = None
+
+            def parse_tree (self, treedata):
+                """Parse tree object data, yield tree entries.
+
+                Each tree entry is a 3-tuple (mode, sha1, path)
+
+                self.path is prepended to all paths yielded
+                from this method.
+
+                """
+                while treedata:
+                    mode = int(treedata[:6], 10)
+                    # Turn 100xxx into xxx
+                    if mode > 100000:
+                        mode -= 100000
+                    assert treedata[6] == " "
+                    i = treedata.find("\0", 7)
+                    assert i > 0
+                    path = treedata[7:i]
+                    sha1 = hexlify(treedata[i + 1: i + 21])
+                    yield (mode, sha1, self.path + path)
+                    treedata = treedata[i + 21:]
+
+            def __call__ (self, parser, obj, sha1, t, size, data):
+                """Git object callback (see GitObjectFetcher documentation)."""
+                assert parser == self.parser
+                if not sha1:  # Missing object
+                    self.err = "Missing object '%s'" % (obj)
+                    return
+                assert size == len(data)
+                if t == "tree":
+                    if self.path:
+                        self.path += "/"
+                    # Recurse into all blobs and subtrees
+                    for m, s, p in self.parse_tree(data):
+                        parser.push(s,
+                                    self.__class__(self.parser, self.cb, p, m))
+                elif t == "blob":
+                    self.cb(self.path, self.mode, data)
+                else:
+                    raise ValueError("Unknown object type '%s'" % (t))
+
+        self.push(tree_objspec, _ObjHandler(self, callback, prefix))
+        self.process()
+
+
+class GitRefMap(object):
+
+    """Map Git ref names to the Git object names they currently point to.
+
+    Behaves like a dictionary of Git ref names -> Git object names.
+
+    """
+
+    def __init__ (self, obj_fetcher):
+        """Create a new Git ref -> object map."""
+        self.obj_fetcher = obj_fetcher
+        self._cache = {}  # dict: refname -> objname
+
+    def _load (self, ref):
+        """Retrieve the object currently bound to the given ref.
+
+        The name of the object pointed to by the given ref is stored
+        into this mapping, and also returned.
+
+        """
+        if ref not in self._cache:
+            self._cache[ref] = self.obj_fetcher.get_sha1(ref)
+        return self._cache[ref]
+
+    def __contains__ (self, refname):
+        """Return True if the given refname is present in this cache."""
+        return bool(self._load(refname))
+
+    def __getitem__ (self, refname):
+        """Return the git object name pointed to by the given refname."""
+        commit = self._load(refname)
+        if commit is None:
+            raise KeyError("Unknown ref '%s'" % (refname))
+        return commit
+
+    def get (self, refname, default = None):
+        """Return the git object name pointed to by the given refname."""
+        commit = self._load(refname)
+        if commit is None:
+            return default
+        return commit
+
+
+class GitFICommit(object):
+
+    """Encapsulate the data in a Git fast-import commit command."""
+
+    SHA1RE = re.compile(r'^[0-9a-f]{40}$')
+
+    @classmethod
+    def parse_mode (cls, mode):
+        """Verify the given git file mode, and return it as a string."""
+        assert mode in (644, 755, 100644, 100755, 120000)
+        return "%i" % (mode)
+
+    @classmethod
+    def parse_objname (cls, objname):
+        """Return the given object name (or mark number) as a string."""
+        if isinstance(objname, int):  # Object name is a mark number
+            assert objname > 0
+            return ":%i" % (objname)
+
+        # No existence check is done, only checks for valid format
+        assert cls.SHA1RE.match(objname)  # Object name is valid SHA1
+        return objname
+
+    @classmethod
+    def quote_path (cls, path):
+        """Return a quoted version of the given path."""
+        path = path.replace("\\", "\\\\")
+        path = path.replace("\n", "\\n")
+        path = path.replace('"', '\\"')
+        return '"%s"' % (path)
+
+    @classmethod
+    def parse_path (cls, path):
+        """Verify that the given path is valid, and quote it, if needed."""
+        assert not isinstance(path, int)  # Cannot be a mark number
+
+        # These checks verify the rules on the fast-import man page
+        assert not path.count("//")
+        assert not path.endswith("/")
+        assert not path.startswith("/")
+        assert not path.count("/./")
+        assert not path.count("/../")
+        assert not path.endswith("/.")
+        assert not path.endswith("/..")
+        assert not path.startswith("./")
+        assert not path.startswith("../")
+
+        if path.count('"') + path.count('\n') + path.count('\\'):
+            return cls.quote_path(path)
+        return path
+
+    def __init__ (self, name, email, timestamp, timezone, message):
+        """Create a new Git fast-import commit, with the given metadata."""
+        self.name = name
+        self.email = email
+        self.timestamp = timestamp
+        self.timezone = timezone
+        self.message = message
+        self.pathops = []  # List of path operations in this commit
+
+    def modify (self, mode, blobname, path):
+        """Add a file modification to this Git fast-import commit."""
+        self.pathops.append(("M",
+                             self.parse_mode(mode),
+                             self.parse_objname(blobname),
+                             self.parse_path(path)))
+
+    def delete (self, path):
+        """Add a file deletion to this Git fast-import commit."""
+        self.pathops.append(("D", self.parse_path(path)))
+
+    def copy (self, path, newpath):
+        """Add a file copy to this Git fast-import commit."""
+        self.pathops.append(("C",
+                             self.parse_path(path),
+                             self.parse_path(newpath)))
+
+    def rename (self, path, newpath):
+        """Add a file rename to this Git fast-import commit."""
+        self.pathops.append(("R",
+                             self.parse_path(path),
+                             self.parse_path(newpath)))
+
+    def note (self, blobname, commit):
+        """Add a note object to this Git fast-import commit."""
+        self.pathops.append(("N",
+                             self.parse_objname(blobname),
+                             self.parse_objname(commit)))
+
+    def deleteall (self):
+        """Delete all files in this Git fast-import commit."""
+        self.pathops.append("deleteall")
+
+
+class TestGitFICommit(unittest.TestCase):
+
+    """GitFICommit selftests."""
+
+    def test_basic (self):
+        """GitFICommit basic selftests."""
+
+        def expect_fail (method, data):
+            """Verify that the method(data) raises an AssertionError."""
+            try:
+                method(data)
+            except AssertionError:
+                return
+            raise AssertionError("Failed test for invalid data '%s(%s)'" %
+                                 (method.__name__, repr(data)))
+
+    def test_parse_mode (self):
+        """GitFICommit.parse_mode() selftests."""
+        self.assertEqual(GitFICommit.parse_mode(644), "644")
+        self.assertEqual(GitFICommit.parse_mode(755), "755")
+        self.assertEqual(GitFICommit.parse_mode(100644), "100644")
+        self.assertEqual(GitFICommit.parse_mode(100755), "100755")
+        self.assertEqual(GitFICommit.parse_mode(120000), "120000")
+        self.assertRaises(AssertionError, GitFICommit.parse_mode, 0)
+        self.assertRaises(AssertionError, GitFICommit.parse_mode, 123)
+        self.assertRaises(AssertionError, GitFICommit.parse_mode, 600)
+        self.assertRaises(AssertionError, GitFICommit.parse_mode, "644")
+        self.assertRaises(AssertionError, GitFICommit.parse_mode, "abc")
+
+    def test_parse_objname (self):
+        """GitFICommit.parse_objname() selftests."""
+        self.assertEqual(GitFICommit.parse_objname(1), ":1")
+        self.assertRaises(AssertionError, GitFICommit.parse_objname, 0)
+        self.assertRaises(AssertionError, GitFICommit.parse_objname, -1)
+        self.assertEqual(GitFICommit.parse_objname("0123456789" * 4),
+                         "0123456789" * 4)
+        self.assertEqual(GitFICommit.parse_objname("2468abcdef" * 4),
+                         "2468abcdef" * 4)
+        self.assertRaises(AssertionError, GitFICommit.parse_objname,
+                          "abcdefghij" * 4)
+
+    def test_parse_path (self):
+        """GitFICommit.parse_path() selftests."""
+        self.assertEqual(GitFICommit.parse_path("foo/bar"), "foo/bar")
+        self.assertEqual(GitFICommit.parse_path("path/with\n and \" in it"),
+                         '"path/with\\n and \\" in it"')
+        self.assertRaises(AssertionError, GitFICommit.parse_path, 1)
+        self.assertRaises(AssertionError, GitFICommit.parse_path, 0)
+        self.assertRaises(AssertionError, GitFICommit.parse_path, -1)
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "foo//bar")
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "foo/bar/")
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "/foo/bar")
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "foo/./bar")
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "foo/../bar")
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "foo/bar/.")
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "foo/bar/..")
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "./foo/bar")
+        self.assertRaises(AssertionError, GitFICommit.parse_path, "../foo/bar")
+
+
+class GitFastImport(object):
+
+    """Encapsulate communication with git fast-import."""
+
+    def __init__ (self, f, obj_fetcher, last_mark = 0):
+        """Set up self to communicate with a fast-import process through f."""
+        self.f = f  # File object where fast-import stream is written
+        self.obj_fetcher = obj_fetcher  # GitObjectFetcher instance
+        self.next_mark = last_mark + 1  # Next mark number
+        self.refs = set()  # Keep track of the refnames we've seen
+
+    def comment (self, s):
+        """Write the given comment in the fast-import stream."""
+        assert "\n" not in s, "Malformed comment: '%s'" % (s)
+        self.f.write("# %s\n" % (s))
+
+    def commit (self, ref, commitdata):
+        """Make a commit on the given ref, with the given GitFICommit.
+
+        Return the mark number identifying this commit.
+
+        """
+        self.f.write("""\
+commit %(ref)s
+mark :%(mark)i
+committer %(name)s <%(email)s> %(timestamp)i %(timezone)s
+data %(msgLength)i
+%(msg)s
+""" % {
+    'ref': ref,
+    'mark': self.next_mark,
+    'name': commitdata.name,
+    'email': commitdata.email,
+    'timestamp': commitdata.timestamp,
+    'timezone': commitdata.timezone,
+    'msgLength': len(commitdata.message),
+    'msg': commitdata.message,
+})
+
+        if ref not in self.refs:
+            self.refs.add(ref)
+            parent = ref + "^0"
+            if self.obj_fetcher.get_sha1(parent):
+                self.f.write("from %s\n" % (parent))
+
+        for op in commitdata.pathops:
+            self.f.write(" ".join(op))
+            self.f.write("\n")
+        self.f.write("\n")
+        retval = self.next_mark
+        self.next_mark += 1
+        return retval
+
+    def blob (self, data):
+        """Import the given blob.
+
+        Return the mark number identifying this blob.
+
+        """
+        self.f.write("blob\nmark :%i\ndata %i\n%s\n" %
+                     (self.next_mark, len(data), data))
+        retval = self.next_mark
+        self.next_mark += 1
+        return retval
+
+    def reset (self, ref, objname):
+        """Reset the given ref to point at the given Git object."""
+        self.f.write("reset %s\nfrom %s\n\n" %
+                     (ref, GitFICommit.parse_objname(objname)))
+        if ref not in self.refs:
+            self.refs.add(ref)
+
+
+class GitNotes(object):
+
+    """Encapsulate access to Git notes.
+
+    Simulates a dictionary of object name (SHA1) -> Git note mappings.
+
+    """
+
+    def __init__ (self, notes_ref, obj_fetcher):
+        """Create a new Git notes interface, bound to the given notes ref."""
+        self.notes_ref = notes_ref
+        self.obj_fetcher = obj_fetcher  # Used to get objects from repo
+        self.imports = []  # list: (objname, note data blob name) tuples
+
+    def __del__ (self):
+        """Verify that self.commit_notes() was called before destruction."""
+        if self.imports:
+            error("Missing call to self.commit_notes().")
+            error("%i notes are not committed!", len(self.imports))
+
+    def _load (self, objname):
+        """Return the note data associated with the given git object.
+
+        The note data is returned in string form. If no note is found
+        for the given object, None is returned.
+
+        """
+        try:
+            f = self.obj_fetcher.open_obj("%s:%s" % (self.notes_ref, objname))
+            ret = f.read()
+            f.close()
+        except KeyError:
+            ret = None
+        return ret
+
+    def __getitem__ (self, objname):
+        """Return the note contents associated with the given object.
+
+        Raise KeyError if given object has no associated note.
+
+        """
+        blobdata = self._load(objname)
+        if blobdata is None:
+            raise KeyError("Object '%s' has no note" % (objname))
+        return blobdata
+
+    def get (self, objname, default = None):
+        """Return the note contents associated with the given object.
+
+        Return given default if given object has no associated note.
+
+        """
+        blobdata = self._load(objname)
+        if blobdata is None:
+            return default
+        return blobdata
+
+    def import_note (self, objname, data, gfi):
+        """Tell git fast-import to store data as a note for objname.
+
+        This method uses the given GitFastImport object to create a
+        blob containing the given note data.  Also an entry mapping the
+        given object name to the created blob is stored until
+        commit_notes() is called.
+
+        Note that this method only works if it is later followed by a
+        call to self.commit_notes() (which produces the note commit
+        that refers to the blob produced here).
+
+        """
+        if not data.endswith("\n"):
+            data += "\n"
+        gfi.comment("Importing note for object %s" % (objname))
+        mark = gfi.blob(data)
+        self.imports.append((objname, mark))
+
+    def commit_notes (self, gfi, author, message):
+        """Produce a git fast-import note commit for the imported notes.
+
+        This method uses the given GitFastImport object to create a
+        commit on the notes ref, introducing the notes previously
+        submitted to import_note().
+
+        """
+        if not self.imports:
+            return
+        now = CVSDate("now")
+        commitdata = GitFICommit(author[0], author[1],
+                                 now.ts, now.tz_str(), message)
+        for objname, blobname in self.imports:
+            assert isinstance(objname, int) and objname > 0
+            assert isinstance(blobname, int) and blobname > 0
+            commitdata.note(blobname, objname)
+        gfi.commit(self.notes_ref, commitdata)
+        self.imports = []
+
+
+class GitCachedNotes(GitNotes):
+
+    """Encapsulate access to Git notes (cached version).
+
+    Only use this class if no caching is done at a higher level.
+
+    Simulates a dictionary of object name (SHA1) -> Git note mappings.
+
+    """
+
+    def __init__ (self, notes_ref, obj_fetcher):
+        """Set up a caching wrapper around GitNotes."""
+        GitNotes.__init__(self, notes_ref, obj_fetcher)
+        self._cache = {}  # Cache: object name -> note data
+
+    def __del__ (self):
+        """Verify that GitNotes' destructor is called."""
+        GitNotes.__del__(self)
+
+    def _load (self, objname):
+        """Extend GitNotes._load() with a local objname -> note cache."""
+        if objname not in self._cache:
+            self._cache[objname] = GitNotes._load(self, objname)
+        return self._cache[objname]
+
+    def import_note (self, objname, data, gfi):
+        """Extend GitNotes.import_note() with a local objname -> note cache."""
+        if not data.endswith("\n"):
+            data += "\n"
+        assert objname not in self._cache
+        self._cache[objname] = data
+        GitNotes.import_note(self, objname, data, gfi)
+
+
+if __name__ == '__main__':
+    unittest.main()
diff --git a/git_remote_cvs/setup.py b/git_remote_cvs/setup.py
new file mode 100644
index 0000000..21f521a
--- /dev/null
+++ b/git_remote_cvs/setup.py
@@ -0,0 +1,17 @@
+#!/usr/bin/env python
+
+"""Distutils build/install script for the git_remote_cvs package."""
+
+from distutils.core import setup
+
+setup(
+    name = 'git_remote_cvs',
+    version = '0.1.0',
+    description = 'Git remote helper program for CVS repositories',
+    license = 'GPLv2',
+    author = 'The Git Community',
+    author_email = 'git@vger.kernel.org',
+    url = 'http://www.git-scm.com/',
+    package_dir = {'git_remote_cvs': ''},
+    packages = ['git_remote_cvs'],
+)
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 14/19] git-remote-cvs: Remote helper program for CVS repositories
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (12 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 13/19] 2/2: " Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 15/19] Add simple selftests of git-remote-cvs functionality Sverre Rabbelier
                   ` (4 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johan Herland, Daniel Barkalow, David Aguilar

From: Johan Herland <johan@herland.net>

Implements the import of objects from a local or remote CVS repository.

This helper program uses the "git_remote_cvs" Python package introduced
earlier, and provides a working draft implementation of the remote helper
API, as described in Documentation/git-remote-helpers.txt. Further details
about this specific helper are described in the new
Documentation/git-remote-cvs.txt.

This patch has been improved by the following contributions:
- Daniel Barkalow: Updates reflecting changes in remote helper API
- David Aguilar: Lots of Python coding style fixes

Cc: Daniel Barkalow <barkalow@iabervon.org>
Cc: David Aguilar <davvid@gmail.com>
Signed-off-by: Johan Herland <johan@herland.net>
---

	Unchanged.

 Documentation/git-remote-cvs.txt |   85 +++++
 Makefile                         |   24 ++
 git-remote-cvs.py                |  683 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 792 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/git-remote-cvs.txt
 create mode 100755 git-remote-cvs.py

diff --git a/Documentation/git-remote-cvs.txt b/Documentation/git-remote-cvs.txt
new file mode 100644
index 0000000..309ee7f
--- /dev/null
+++ b/Documentation/git-remote-cvs.txt
@@ -0,0 +1,85 @@
+git-remote-cvs(1)
+=================
+
+NAME
+----
+git-remote-cvs - Helper program for interoperation with CVS repositories
+
+SYNOPSIS
+--------
+'git remote-cvs' <remote>
+
+DESCRIPTION
+-----------
+
+Please see the linkgit:git-remote-helpers[1] documentation for general
+information about remote helper programs.
+
+CONFIGURATION
+-------------
+
+remote.*.cvsRoot::
+	The URL of the CVS repository (as found in a `CVSROOT` variable, or
+	in a `CVS/Root` file).
+	Example: "`:pserver:user@server/var/cvs/cvsroot`".
+
+remote.*.cvsModule::
+	The path of the CVS module (as found in a `CVS/Repository` file)
+	within the CVS repository specified in `remote.*.cvsRoot`.
+	Example: "`foo/bar`"
+
+remote.*.cachedSymbolsOnly::
+	When 'true', a cache of CVS symbols is used instead of querying the
+	CVS server for all existing symbols (potentially expensive). In this
+	mode, git-remote-cvs will not discover new CVS symbols unless you add
+	them explicitly with the "`addsymbol <symbol>`" command (on
+	git-remote-cvs's stdin), or request an explicit symbol cache update
+	from the CVS server with the "`syncsymbols`" command (on
+	git-remote-cvs's stdin). When 'false' (the default), the CVS server
+	will be queried whenever a list of CVS symbols is required.
+
+remote.*.usernameMap::
+	The path (absolute, or relative to the repository (NOT the worktree))
+	to the file that contains the mapping from CVS usernames to the
+	corresponding full names and email addresses, as used by Git in the
+	Author and Committer fields of commit objects. When this config
+	variable is set, CVS usernames will be resolved against this file.
+	If no match is found in the file, or if this config variable is unset,
+	or if the variable points to a non-existing file, the original CVS
+	username will be used as the Author/Committer name, and the
+	corresponding email address will be set to "`<username>@example.com`".
++
+The format of the usernameMap file is one entry per line, where each line is
+of the form "`username: Full Name <email@address>`".
+Example: `johndoe: John Doe <johndoe@example.com>`
+Blank lines and lines starting with '#' are ignored.
+
+COMMANDS
+--------
+
+In addition to the commands that constitute the git-remote-helpers API, the
+following extra commands are supported for managing the local symbol cache when
+the `remote.*.cachedSymbolsOnly` config variable is true. The following
+commands can be given on the standard input of git-remote-cvs:
+
+'addsymbol'::
+	Takes one CVS symbol name as argument. The given CVS symbol is
+	fetched from the CVS server and stored into the local CVS symbol
+	cache. If `remote.*.cachedSymbolsOnly` is enabled, this can be used
+	to introduce a new CVS symbol to the CVS helper application.
+
+'syncsymbols'::
+	All CVS symbols that are available from the given remote are
+	fetched from the CVS server and stored into the local CVS symbol
+	cache. This is equivalent to disabling `remote.{asterisk}.cachedSymbolsOnly`,
+	running the "list" command, and then finally re-enabling the
+	`remote.*.cachedSymbolsOnly` config variable. I.e. this command can
+	be used to manually synchronize the CVS symbols available to the
+	CVS helper application.
+
+'verify'::
+	Takes one CVS symbol name as argument. Verifies that the CVS symbol
+	has been successfully imported be checking out the CVS symbol from
+	the CVS server, and comparing the CVS working tree against the Git
+	tree object identified by `refs/cvs/<remote>/<symbol>`. This can be
+	used to verify the correctness of a preceding 'import' command.
diff --git a/Makefile b/Makefile
index 944f4d0..a63f44d 100644
--- a/Makefile
+++ b/Makefile
@@ -346,6 +346,8 @@ SCRIPT_PERL += git-relink.perl
 SCRIPT_PERL += git-send-email.perl
 SCRIPT_PERL += git-svn.perl
 
+SCRIPT_PYTHON += git-remote-cvs.py
+
 SCRIPTS = $(patsubst %.sh,%,$(SCRIPT_SH)) \
 	  $(patsubst %.perl,%,$(SCRIPT_PERL)) \
 	  $(patsubst %.py,%,$(SCRIPT_PYTHON)) \
@@ -1520,6 +1522,28 @@ $(patsubst %.perl,%,$(SCRIPT_PERL)) git-instaweb: % : unimplemented.sh
 	mv $@+ $@
 endif # NO_PERL
 
+ifndef NO_PYTHON
+$(patsubst %.py,%,$(SCRIPT_PYTHON)): % : %.py
+	$(QUIET_GEN)$(RM) $@ $@+ && \
+	INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C git_remote_cvs -s --no-print-directory instlibdir` && \
+	sed -e '1{' \
+	    -e '	s|#!.*python|#!$(PYTHON_PATH_SQ)|' \
+	    -e '}' \
+	    -e 's|^import sys.*|&; sys.path.insert(0, "@@INSTLIBDIR@@")|' \
+	    -e 's|@@INSTLIBDIR@@|'"$$INSTLIBDIR"'|g' \
+	    $@.py >$@+ && \
+	chmod +x $@+ && \
+	mv $@+ $@
+else # NO_PYTHON
+$(patsubst %.py,%,$(SCRIPT_PYTHON)): % : unimplemented.sh
+	$(QUIET_GEN)$(RM) $@ $@+ && \
+	sed -e '1s|#!.*/sh|#!$(SHELL_PATH_SQ)|' \
+	    -e 's|@@REASON@@|NO_PYTHON=$(NO_PYTHON)|g' \
+	    unimplemented.sh >$@+ && \
+	chmod +x $@+ && \
+	mv $@+ $@
+endif # NO_PYTHON
+
 configure: configure.ac
 	$(QUIET_GEN)$(RM) $@ $<+ && \
 	sed -e 's/@@GIT_VERSION@@/$(GIT_VERSION)/g' \
diff --git a/git-remote-cvs.py b/git-remote-cvs.py
new file mode 100755
index 0000000..94b61e7
--- /dev/null
+++ b/git-remote-cvs.py
@@ -0,0 +1,683 @@
+#!/usr/bin/env python
+
+"""Usage: git-remote-cvs <remote> [<url>]
+
+Git remote helper command for interacting with CVS repositories
+
+See git-remote-helpers documentation for details on external interface,
+usage, etc.  See git-remote-cvs documentation for specific
+configuration details of this remote helper.
+
+"""
+
+# PRINCIPLES:
+# -----------
+# - Importing same symbol twice (with no CVS changes in between) should
+#   yield the exact same Git state (and the second import should have
+#   no commits).
+# - Importing several CVS symbols pointing to the same state should
+#   yield corresponding refs pointing to the _same_ commit in Git.
+# - Importing a CVS symbol which has received only "regular commits"
+#   since last import should yield a fast-forward straight line of
+#   commits.
+
+# TODO / KNOWN PROBLEMS:
+# ----------------------
+# - Remove cachedSymbolsOnly config variable for now?
+# - Author map handling; map CVS usernames to Git full name + email
+# - Handle files that have been created AND deleted since last import
+# - How to handle CVS tags vs. branches.  Turn CVS tags into Git tags?
+# - Better CVS branch handling: When a branch as a super/subset of
+#   files/revs compared to another branch, find a way to base one branch
+#   on the other instead of creating parallel lines of development with
+#   roughly the same commits.
+# - Profiling, optimizations...
+
+import sys
+import os
+
+from git_remote_cvs.util import (debug, error, die, ProgressIndicator,
+                                 run_command)
+from git_remote_cvs.cvs import CVSState, CVSWorkDir, fetch_revs
+from git_remote_cvs.git import (get_git_dir, parse_git_config, git_config_bool,
+                                valid_git_ref, GitObjectFetcher, GitRefMap,
+                                GitFICommit, GitFastImport, GitNotes)
+from git_remote_cvs.cvs_symbol_cache import CVSSymbolCache
+from git_remote_cvs.commit_states import CommitStates
+from git_remote_cvs.cvs_revision_map import CVSRevisionMap, CVSStateMap
+from git_remote_cvs.changeset import build_changesets_from_revs
+
+
+class Config(object):
+
+    """Encapsulation of configuration variables."""
+
+    # Author name/email tuple for commits created by this tool
+    Author = ("git remote-cvs", "invalid@example.com")
+    # Git remote name
+    Remote = None
+    # CVS symbols are imported into this refs namespace/directory
+    RefSpace = None
+    # Git notes ref, the refname pointing to our git notes
+    NotesRef = None
+    # CVS repository identifier, a 2-tuple (cvs_root, cvs_module),
+    # where cvs_root is the CVS server/repository URL (as found in
+    # $CVSROOT, or in a CVS/Root file), and cvs_module is the path
+    # to a CVS module relative to the CVS repository (as found in a
+    # CVS/Repository file)
+    CVSRepo = (None, None)
+    # Path to the git-remote-cvs cache/work directory
+    # (normally "info/cvs/$remote" within $GIT_DIR)
+    WorkDir = None
+    # If False, the list of CVS symbols will always be retrieved from
+    # the CVS server using 'cvs rlog'.  If True, only the cached
+    # symbols within the "symbols" subdirectory of WorkDir are
+    # consulted.
+    CachedSymbolsOnly = False
+
+    @classmethod
+    def init (cls, remote):
+        """Fetch and setup configuration for the given remote."""
+        git_config = parse_git_config()
+        assert git_config["remote.%s.vcs" % (remote)] == "cvs"
+
+        cls.Author = (git_config["user.name"], git_config["user.email"])
+        cls.Remote = remote
+        cls.RefSpace = "refs/cvs/%s/" % (remote)
+        cls.NotesRef = "refs/notes/cvs/%s" % (remote)
+        cls.CVSRepo = (git_config["remote.%s.cvsroot" % (remote)],
+                       git_config["remote.%s.cvsmodule" % (remote)])
+        cls.WorkDir = os.path.join(get_git_dir(), "info/cvs", remote)
+        cls.CachedSymbolsOnly = git_config_bool(git_config.get(
+            "remote.%s.cachedsymbolsonly" % (remote), "false"))
+
+
+def work_path (*args):
+    """Return the given path appended to git-remote-cvs's cache/work dir."""
+    return os.path.join(Config.WorkDir, *args)
+
+
+def cvs_to_refname (cvsname):
+    """Return the git ref name for the given CVS symbolic name."""
+    if cvsname.startswith(Config.RefSpace):  # Already converted
+        return cvsname
+    return Config.RefSpace + cvsname
+
+
+def ref_to_cvsname (refname):
+    """Return the CVS symbolic name for the given git ref name."""
+    if refname.startswith(Config.RefSpace):
+        return refname[len(Config.RefSpace):]
+    return refname
+
+
+def valid_cvs_symbol (symbol):
+    """Return True iff the given CVS symbol can be imported into Git."""
+    return valid_git_ref(cvs_to_refname(symbol))
+
+
+def die_usage (msg, *args):
+    """Abort program with a helpful usage message."""
+    # Use this file's docstring as a usage string
+    print >> sys.stderr, __doc__
+    die(msg, *args)
+
+
+def import_cvs_revs (symbol, prev_state, cur_state, progress):
+    """Import the CVS revisions needed to satisfy the given CVS symbol.
+
+    This method will determine the CVS revisions involved in moving
+    from the given prev_state to the given cur_state.  This includes
+    looking at revision metadata in CVS, and importing needed blobs
+    from CVS.
+
+    The revision metadata is returned as a 2-level dict of CVSRev
+    objects: mapping path -> revnum -> CVSRev object.
+
+    """
+    # Calculate the revisions involved in moving from prev_state to
+    # cur_state, and fetch CVSRev objects for these revisions.
+    progress.pushprefix("Importing CVS revisions: ")
+    paths = set(prev_state.paths()).union(cur_state.paths())
+    num_fetched_revs = 0  # Number of CVSRev objects involved
+    num_imported_blobs = 0  # Number of blobs actually imported
+    cvs_revs = {}  # path -> revnum -> CVSRev
+    for i, path in enumerate(sorted(paths)):
+        progress.pushprefix("(%i/%i) %s: " % (i + 1, len(paths), path))
+        progress("")
+        prev_rev = prev_state.get(path)
+        cur_rev = cur_state.get(path)
+        if prev_rev and cur_rev and prev_rev == cur_rev:
+            # No changes since last import
+            progress.popprefix()
+            continue
+
+        # Fetch CVSRev objects for range [path:prev_rev, path:symbol]
+        path_revs = fetch_revs(path, prev_rev, cur_rev, symbol, Config.CVSRepo)
+        if not path_revs:
+            # Failed to find revs between prev_rev and symbol
+            if cur_rev:
+                assert not cur_rev.follows(prev_rev)
+                # The CVS symbol has been moved/reset since the
+                # last import in such a way that we cannot
+                # deduce the history between the last import
+                # and the current import.
+                # FIXME: Can we can work around this?
+                die("CVS symbol %s has been moved/reset from %s:%s to %s:%s "
+                    "since the last import.  This is not supported",
+                    symbol, path, prev_rev, path, cur_rev)
+            else:
+                # CVS symbol has been removed from this path.
+                # We cannot conclusively determine the history
+                # of this path following prev_rev.
+                # FIXME: Can we can work around this?
+                die("CVS symbol %s has been removed from %s since the last "
+                    "import.  This is not supported",
+                    symbol, path)
+
+        # OK.  We've got the revs in range [prev_rev, symbol]
+        # Verify/determine cur_rev
+        real_cur_rev = max(path_revs.keys())
+        if cur_rev:
+            assert cur_rev == real_cur_rev
+        else: cur_rev = real_cur_rev
+        # No need to re-import prev_rev if already imported
+        if prev_rev:
+            assert cur_rev.follows(prev_rev)
+            assert prev_rev in path_revs
+            del path_revs[prev_rev]
+        assert path_revs  # There should be more revs than just prev_rev
+
+        # Sanity checks:
+        # All revs from prev_rev to cur_rev are about to be imported
+        check_rev = cur_rev
+        while check_rev and check_rev != prev_rev:
+            assert check_rev in path_revs
+            check_rev = check_rev.parent()
+        # All previous revs have already been imported
+        check_rev = prev_rev
+        while check_rev:
+            assert Globals.CVSRevisionMap.has_rev(path, check_rev)
+            check_rev = check_rev.parent()
+
+        # Import CVS revisions as Git blobs
+        j = 0
+        for num, rev in sorted(path_revs.iteritems(), reverse = True):
+            j += 1
+            progress("(%i/%i) %s" % (j, len(path_revs), num))
+            assert num == rev.num
+            # Skip if already imported
+            if Globals.CVSRevisionMap.has_rev(rev.path, rev.num):
+                continue
+            # ...or if rev is a deletion
+            elif rev.deleted:
+                continue
+            # Import blob for reals
+            data = Globals.CVSWorkDir.get_revision_data(rev.path, rev.num)
+            Globals.GitFastImport.comment("Importing CVS revision %s:%s" %
+                                          (rev.path, rev.num))
+            mark = Globals.GitFastImport.blob(data)
+            Globals.CVSRevisionMap.add_blob(rev.path, rev.num, mark)
+            num_imported_blobs += 1
+
+        # Add path_revs to the overall structure of revs to be imported
+        assert path not in cvs_revs
+        cvs_revs[path] = path_revs
+        num_fetched_revs += len(path_revs)
+        progress.popprefix()
+    progress.popprefix()
+    progress("Imported %i blobs (reused %i existing blobs)" %
+             (num_imported_blobs, num_fetched_revs - num_imported_blobs), True)
+
+    return cvs_revs
+
+
+def advance_state (state, changeset):
+    """Advance the given state by applying the given changeset."""
+    # Verify that the given changeset "fits" on top of the given state
+    for rev in changeset:
+        prev_num = rev.num.parent()
+        state_num = state.get(rev.path)
+        if prev_num is None and state_num is None:
+            # rev is the first revision of this path being added
+            state.add(rev.path, rev.num)
+        elif prev_num and state_num and prev_num == state_num:
+            if rev.deleted:  # rev deletes path from state
+                state.remove(rev.path, prev_num)
+            else:  # rev follows state's revision of this path
+                state.replace(rev.path, rev.num)
+        else:
+            error("Cannot apply changeset with %s:%s on top of CVS state "
+                  "with %s:%s.",
+                  rev.path, changeset[rev.path].num,
+                  rev.path, state.get(rev.path))
+            error("    changeset: %s", changeset)
+            error("    CVS state: \n---\n%s---", state)
+            die("Failed to apply changeset.  Aborting.")
+
+
+def revert_state (state, changeset):
+    """Revert the given state to before the given changeset is applied.
+
+    This is the reverse of the above advance_state() function.
+
+    """
+    for rev in changeset:
+        prev_num = rev.num.parent()
+        state_num = state.get(rev.path)
+        if state_num is None:  # Revert deletion of file
+            assert rev.deleted
+            state.add(rev.path, prev_num)
+        else:
+            assert state_num == rev.num
+            if prev_num is None:  # Revert addition of file
+                state.remove(rev.path, rev.num)
+            else:  # Regular revert to previous version
+                state.replace(rev.path, prev_num)
+
+
+def import_changesets (ref, changesets, from_state, to_state, progress):
+    """Apply the given list of Changeset objects to the given ref.
+
+    Also verify that the changesets bring us from the given from_state
+    to the given to_state.
+
+    """
+    state = from_state
+    for i, c in enumerate(changesets):
+        advance_state(state, c)
+        progress("(%i/%i) Committing %s" % (i + 1, len(changesets), c))
+        # Make a git commit from changeset c
+        commitdata = GitFICommit(
+            c.author, c.author + "@example.com",  # TODO: author_map handling
+            c.date.ts, c.date.tz_str(),
+            "".join(["%s\n" % (line) for line in c.message]))
+
+        for rev in c:
+            p, n = rev.path, rev.num
+            if rev.deleted:
+                commitdata.delete(p)
+                continue
+            blobname = Globals.CVSRevisionMap.get_blob(p, n)
+            mode = Globals.CVSRevisionMap.get_mode(p)
+            if mode is None:  # Must retrieve mode from CVS checkout
+                debug("Retrieving mode info for '%s'" % (p))
+                Globals.CVSWorkDir.update(n, [p])
+                mode = Globals.CVSWorkDir.get_modeinfo([p])[p]
+                Globals.CVSRevisionMap.add_path(p, mode)
+            commitdata.modify(mode, blobname, p)
+
+        commitname = Globals.GitFastImport.commit(ref, commitdata)
+        Globals.CommitStates.add(commitname, state, Globals.GitFastImport)
+        for path, revnum in state:
+            Globals.CVSRevisionMap.add_commit(path, revnum, commitname)
+        assert commitname in Globals.CVSStateMap.get_commits(state)
+    assert state == to_state
+    return len(changesets)
+
+
+def import_cvs_symbol (cvs_symbol, progress):
+    """Import the given CVS symbol from CVS to Git.
+
+    Return False if nothing was imported, True otherwise.
+
+    """
+    git_ref = cvs_to_refname(cvs_symbol)
+    progress.pushprefix("%s: " % (cvs_symbol))
+
+    # Verify that we are asked to import valid git ref names
+    if not valid_git_ref(git_ref):
+        progress("Invalid git ref '%s'.  Skipping." % (git_ref), True)
+        progress.popprefix()
+        return False
+
+    # Retrieve previously imported CVS state
+    progress("Loading previously imported state of %s..." % (git_ref))
+    prev_commit = Globals.GitRefMap.get(git_ref)
+    prev_state = Globals.CommitStates.get(prev_commit, CVSState())
+
+    # Retrieve current CVS state of symbol
+    # Also: At some point we will need mode information for all CVS
+    # paths (stored in CVSRevisionMap).  This information can be added
+    # for each path on demand (using CVSWorkDir.get_modeinfo()), but
+    # doing so may be an expensive process.  It is much cheaper to load
+    # mode information for as many paths as possible in a _single_
+    # operation.  We do this below, by calling
+    # CVSRevisionMap.sync_modeinfo_from_cvs() in appropriate places.
+    if Config.CachedSymbolsOnly:
+        progress("Synchronizing local CVS symbol cache for symbol...")
+        # The symbol cache is likely not up-to-date.  Synchronize the
+        # given CVS symbol explicitly, to make sure we get the version
+        # current with the CVS server.
+        Globals.CVSSymbolCache.sync_symbol(cvs_symbol, Globals.CVSWorkDir,
+                                           progress)
+        # The above method updates the CVS workdir to the current CVS
+        # version.  Hence, now is a convenient time to preload mode
+        # info from the currently checked-out CVS files.  There may be
+        # more files for which we'll need mode information, but we'll
+        # deal with those when needed.
+        progress("Updating path mode info from current CVS checkout.")
+        Globals.CVSRevisionMap.sync_modeinfo_from_cvs(Globals.CVSWorkDir)
+    elif not Globals.CVSRevisionMap:  # No info for any paths, yet
+        # Pure optimization: We didn't get to preload all the mode info
+        # above.  Normally, the only alternative is load mode info for
+        # each path on-demand.  However, if our CVSRevisionMap is
+        # currently empty, that's probably going to be very expensive.
+        # Therefore, in this case, do an explicit CVS update here, and
+        # preload mode info for all paths.
+        progress("Updating CVS checkout to sync path mode info.")
+        Globals.CVSWorkDir.update(cvs_symbol)
+        Globals.CVSRevisionMap.sync_modeinfo_from_cvs(Globals.CVSWorkDir)
+
+    progress("Loading current CVS state...")
+    try:
+        cur_state = Globals.CVSSymbolCache[cvs_symbol]
+    except KeyError:
+        progress("Couldn't find symbol '%s'.  Skipping." % (cvs_symbol), True)
+        progress.popprefix()
+        return False
+
+    # Optimization: Check if the previous import of this symbol is
+    # still up-to-date.  If so, there's nothing more to be done.
+    progress("Checking if we're already up-to-date...")
+    if cur_state == prev_state:
+        progress("Already up-to-date.  Skipping.", True)
+        progress.popprefix()
+        return False
+
+    progress("Fetching CVS revisions...")
+    cvs_revs = import_cvs_revs(cvs_symbol, prev_state, cur_state, progress)
+    progress("Organizing revisions into chronological list of changesets...")
+    changesets = build_changesets_from_revs(cvs_revs)
+
+    # When importing a new branch, try to optimize branch start point,
+    # instead of importing entire branch from scratch
+    if prev_commit is None:
+        progress("Finding startpoint for new symbol...")
+        i = len(changesets)
+        state = cur_state.copy()
+        for c in reversed(changesets):
+            commit = Globals.CVSStateMap.get_exact_commit(state,
+                                                          Globals.CommitStates)
+            if commit is not None:
+                # We have found a commit that exactly matches the state
+                # after commit #i (changesets[i - 1])
+                Globals.GitFastImport.reset(git_ref, commit)
+                changesets = changesets[i:]
+                break
+            revert_state(state, c)
+            i -= 1
+    num_changesets = len(changesets)
+    num_applied = 0
+
+    # Apply changesets, bringing git_ref from prev_state to cur_state
+    if num_changesets:
+        progress("Importing changesets...")
+        num_applied = import_changesets(git_ref, changesets, prev_state,
+                                        cur_state, progress)
+    progress("Imported %i changesets (reused %i existing changesets)" %
+             (num_applied, num_changesets - num_applied), True)
+    progress.popprefix()
+    return True
+
+
+def do_import (*args):
+    """Do the 'import' command; import refs from a remote."""
+    if not args:
+        die_usage("'import' takes at least one parameter: ref...")
+
+    progress = ProgressIndicator("    ", sys.stderr)
+    cvs_symbols = map(ref_to_cvsname, args)
+    empty_import = True
+    for symbol in cvs_symbols:
+        if import_cvs_symbol(symbol, progress):
+            empty_import = False
+    if empty_import:
+        progress.finish("Everything up-to-date", True)
+        return 0
+    progress.finish("Finished importing %i CVS symbols to Git" %
+                    (len(cvs_symbols)), True)
+    return 0
+
+
+def do_list (*args):
+    """Do the 'list' command; list refs available from a CVS remote."""
+    if args:
+        die_usage("'list' takes no parameters")
+
+    progress = ProgressIndicator("    ", sys.stderr)
+    if Config.CachedSymbolsOnly:
+        progress("Listing symbols in local symbol cache...", True)
+        for symbol in sorted(Globals.CVSSymbolCache):
+            print cvs_to_refname(symbol)
+        progress.finish()
+        print  # Terminate output with a blank line
+        return 0
+
+    # Synchronize local symbol cache with CVS server
+    progress("Synchronizing local symbol cache with CVS server...")
+    Globals.CVSSymbolCache.sync_all_symbols(Config.CVSRepo, progress,
+                                            valid_cvs_symbol)
+    # Load current states of Git refs
+    progress("Loading current state of Git refs...")
+    changed, unchanged = 0, 0
+    for cvs_symbol, cvs_state in sorted(Globals.CVSSymbolCache.items()):
+        git_ref = cvs_to_refname(cvs_symbol)
+        progress("\tChecking if Git ref is up-to-date: %s" % (git_ref))
+        git_commit = Globals.GitRefMap.get(git_ref)
+        git_state = Globals.CommitStates.get(git_commit)
+        attrs = ""
+        if git_state and git_state == cvs_state:
+            attrs = " unchanged"
+            unchanged += 1
+        else:
+            git_commit = "?"
+            changed += 1
+        print "%s %s%s" % (git_commit, git_ref, attrs)
+    progress.finish("Found %i CVS symbols (%i changed, %i unchanged)" %
+                    (changed + unchanged, changed, unchanged))
+    print  # Terminate with a blank line
+    return 0
+
+
+def do_capabilities (*args):
+    """Do the 'capabilities' command; report supported features."""
+    if args:
+        die_usage("'capabilities' takes no parameters")
+    print "import"
+    print "marks %s" % (work_path("marks"))
+#   print "export"
+    print  # Terminate with a blank line
+    return 0
+
+
+def do_addsymbol (*args):
+    """Do the 'addsymbol' command; add given CVS symbol to local cache."""
+    if len(args) != 1:
+        die_usage("'addsymbol' takes one parameter: symbol")
+    symbol = args[0]
+    progress = ProgressIndicator("    ", sys.stderr)
+    if valid_cvs_symbol(symbol):
+        Globals.CVSSymbolCache.sync_symbol(symbol, Globals.CVSWorkDir,
+                                           progress)
+        progress.finish("Added '%s' to CVS symbol cache" % (symbol), True)
+    else:
+        error("Skipping CVS symbol '%s'; it is not a valid git ref", symbol)
+    print  # Terminate with a blank line
+    return 0
+
+
+def do_syncsymbols (*args):
+    """Do the 'syncsymbols' command; sync all symbols with CVS server."""
+    if args:
+        die_usage("'syncsymbols' takes no parameters")
+    progress = ProgressIndicator("    ", sys.stderr)
+    Globals.CVSSymbolCache.sync_all_symbols(Config.CVSRepo, progress,
+                                            valid_cvs_symbol)
+    progress.finish()
+    print  # Terminate with a blank line
+    return 0
+
+
+def do_verify (*args):
+    """Do the 'verify' command; Compare CVS checkout and Git tree."""
+    if len(args) != 1:
+        die_usage("'verify' takes one parameter: symbol")
+    symbol = args[0]
+    gitref = cvs_to_refname(symbol)
+    assert valid_git_ref(gitref)
+
+    progress = ProgressIndicator("    ", sys.stderr)
+    progress("Checking out '%s' from CVS..." % (symbol))
+    Globals.CVSWorkDir.update(symbol)
+
+    add_env = {"GIT_INDEX_FILE": os.path.abspath(work_path("temp_index"))}
+    progress("Creating Git index from tree object @ '%s'..." % (gitref))
+    cmd = ("git", "read-tree", gitref)
+    assert run_command(cmd, add_env = add_env)[0] == 0
+
+    progress("Comparing CVS checkout to Git index...", True)
+    cmd = ("git", "--work-tree=%s" % (os.path.abspath(work_path("cvs"))),
+           "ls-files", "--exclude=CVS", "--deleted", "--modified", "--others",
+           "-t")
+    exit_code, output, errors = run_command(cmd, add_env = add_env)
+    assert exit_code == 0 and not errors
+
+    if output:
+        progress.finish("Failed verification of '%s'" % (symbol), True)
+        error("The '%s' command returned:\n---\n%s---", " ".join(cmd), output)
+    else:
+        progress.finish("Successfully verified '%s'" % (symbol), True)
+    print  # Terminate with a blank line
+    return exit_code
+
+
+def not_implemented (*args):
+    """Abort, while informing user that this command is not yet implemented."""
+    die_usage("Command not implemented")
+
+
+COMMANDS = {
+    "capabilities": do_capabilities,
+    "list": do_list,
+    # Special handling of 'import' in main()
+    # "import": do_import,
+    "export": not_implemented,
+    # Custom commands
+    "addsymbol": do_addsymbol,
+    "syncsymbols": do_syncsymbols,
+    "verify": do_verify,
+}
+
+
+class Globals(object):
+
+    """Global variables are placed here at the start of main()."""
+
+    pass
+
+
+def main (*args):
+    """Main program logic; execution starts here."""
+    debug("Invoked '%s'", " ".join(args))
+
+    # Initialization of subsystems
+    assert len(args) >= 2
+    # Read config for the given remote
+    Config.init(args[1])
+    # Local CVS symbol cache (CVS symbol -> CVS state mapping)
+    Globals.CVSSymbolCache = CVSSymbolCache(work_path("symbols"))
+    # Local CVS checkout
+    Globals.CVSWorkDir = CVSWorkDir(work_path("cvs"), Config.CVSRepo)
+    # Interface to 'git cat-file --batch'
+    Globals.GitObjectFetcher = GitObjectFetcher()
+    # Interface to Git object notes
+    Globals.GitNotes = GitNotes(Config.NotesRef, Globals.GitObjectFetcher)
+    # Mapping from Git commit objects to CVS states
+    Globals.CommitStates = CommitStates(Globals.GitNotes)
+    # Mapping from Git ref names to Git object names
+    Globals.GitRefMap = GitRefMap(Globals.GitObjectFetcher)
+    # Mapping from CVS revision to Git blob and commit objects
+    Globals.CVSRevisionMap = CVSRevisionMap(cvs_to_refname("_metadata"),
+                                            Globals.GitObjectFetcher)
+    last_mark = 0
+    if Globals.CVSRevisionMap.has_unresolved_marks():
+        # Update with marks from last import
+        last_mark = Globals.CVSRevisionMap.load_marks_file(work_path("marks"))
+    else:
+        # Truncate marks file.  We cannot automatically do this after
+        # .load_marks_file() above, since we cannot yet guarantee that
+        # we will be able to save the revision map persistently.  (That
+        # can only happen if we are given one or more import commands
+        # below.) We can only truncate this file when we know there are
+        # no unresolved marks in the revision map.
+        open(work_path("marks"), "w").close()
+    # Mapping from CVS states to commit objects that contain said state
+    Globals.CVSStateMap = CVSStateMap(Globals.CVSRevisionMap)
+
+    # Main program loop
+    import_refs = []  # Accumulate import commands here
+    # Cannot use "for line in sys.stdin" for buffering (?) reasons
+    line = sys.stdin.readline()
+    while (line):
+        cmdline = line.strip().split()
+        if not cmdline:
+            break  # Blank line means we're about to quit
+        debug("Got command '%s'", " ".join(cmdline))
+        cmd = cmdline.pop(0)
+        if cmd == "import":
+            import_refs.extend(cmdline)
+        else:
+            if cmd not in COMMANDS:
+                die_usage("Unknown command '%s'", cmd)
+            if COMMANDS[cmd](*cmdline):
+                die("Command '%s' failed", line.strip())
+            sys.stdout.flush()
+        line = sys.stdin.readline()
+
+    # Trigger import processing after last import command
+    ret = 0
+    if import_refs:
+        # Init producer of output in the git-fast-import format
+        Globals.GitFastImport = GitFastImport(
+            sys.stdout, Globals.GitObjectFetcher, last_mark)
+
+        # Perform import of given refs
+        ret = do_import(*import_refs)
+
+        # Notes on persistent storage of subsystems' data structures:
+        #
+        # Because the "import" command has been called, we here _know_
+        # that there is a fast-import process running in parallel.
+        # (This is NOT the case when there are no "import" commands).
+        # We can therefore now (and only now) safely commit the extra
+        # information that we store in the Git repo.
+        # In other words, the data structures that we commit to
+        # persistent storage with the following calls will NOT be
+        # committed if there are no "import" commands.  The data
+        # structures must handle this in one of two ways:
+        # - In the no-"import" scenario, there is simply nothing to
+        #   commit, so it can safely be skipped.
+        # - Any information that should have been committed in the
+        #   no-"import" scenario can be reconstructed repeatedly in
+        #   subsequent executions of this program, until the next
+        #   invocation of an "import" command provides an opportunity
+        #   to commit the data structure to persistent storage.
+
+        # Write out commit notes (mapping git commits to CVSStates)
+        # The following call would be a no-op in the no-"import" case
+        Globals.GitNotes.commit_notes(Globals.GitFastImport, Config.Author,
+            'Annotate commits imported by "git remote-cvs"\n')
+
+        # Save CVS revision metadata
+        # This data structure can handle the no-"import" case as long
+        # as the marks file from the last fast-import run is still
+        # present upon the next execution of this program.
+        Globals.CVSRevisionMap.commit_map(Globals.GitFastImport, Config.Author,
+            'Updated metadata used by "git remote-cvs"\n')
+
+    return ret
+
+
+if __name__ == '__main__':
+    sys.exit(main(*sys.argv))
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 15/19] Add simple selftests of git-remote-cvs functionality
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (13 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 14/19] git-remote-cvs: Remote helper program for CVS repositories Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 16/19] Fix the Makefile-generated path to the git_remote_cvs package in git-remote-cvs Sverre Rabbelier
                   ` (3 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johan Herland

From: Johan Herland <johan@herland.net>

Add two new selftests:

- t9800-remote-cvs-basic: Test the git-remote-cvs implementation of the
  remote helper API, by verifying the expected output of the git-remote-cvs
  program when invoking remote helper API commands while doing some simple
  CVS operations.

- t9801-remote-cvs-fetch: A more high-level test of the fetch/import-side
  of the git-remote-cvs implementation, by verifying the expected repository
  state after doing "git fetch" from a CVS remote where a variety of CVS
  operations are being performed.

Signed-off-by: Johan Herland <johan@herland.net>
---

	Unchanged.

 t/t9800-remote-cvs-basic.sh |  524 +++++++++++++++++++++++++++++++++++++++++++
 t/t9801-remote-cvs-fetch.sh |  291 ++++++++++++++++++++++++
 2 files changed, 815 insertions(+), 0 deletions(-)
 create mode 100755 t/t9800-remote-cvs-basic.sh
 create mode 100755 t/t9801-remote-cvs-fetch.sh

diff --git a/t/t9800-remote-cvs-basic.sh b/t/t9800-remote-cvs-basic.sh
new file mode 100755
index 0000000..6ddec17
--- /dev/null
+++ b/t/t9800-remote-cvs-basic.sh
@@ -0,0 +1,524 @@
+#!/bin/sh
+
+test_description='git remote-cvs basic tests'
+. ./test-lib.sh
+
+if ! test_have_prereq PYTHON; then
+	say 'skipping CVS foreign-vcs helper tests, python not available'
+	test_done
+fi
+
+CVS_EXEC=cvs
+CVS_OPTS="-f -q"
+CVS="$CVS_EXEC $CVS_OPTS"
+
+CVSROOT=$(pwd)/cvsroot
+export CVSROOT
+unset CVS_SERVER
+
+CVSMODULE=cvsmodule
+GITREMOTE=cvsremote
+
+if ! type $CVS_EXEC >/dev/null 2>&1
+then
+	say 'skipping remote-cvs tests, $CVS_EXEC not found'
+	test_done
+fi
+
+test_expect_success 'setup cvsroot' '$CVS init'
+
+test_expect_success '#1: setup a cvs module' '
+
+	mkdir "$CVSROOT/$CVSMODULE" &&
+	$CVS co -d module-cvs $CVSMODULE &&
+	(
+		cd module-cvs &&
+		cat <<EOF >o_fortuna &&
+O Fortuna
+velut luna
+statu variabilis,
+
+semper crescis
+aut decrescis;
+vita detestabilis
+
+nunc obdurat
+et tunc curat
+ludo mentis aciem,
+
+egestatem,
+potestatem
+dissolvit ut glaciem.
+EOF
+		$CVS add o_fortuna &&
+		cat <<EOF >message &&
+add "O Fortuna" lyrics
+
+These public domain lyrics make an excellent sample text.
+EOF
+		$CVS commit -f -F message o_fortuna
+	)
+'
+
+test_expect_success 'set up CVS repo as a foreign remote' '
+
+	git config "user.name" "Test User"
+	git config "user.email" "test@example.com"
+	git config "remote.$GITREMOTE.vcs" cvs
+	git config "remote.$GITREMOTE.cvsRoot" "$CVSROOT"
+	git config "remote.$GITREMOTE.cvsModule" "$CVSMODULE"
+	git config "remote.$GITREMOTE.fetch" \
+		"+refs/cvs/$GITREMOTE/*:refs/remotes/$GITREMOTE/*"
+
+'
+
+test_expect_success '#1: git-remote-cvs "capabilities" command' '
+
+	echo "capabilities" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+import
+marks .git/info/cvs/$GITREMOTE/marks
+
+EOF
+	test_cmp expect actual
+
+'
+
+test_expect_success '#1: git-remote-cvs "list" command' '
+
+	echo "list" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+? refs/cvs/$GITREMOTE/HEAD
+
+EOF
+	test_cmp expect actual
+
+'
+
+test_expect_success '#1: git-remote-cvs "import" command' '
+
+	echo "import refs/cvs/$GITREMOTE/HEAD" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+# Importing CVS revision o_fortuna:1.1
+blob
+mark :1
+data 180
+O Fortuna
+velut luna
+statu variabilis,
+
+semper crescis
+aut decrescis;
+vita detestabilis
+
+nunc obdurat
+et tunc curat
+ludo mentis aciem,
+
+egestatem,
+potestatem
+dissolvit ut glaciem.
+
+commit refs/cvs/$GITREMOTE/HEAD
+mark :2
+data 82
+add "O Fortuna" lyrics
+
+These public domain lyrics make an excellent sample text.
+
+M 644 :1 o_fortuna
+
+# Importing note for object 2
+blob
+mark :3
+data 14
+o_fortuna:1.1
+
+commit refs/notes/cvs/$GITREMOTE
+mark :4
+data 46
+Annotate commits imported by "git remote-cvs"
+
+N :3 :2
+
+blob
+mark :5
+data 32
+1 o_fortuna:1.1
+2 o_fortuna:1.1
+
+blob
+mark :6
+data 16
+blob 1
+commit 2
+
+commit refs/cvs/$GITREMOTE/_metadata
+mark :7
+data 42
+Updated metadata used by "git remote-cvs"
+
+M 644 :5 CVS/marks
+M 644 :6 o_fortuna/1.1
+
+EOF
+	grep -v "^committer " actual > actual.filtered &&
+	test_cmp expect actual.filtered
+
+'
+
+test_expect_success '#1: Passing git-remote-cvs output to git-fast-import' '
+
+	git fast-import --quiet \
+		--export-marks=".git/info/cvs/$GITREMOTE/marks" \
+		< actual &&
+	git gc
+
+'
+
+test_expect_success '#1: Verifying correctness of import' '
+
+	echo "verify HEAD" | git remote-cvs "$GITREMOTE"
+
+'
+
+test_expect_success '#2: update cvs module' '
+
+	(
+		cd module-cvs &&
+		cat <<EOF >o_fortuna &&
+O Fortune,
+like the moon
+you are changeable,
+
+ever waxing
+and waning;
+hateful life
+
+first oppresses
+and then soothes
+as fancy takes it;
+
+poverty
+and power
+it melts them like ice.
+EOF
+		cat <<EOF >message &&
+translate to English
+
+My Latin is terrible.
+EOF
+		$CVS commit -f -F message o_fortuna
+	)
+'
+
+test_expect_success '#2: git-remote-cvs "capabilities" command' '
+
+	echo "capabilities" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+import
+marks .git/info/cvs/$GITREMOTE/marks
+
+EOF
+	test_cmp expect actual
+
+'
+
+test_expect_success '#2: git-remote-cvs "list" command' '
+
+	echo "list" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+? refs/cvs/$GITREMOTE/HEAD
+
+EOF
+	test_cmp expect actual
+
+'
+
+test_expect_success '#2: git-remote-cvs "import" command' '
+
+	echo "import refs/cvs/$GITREMOTE/HEAD" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+# Importing CVS revision o_fortuna:1.2
+blob
+mark :8
+data 179
+O Fortune,
+like the moon
+you are changeable,
+
+ever waxing
+and waning;
+hateful life
+
+first oppresses
+and then soothes
+as fancy takes it;
+
+poverty
+and power
+it melts them like ice.
+
+commit refs/cvs/$GITREMOTE/HEAD
+mark :9
+data 44
+translate to English
+
+My Latin is terrible.
+
+from refs/cvs/$GITREMOTE/HEAD^0
+M 644 :8 o_fortuna
+
+# Importing note for object 9
+blob
+mark :10
+data 14
+o_fortuna:1.2
+
+commit refs/notes/cvs/$GITREMOTE
+mark :11
+data 46
+Annotate commits imported by "git remote-cvs"
+
+from refs/notes/cvs/$GITREMOTE^0
+N :10 :9
+
+blob
+mark :12
+data 32
+8 o_fortuna:1.2
+9 o_fortuna:1.2
+
+blob
+mark :13
+data 94
+
+blob
+mark :14
+data 16
+blob 8
+commit 9
+
+commit refs/cvs/$GITREMOTE/_metadata
+mark :15
+data 42
+Updated metadata used by "git remote-cvs"
+
+from refs/cvs/$GITREMOTE/_metadata^0
+M 644 :12 CVS/marks
+M 644 :13 o_fortuna/1.1
+M 644 :14 o_fortuna/1.2
+
+EOF
+	grep -v -e "^committer " -e "\b[0-9a-f]\{40\}\b" actual > actual.filtered &&
+	test_cmp expect actual.filtered
+
+'
+
+test_expect_success '#2: Passing git-remote-cvs output to git-fast-import' '
+
+	git fast-import --quiet \
+		--import-marks=".git/info/cvs/$GITREMOTE/marks" \
+		--export-marks=".git/info/cvs/$GITREMOTE/marks" \
+		< actual &&
+	git gc
+
+'
+
+test_expect_success '#2: Verifying correctness of import' '
+
+	echo "verify HEAD" | git remote-cvs "$GITREMOTE"
+
+'
+
+test_expect_success '#3: update cvs module' '
+
+	(
+		cd module-cvs &&
+		echo 1 >tick &&
+		$CVS add tick &&
+		$CVS commit -f -m 1 tick
+	)
+
+'
+
+test_expect_success '#3: git-remote-cvs "capabilities" command' '
+
+	echo "capabilities" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+import
+marks .git/info/cvs/$GITREMOTE/marks
+
+EOF
+	test_cmp expect actual
+
+'
+
+test_expect_success '#3: git-remote-cvs "list" command' '
+
+	echo "list" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+? refs/cvs/$GITREMOTE/HEAD
+
+EOF
+	test_cmp expect actual
+
+'
+
+test_expect_success '#3: git-remote-cvs "import" command' '
+
+	echo "import refs/cvs/$GITREMOTE/HEAD" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+# Importing CVS revision tick:1.1
+blob
+mark :16
+data 2
+1
+
+commit refs/cvs/$GITREMOTE/HEAD
+mark :17
+data 2
+1
+
+from refs/cvs/$GITREMOTE/HEAD^0
+M 644 :16 tick
+
+# Importing note for object 17
+blob
+mark :18
+data 23
+o_fortuna:1.2
+tick:1.1
+
+commit refs/notes/cvs/$GITREMOTE
+mark :19
+data 46
+Annotate commits imported by "git remote-cvs"
+
+from refs/notes/cvs/$GITREMOTE^0
+N :18 :17
+
+blob
+mark :20
+data 41
+16 tick:1.1
+17 tick:1.1
+17 o_fortuna:1.2
+
+blob
+mark :21
+data 104
+commit 17
+
+blob
+mark :22
+data 18
+blob 16
+commit 17
+
+commit refs/cvs/$GITREMOTE/_metadata
+mark :23
+data 42
+Updated metadata used by "git remote-cvs"
+
+from refs/cvs/$GITREMOTE/_metadata^0
+M 644 :20 CVS/marks
+M 644 :21 o_fortuna/1.2
+M 644 :22 tick/1.1
+
+EOF
+	grep -v -e "^committer " -e "\b[0-9a-f]\{40\}\b" actual > actual.filtered &&
+	test_cmp expect actual.filtered
+
+'
+
+test_expect_success '#3: Passing git-remote-cvs output to git-fast-import' '
+
+	git fast-import --quiet \
+		--import-marks=".git/info/cvs/$GITREMOTE/marks" \
+		--export-marks=".git/info/cvs/$GITREMOTE/marks" \
+		< actual &&
+	git gc
+
+'
+
+test_expect_success '#3: Verifying correctness of import' '
+
+	echo "verify HEAD" | git remote-cvs "$GITREMOTE"
+
+'
+
+test_expect_success '#4: git-remote-cvs "capabilities" command' '
+
+	echo "capabilities" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+import
+marks .git/info/cvs/$GITREMOTE/marks
+
+EOF
+	test_cmp expect actual
+
+'
+
+commit=$(git rev-parse "refs/cvs/$GITREMOTE/HEAD")
+
+test_expect_success '#4: git-remote-cvs "list" command' '
+
+	echo "list" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+$commit refs/cvs/$GITREMOTE/HEAD unchanged
+
+EOF
+	test_cmp expect actual
+
+'
+
+test_expect_success '#4: git-remote-cvs "import" command' '
+
+	echo "import refs/cvs/$GITREMOTE/HEAD" | git remote-cvs "$GITREMOTE" > actual &&
+	cat <<EOF >expect &&
+blob
+mark :24
+data 0
+
+blob
+mark :25
+data 142
+
+blob
+mark :26
+data 94
+
+commit refs/cvs/$GITREMOTE/_metadata
+mark :27
+data 42
+Updated metadata used by "git remote-cvs"
+
+from refs/cvs/$GITREMOTE/_metadata^0
+M 644 :24 CVS/marks
+M 644 :25 o_fortuna/1.2
+M 644 :26 tick/1.1
+
+EOF
+	grep -v -e "^committer " -e "\b[0-9a-f]\{40\}\b" actual > actual.filtered &&
+	test_cmp expect actual.filtered
+
+'
+
+test_expect_success '#4: Passing git-remote-cvs output to git-fast-import' '
+
+	git fast-import --quiet \
+		--import-marks=".git/info/cvs/$GITREMOTE/marks" \
+		--export-marks=".git/info/cvs/$GITREMOTE/marks" \
+		< actual &&
+	git gc
+
+'
+
+test_expect_success '#4: Verifying correctness of import' '
+
+	echo "verify HEAD" | git remote-cvs "$GITREMOTE"
+
+'
+
+test_done
diff --git a/t/t9801-remote-cvs-fetch.sh b/t/t9801-remote-cvs-fetch.sh
new file mode 100755
index 0000000..93d44a7
--- /dev/null
+++ b/t/t9801-remote-cvs-fetch.sh
@@ -0,0 +1,291 @@
+#!/bin/sh
+
+test_description='git remote-cvs basic tests'
+. ./test-lib.sh
+
+if ! test_have_prereq PYTHON; then
+	say 'skipping CVS foreign-vcs helper tests, python not available'
+	test_done
+fi
+
+CVS_EXEC=cvs
+CVS_OPTS="-f -q"
+CVS="$CVS_EXEC $CVS_OPTS"
+
+CVSROOT=$(pwd)/cvsroot
+export CVSROOT
+unset CVS_SERVER
+
+CVSMODULE=cvsmodule
+GITREMOTE=cvsremote
+
+if ! type $CVS_EXEC >/dev/null 2>&1
+then
+	say 'skipping remote-cvs tests, $CVS_EXEC not found'
+	test_done
+fi
+
+verify () {
+	git log --reverse --format="--- %T%n%s%n%n%b" "$GITREMOTE/$1" >actual &&
+	test_cmp "expect.$1" actual &&
+	echo "verify $1" | git remote-cvs "$GITREMOTE"
+}
+
+test_expect_success 'setup CVS repo' '$CVS init'
+
+test_expect_success 'create CVS module with initial commit' '
+
+	mkdir "$CVSROOT/$CVSMODULE" &&
+	$CVS co -d module-cvs $CVSMODULE &&
+	(
+		cd module-cvs &&
+		cat <<EOF >o_fortuna &&
+O Fortuna
+velut luna
+statu variabilis,
+
+semper crescis
+aut decrescis;
+vita detestabilis
+
+nunc obdurat
+et tunc curat
+ludo mentis aciem,
+
+egestatem,
+potestatem
+dissolvit ut glaciem.
+EOF
+		$CVS add o_fortuna &&
+		cat <<EOF >message &&
+add "O Fortuna" lyrics
+
+These public domain lyrics make an excellent sample text.
+EOF
+		$CVS commit -f -F message o_fortuna
+	)
+'
+
+test_expect_success 'set up CVS repo/module as a foreign remote' '
+
+	git config "user.name" "Test User"
+	git config "user.email" "test@example.com"
+	git config "remote.$GITREMOTE.vcs" cvs
+	git config "remote.$GITREMOTE.cvsRoot" "$CVSROOT"
+	git config "remote.$GITREMOTE.cvsModule" "$CVSMODULE"
+	git config "remote.$GITREMOTE.fetch" \
+		"+refs/cvs/$GITREMOTE/*:refs/remotes/$GITREMOTE/*"
+
+'
+
+test_expect_success 'initial fetch from CVS remote' '
+
+	cat <<EOF >expect.HEAD &&
+--- 0e06d780dedab23e683c686fb041daa9a84c936c
+add "O Fortuna" lyrics
+
+These public domain lyrics make an excellent sample text.
+
+EOF
+	git fetch "$GITREMOTE" &&
+	verify HEAD
+
+'
+
+test_expect_success 'CVS commit' '
+
+	(
+		cd module-cvs &&
+		cat <<EOF >o_fortuna &&
+O Fortune,
+like the moon
+you are changeable,
+
+ever waxing
+and waning;
+hateful life
+
+first oppresses
+and then soothes
+as fancy takes it;
+
+poverty
+and power
+it melts them like ice.
+EOF
+		cat <<EOF >message &&
+translate to English
+
+My Latin is terrible.
+EOF
+		$CVS commit -f -F message o_fortuna
+	) &&
+	cat <<EOF >>expect.HEAD &&
+--- daa87269a5e00388135ad9542dc16ab6754466e5
+translate to English
+
+My Latin is terrible.
+
+EOF
+	git fetch "$GITREMOTE" &&
+	verify HEAD
+
+'
+
+test_expect_success 'CVS commit with new file' '
+
+	(
+		cd module-cvs &&
+		echo 1 >tick &&
+		$CVS add tick &&
+		$CVS commit -f -m 1 tick
+	) &&
+	cat <<EOF >>expect.HEAD &&
+--- 486935b4fccecea9b64cbed3a797ebbcbe2b7461
+1
+
+
+EOF
+	git fetch "$GITREMOTE" &&
+	verify HEAD
+
+'
+
+test_expect_success 'fetch without CVS changes' '
+
+	git fetch "$GITREMOTE" &&
+	verify HEAD
+
+'
+
+test_expect_success 'add 2 CVS commits' '
+
+	(
+		cd module-cvs &&
+		echo 2 >tick &&
+		$CVS commit -f -m 2 tick &&
+		echo 3 >tick &&
+		$CVS commit -f -m 3 tick
+	) &&
+	cat <<EOF >>expect.HEAD &&
+--- 83437ab3e57bf0a42915de5310e3419792b5a36f
+2
+
+
+--- 60fc50406a82dc6bd32dc6e5f7bd23e4c3cdf7ef
+3
+
+
+EOF
+	git fetch "$GITREMOTE" &&
+	verify HEAD
+
+'
+
+test_expect_success 'CVS commit with removed file' '
+
+	(
+		cd module-cvs &&
+		$CVS remove -f tick &&
+		$CVS commit -f -m "remove file" tick
+	) &&
+	cat <<EOF >>expect.HEAD &&
+--- daa87269a5e00388135ad9542dc16ab6754466e5
+remove file
+
+
+EOF
+	git fetch "$GITREMOTE" &&
+	verify HEAD
+
+'
+
+test_expect_success 'CVS commit with several new files' '
+
+	(
+		cd module-cvs &&
+		echo spam >spam &&
+		echo sausage >sausage &&
+		echo eggs >eggs &&
+		$CVS add spam sausage eggs &&
+		$CVS commit -f -m "spam, sausage, and eggs" spam sausage eggs
+	) &&
+	cat <<EOF >>expect.HEAD &&
+--- 3190dfce44a6d5e9916b4870dbf8f37d1ca4ddaf
+spam, sausage, and eggs
+
+
+EOF
+	git fetch "$GITREMOTE" &&
+	verify HEAD
+
+'
+
+test_expect_success 'new CVS branch' '
+
+	(
+		cd module-cvs &&
+		$CVS tag -b foo
+	) &&
+	cp expect.HEAD expect.foo &&
+	git fetch "$GITREMOTE" &&
+	verify HEAD &&
+	verify foo
+
+'
+
+test_expect_success 'CVS commit on branch' '
+
+	(
+		cd module-cvs &&
+		$CVS up -r foo &&
+		echo "spam spam spam" >spam &&
+		$CVS commit -f -m "commit on branch foo" spam
+	) &&
+	cat <<EOF >>expect.foo &&
+--- 1aba123e5c83898ce3a8b976cc6064d60246aef4
+commit on branch foo
+
+
+EOF
+	git fetch "$GITREMOTE" &&
+	verify HEAD &&
+	verify foo
+
+'
+
+test_expect_success 'create CVS tag' '
+
+	(
+		cd module-cvs &&
+		$CVS tag bar
+	) &&
+	cp expect.foo expect.bar &&
+	git fetch "$GITREMOTE" &&
+	verify HEAD &&
+	verify foo &&
+	verify bar
+
+'
+
+test_expect_success 'another CVS commit on branch' '
+
+	(
+		cd module-cvs &&
+		echo "spam spam spam spam spam spam" >> spam &&
+		$CVS commit -f -m "another commit on branch foo" spam
+	) &&
+	cat <<EOF >>expect.foo &&
+--- 15a2635e76e8e5a5a8746021643de317452f2340
+another commit on branch foo
+
+
+EOF
+	git fetch "$GITREMOTE" &&
+	verify HEAD &&
+	verify foo &&
+	verify bar
+
+'
+
+test_done
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 16/19] Fix the Makefile-generated path to the git_remote_cvs package in git-remote-cvs
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (14 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 15/19] Add simple selftests of git-remote-cvs functionality Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 17/19] More fixes to the git-remote-cvs installation procedure Sverre Rabbelier
                   ` (2 subsequent siblings)
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johan Herland

From: Johan Herland <johan@herland.net>

Junio discovered that the installed git-remote-cvs executable is unable
to find the installed git_remote_cvs Python package. This is due to the
'instlibdir' target of git_remote_cvs/Makefile being invoked with
insufficient arguments (prefix and DESTDIR). This patch fixes that.

Signed-off-by: Johan Herland <johan@herland.net>
---

	Unchanged.

 Makefile                |    2 +-
 git_remote_cvs/Makefile |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index a63f44d..9fb2747 100644
--- a/Makefile
+++ b/Makefile
@@ -1525,7 +1525,7 @@ endif # NO_PERL
 ifndef NO_PYTHON
 $(patsubst %.py,%,$(SCRIPT_PYTHON)): % : %.py
 	$(QUIET_GEN)$(RM) $@ $@+ && \
-	INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C git_remote_cvs -s --no-print-directory instlibdir` && \
+	INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C git_remote_cvs -s --no-print-directory prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' instlibdir` && \
 	sed -e '1{' \
 	    -e '	s|#!.*python|#!$(PYTHON_PATH_SQ)|' \
 	    -e '}' \
diff --git a/git_remote_cvs/Makefile b/git_remote_cvs/Makefile
index 061c247..0d9eb31 100644
--- a/git_remote_cvs/Makefile
+++ b/git_remote_cvs/Makefile
@@ -28,7 +28,7 @@ install: $(pysetupfile)
 	$(PYTHON_PATH) $(pysetupfile) install --prefix $(DESTDIR_SQ)$(prefix)
 
 instlibdir: $(pysetupfile)
-	@echo "$(prefix)/$(PYLIBDIR)"
+	@echo "$(DESTDIR_SQ)$(prefix)/$(PYLIBDIR)"
 
 clean:
 	$(QUIET)$(PYTHON_PATH) $(pysetupfile) $(QUIETSETUP) clean -a
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 17/19] More fixes to the git-remote-cvs installation procedure
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (15 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 16/19] Fix the Makefile-generated path to the git_remote_cvs package in git-remote-cvs Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 18/19] Refactor git_remote_cvs to a more generic git_remote_helpers Sverre Rabbelier
  2009-10-29 18:01 ` [PATCH 19/19] .gitignore: add git-remote-cvs Sverre Rabbelier
  18 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Johan Herland

From: Johan Herland <johan@herland.net>

- Makefile: Make sure git-remote-cvs is rebuilt when 'prefix' changes
  (by adding a dependency on GIT-CFLAGS). This prevents a regular 'make'
  followed by a 'make prefix=/somewhere/else install' from installing a
  non-working git-remote-cvs.

- Makefile: When mangling git-remote-cvs to add the git_remote_cvs install
  location to the Python search path, _replace_ the initial 'current dir'
  entry in sys.path (instead of merely prepending the install location).
  Hence, if the git_remote_cvs package is not installed at the correct
  location (and also not present in any of Python's default package dirs),
  then git-remote-cvs will fail loudly instead of silently falling back on
  the git_remote_cvs subdir in git.git.

- Allow for the git_remote_cvs install path to be overridden by the
  $GITPYTHONLIB environment variable.

- t/test-lib.sh: Set $GITPYTHONLIB (unless $GIT_TEST_INSTALLED is enabled)
  so that we test the currently built version of git_remote_cvs  (the one
  that is built in git_remote_cvs/build/lib) instead of a previously
  installed version.

- Another minor check and a line length fix.

Found-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
---

	Unchanged.

 Makefile      |   11 +++++++++--
 t/test-lib.sh |    9 +++++++++
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 9fb2747..6d37399 100644
--- a/Makefile
+++ b/Makefile
@@ -1523,13 +1523,20 @@ $(patsubst %.perl,%,$(SCRIPT_PERL)) git-instaweb: % : unimplemented.sh
 endif # NO_PERL
 
 ifndef NO_PYTHON
+$(patsubst %.py,%,$(SCRIPT_PYTHON)): GIT-CFLAGS
 $(patsubst %.py,%,$(SCRIPT_PYTHON)): % : %.py
 	$(QUIET_GEN)$(RM) $@ $@+ && \
-	INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C git_remote_cvs -s --no-print-directory prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' instlibdir` && \
+	INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C git_remote_cvs -s \
+		--no-print-directory prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' \
+		instlibdir` && \
 	sed -e '1{' \
 	    -e '	s|#!.*python|#!$(PYTHON_PATH_SQ)|' \
 	    -e '}' \
-	    -e 's|^import sys.*|&; sys.path.insert(0, "@@INSTLIBDIR@@")|' \
+	    -e 's|^import sys.*|&; \\\
+	           import os; \\\
+	           sys.path[0] = os.environ.has_key("GITPYTHONLIB") and \\\
+	                         os.environ["GITPYTHONLIB"] or \\\
+	                         "@@INSTLIBDIR@@"|' \
 	    -e 's|@@INSTLIBDIR@@|'"$$INSTLIBDIR"'|g' \
 	    $@.py >$@+ && \
 	chmod +x $@+ && \
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0b991db..77ad23e 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -638,6 +638,15 @@ test -d ../templates/blt || {
 	error "You haven't built things yet, have you?"
 }
 
+if test -z "$GIT_TEST_INSTALLED"
+then
+	GITPYTHONLIB="$(pwd)/../git_remote_cvs/build/lib"
+	export GITPYTHONLIB
+	test -d ../git_remote_cvs/build || {
+		error "You haven't built git_remote_cvs yet, have you?"
+	}
+fi
+
 if ! test -x ../test-chmtime; then
 	echo >&2 'You need to build test-chmtime:'
 	echo >&2 'Run "make test-chmtime" in the source (toplevel) directory'
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 18/19] Refactor git_remote_cvs to a more generic git_remote_helpers
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (16 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 17/19] More fixes to the git-remote-cvs installation procedure Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-30  8:42   ` Johan Herland
  2009-10-29 18:01 ` [PATCH 19/19] .gitignore: add git-remote-cvs Sverre Rabbelier
  18 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Sverre Rabbelier

This in an effort to allow future remote helpers written in python to
re-use the non-cvs-specific code.

Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---

	This time with -M -C :).

 Makefile                                           |    8 +++---
 git-remote-cvs.py                                  |   14 +++++-----
 git_remote_cvs/__init__.py                         |   27 --------------------
 git_remote_cvs/setup.py                            |   17 ------------
 {git_remote_cvs => git_remote_helpers}/.gitignore  |    0
 {git_remote_cvs => git_remote_helpers}/Makefile    |    2 +-
 git_remote_helpers/__init__.py                     |   27 ++++++++++++++++++++
 .../cvs}/changeset.py                              |    2 +-
 .../cvs}/commit_states.py                          |    4 +-
 {git_remote_cvs => git_remote_helpers/cvs}/cvs.py  |    6 ++--
 .../cvs/revision_map.py                            |    6 ++--
 .../cvs/symbol_cache.py                            |    4 +-
 {git_remote_cvs => git_remote_helpers/git}/git.py  |    4 +-
 git_remote_helpers/setup.py                        |   17 ++++++++++++
 {git_remote_cvs => git_remote_helpers}/util.py     |    0
 t/test-lib.sh                                      |    4 +-
 16 files changed, 71 insertions(+), 71 deletions(-)
 delete mode 100644 git_remote_cvs/__init__.py
 delete mode 100644 git_remote_cvs/setup.py
 rename {git_remote_cvs => git_remote_helpers}/.gitignore (100%)
 rename {git_remote_cvs => git_remote_helpers}/Makefile (92%)
 create mode 100644 git_remote_helpers/__init__.py
 create mode 100644 git_remote_helpers/cvs/__init__.py
 rename {git_remote_cvs => git_remote_helpers/cvs}/changeset.py (98%)
 rename {git_remote_cvs => git_remote_helpers/cvs}/commit_states.py (95%)
 rename {git_remote_cvs => git_remote_helpers/cvs}/cvs.py (99%)
 rename git_remote_cvs/cvs_revision_map.py => git_remote_helpers/cvs/revision_map.py (98%)
 rename git_remote_cvs/cvs_symbol_cache.py => git_remote_helpers/cvs/symbol_cache.py (98%)
 create mode 100644 git_remote_helpers/git/__init__.py
 rename {git_remote_cvs => git_remote_helpers/git}/git.py (99%)
 create mode 100644 git_remote_helpers/setup.py
 rename {git_remote_cvs => git_remote_helpers}/util.py (100%)

diff --git a/Makefile b/Makefile
index 6d37399..70b8195 100644
--- a/Makefile
+++ b/Makefile
@@ -1402,7 +1402,7 @@ ifndef NO_PERL
 	$(QUIET_SUBDIR0)perl $(QUIET_SUBDIR1) PERL_PATH='$(PERL_PATH_SQ)' prefix='$(prefix_SQ)' all
 endif
 ifndef NO_PYTHON
-	$(QUIET_SUBDIR0)git_remote_cvs $(QUIET_SUBDIR1) PYTHON_PATH='$(PYTHON_PATH_SQ)' prefix='$(prefix_SQ)' all
+	$(QUIET_SUBDIR0)git_remote_helpers $(QUIET_SUBDIR1) PYTHON_PATH='$(PYTHON_PATH_SQ)' prefix='$(prefix_SQ)' all
 endif
 	$(QUIET_SUBDIR0)templates $(QUIET_SUBDIR1)
 
@@ -1526,7 +1526,7 @@ ifndef NO_PYTHON
 $(patsubst %.py,%,$(SCRIPT_PYTHON)): GIT-CFLAGS
 $(patsubst %.py,%,$(SCRIPT_PYTHON)): % : %.py
 	$(QUIET_GEN)$(RM) $@ $@+ && \
-	INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C git_remote_cvs -s \
+	INSTLIBDIR=`MAKEFLAGS= $(MAKE) -C git_remote_helpers -s \
 		--no-print-directory prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' \
 		instlibdir` && \
 	sed -e '1{' \
@@ -1776,7 +1776,7 @@ ifndef NO_PERL
 	$(MAKE) -C perl prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' install
 endif
 ifndef NO_PYTHON
-	$(MAKE) -C git_remote_cvs prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' install
+	$(MAKE) -C git_remote_helpers prefix='$(prefix_SQ)' DESTDIR='$(DESTDIR_SQ)' install
 endif
 ifndef NO_TCLTK
 	$(MAKE) -C gitk-git install
@@ -1893,7 +1893,7 @@ ifndef NO_PERL
 	$(MAKE) -C perl clean
 endif
 ifndef NO_PYTHON
-	$(MAKE) -C git_remote_cvs clean
+	$(MAKE) -C git_remote_helpers clean
 endif
 	$(MAKE) -C templates/ clean
 	$(MAKE) -C t/ clean
diff --git a/git-remote-cvs.py b/git-remote-cvs.py
index 94b61e7..f322f96 100755
--- a/git-remote-cvs.py
+++ b/git-remote-cvs.py
@@ -36,16 +36,16 @@ configuration details of this remote helper.
 import sys
 import os
 
-from git_remote_cvs.util import (debug, error, die, ProgressIndicator,
+from git_remote_helpers.util import (debug, error, die, ProgressIndicator,
                                  run_command)
-from git_remote_cvs.cvs import CVSState, CVSWorkDir, fetch_revs
-from git_remote_cvs.git import (get_git_dir, parse_git_config, git_config_bool,
+from git_remote_helpers.cvs.cvs import CVSState, CVSWorkDir, fetch_revs
+from git_remote_helpers.git.git import (get_git_dir, parse_git_config, git_config_bool,
                                 valid_git_ref, GitObjectFetcher, GitRefMap,
                                 GitFICommit, GitFastImport, GitNotes)
-from git_remote_cvs.cvs_symbol_cache import CVSSymbolCache
-from git_remote_cvs.commit_states import CommitStates
-from git_remote_cvs.cvs_revision_map import CVSRevisionMap, CVSStateMap
-from git_remote_cvs.changeset import build_changesets_from_revs
+from git_remote_helpers.cvs.symbol_cache import CVSSymbolCache
+from git_remote_helpers.cvs.commit_states import CommitStates
+from git_remote_helpers.cvs.revision_map import CVSRevisionMap, CVSStateMap
+from git_remote_helpers.cvs.changeset import build_changesets_from_revs
 
 
 class Config(object):
diff --git a/git_remote_cvs/__init__.py b/git_remote_cvs/__init__.py
deleted file mode 100644
index 4956d2d..0000000
--- a/git_remote_cvs/__init__.py
+++ /dev/null
@@ -1,27 +0,0 @@
-#!/usr/bin/env python
-
-"""Support library package for git-remote-cvs.
-
-git-remote-cvs is a Git remote helper command that interfaces with a
-CVS repository to provide automatic import of CVS history into a Git
-repository.
-
-This package provides the support library needed by git-remote-cvs.
-The following modules are included:
-
-- cvs - Interaction with CVS repositories
-
-- cvs_symbol_cache - Local CVS symbol cache
-
-- changeset - Collect individual CVS revisions into commits
-
-- git - Interaction with Git repositories
-
-- commit_states - Map Git commits to CVS states
-
-- cvs_revision_map - Map CVS revisions to various metainformation
-
-- util - General utility functionality use by the other modules in
-         this package, and also used directly by git-remote-cvs.
-
-"""
diff --git a/git_remote_cvs/setup.py b/git_remote_cvs/setup.py
deleted file mode 100644
index 21f521a..0000000
--- a/git_remote_cvs/setup.py
+++ /dev/null
@@ -1,17 +0,0 @@
-#!/usr/bin/env python
-
-"""Distutils build/install script for the git_remote_cvs package."""
-
-from distutils.core import setup
-
-setup(
-    name = 'git_remote_cvs',
-    version = '0.1.0',
-    description = 'Git remote helper program for CVS repositories',
-    license = 'GPLv2',
-    author = 'The Git Community',
-    author_email = 'git@vger.kernel.org',
-    url = 'http://www.git-scm.com/',
-    package_dir = {'git_remote_cvs': ''},
-    packages = ['git_remote_cvs'],
-)
diff --git a/git_remote_cvs/.gitignore b/git_remote_helpers/.gitignore
similarity index 100%
rename from git_remote_cvs/.gitignore
rename to git_remote_helpers/.gitignore
diff --git a/git_remote_cvs/Makefile b/git_remote_helpers/Makefile
similarity index 92%
rename from git_remote_cvs/Makefile
rename to git_remote_helpers/Makefile
index 0d9eb31..c62dfd0 100644
--- a/git_remote_cvs/Makefile
+++ b/git_remote_helpers/Makefile
@@ -1,5 +1,5 @@
 #
-# Makefile for the git_remote_cvs python support modules
+# Makefile for the git_remote_helpers python support modules
 #
 pysetupfile:=setup.py
 
diff --git a/git_remote_helpers/__init__.py b/git_remote_helpers/__init__.py
new file mode 100644
index 0000000..38c7b5f
--- /dev/null
+++ b/git_remote_helpers/__init__.py
@@ -0,0 +1,27 @@
+#!/usr/bin/env python
+
+"""Support library package for git remote helpers.
+
+Git remote helpers are helper commands that interfaces with a non-git
+repository to provide automatic import of non-git history into a Git
+repository.
+
+This package provides the support library needed by these helpers..
+The following modules are included:
+
+- cvs/cvs - Interaction with CVS repositories
+
+- cvs/symbol_cache - Local CVS symbol cache
+
+- cvs/changeset - Collect individual CVS revisions into commits
+
+- cvs/commit_states - Map Git commits to CVS states
+
+- cvs/revision_map - Map CVS revisions to various metainformation
+
+- git/git - Interaction with Git repositories
+
+- util - General utility functionality use by the other modules in
+         this package, and also used directly by git-remote-cvs.
+
+"""
diff --git a/git_remote_helpers/cvs/__init__.py b/git_remote_helpers/cvs/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/git_remote_cvs/changeset.py b/git_remote_helpers/cvs/changeset.py
similarity index 98%
rename from git_remote_cvs/changeset.py
rename to git_remote_helpers/cvs/changeset.py
index 9eea9d2..4865c37 100644
--- a/git_remote_cvs/changeset.py
+++ b/git_remote_helpers/cvs/changeset.py
@@ -11,7 +11,7 @@ module tries to reconstruct this notion of related revisions.
 
 """
 
-from git_remote_cvs.util import debug, error, die
+from git_remote_helpers.util import debug, error, die
 
 
 class Changeset(object):
diff --git a/git_remote_cvs/commit_states.py b/git_remote_helpers/cvs/commit_states.py
similarity index 95%
rename from git_remote_cvs/commit_states.py
rename to git_remote_helpers/cvs/commit_states.py
index 2e0af6c..a796bb1 100644
--- a/git_remote_cvs/commit_states.py
+++ b/git_remote_helpers/cvs/commit_states.py
@@ -2,8 +2,8 @@
 
 """Code for relating Git commits to corresponding CVSState objects."""
 
-from git_remote_cvs.util import debug, error, die
-from git_remote_cvs.cvs import CVSState
+from git_remote_helpers.util import debug, error, die
+from git_remote_helpers.cvs.cvs import CVSState
 
 
 class CommitStates(object):
diff --git a/git_remote_cvs/cvs.py b/git_remote_helpers/cvs/cvs.py
similarity index 99%
rename from git_remote_cvs/cvs.py
rename to git_remote_helpers/cvs/cvs.py
index f870ae0..a1a02be 100644
--- a/git_remote_cvs/cvs.py
+++ b/git_remote_helpers/cvs/cvs.py
@@ -17,9 +17,9 @@ import time
 from calendar import timegm
 import unittest
 
-from git_remote_cvs.util import (debug, error, die, ProgressIndicator,
-                                 start_command, run_command,
-                                 file_reader_method, file_writer_method)
+from git_remote_helpers.util import (debug, error, die, ProgressIndicator,
+                                     start_command, run_command,
+                                     file_reader_method, file_writer_method)
 
 
 class CVSNum(object):
diff --git a/git_remote_cvs/cvs_revision_map.py b/git_remote_helpers/cvs/revision_map.py
similarity index 98%
rename from git_remote_cvs/cvs_revision_map.py
rename to git_remote_helpers/cvs/revision_map.py
index 0e65ba6..b7b17bc 100644
--- a/git_remote_cvs/cvs_revision_map.py
+++ b/git_remote_helpers/cvs/revision_map.py
@@ -18,9 +18,9 @@ CVSStateMap - provides a mapping from CVS states to corresponding
 
 import os
 
-from git_remote_cvs.util import debug, error, die, file_reader_method
-from git_remote_cvs.cvs import CVSNum, CVSDate
-from git_remote_cvs.git import GitFICommit
+from git_remote_helpers.util import debug, error, die, file_reader_method
+from git_remote_helpers.cvs.cvs import CVSNum, CVSDate
+from git_remote_helpers.git.git import GitFICommit
 
 
 class _CVSPathInfo(object):
diff --git a/git_remote_cvs/cvs_symbol_cache.py b/git_remote_helpers/cvs/symbol_cache.py
similarity index 98%
rename from git_remote_cvs/cvs_symbol_cache.py
rename to git_remote_helpers/cvs/symbol_cache.py
index cc8d88b..6bd1715 100644
--- a/git_remote_cvs/cvs_symbol_cache.py
+++ b/git_remote_helpers/cvs/symbol_cache.py
@@ -23,8 +23,8 @@ synchronizing _all_ CVS symbols in one operation (by executing
 import sys
 import os
 
-from git_remote_cvs.util import debug, error, die, ProgressIndicator
-from git_remote_cvs.cvs import CVSNum, CVSState, CVSLogParser
+from git_remote_helpers.util import debug, error, die, ProgressIndicator
+from git_remote_helpers.cvs.cvs import CVSNum, CVSState, CVSLogParser
 
 
 class CVSSymbolStateLister(CVSLogParser):
diff --git a/git_remote_helpers/git/__init__.py b/git_remote_helpers/git/__init__.py
new file mode 100644
index 0000000..e69de29
diff --git a/git_remote_cvs/git.py b/git_remote_helpers/git/git.py
similarity index 99%
rename from git_remote_cvs/git.py
rename to git_remote_helpers/git/git.py
index cf037e3..8e1bc77 100644
--- a/git_remote_cvs/git.py
+++ b/git_remote_helpers/git/git.py
@@ -12,8 +12,8 @@ from binascii import hexlify
 from cStringIO import StringIO
 import unittest
 
-from git_remote_cvs.util import debug, error, die, start_command, run_command
-from git_remote_cvs.cvs import CVSDate
+from git_remote_helpers.util import debug, error, die, start_command, run_command
+from git_remote_helpers.cvs.cvs import CVSDate
 
 
 def get_git_dir ():
diff --git a/git_remote_helpers/setup.py b/git_remote_helpers/setup.py
new file mode 100644
index 0000000..327f0ff
--- /dev/null
+++ b/git_remote_helpers/setup.py
@@ -0,0 +1,17 @@
+#!/usr/bin/env python
+
+"""Distutils build/install script for the git_remote_helpers package."""
+
+from distutils.core import setup
+
+setup(
+    name = 'git_remote_helpers',
+    version = '0.1.0',
+    description = 'Git remote helper program for non-git repositories',
+    license = 'GPLv2',
+    author = 'The Git Community',
+    author_email = 'git@vger.kernel.org',
+    url = 'http://www.git-scm.com/',
+    package_dir = {'git_remote_helpers': ''},
+    packages = ['git_remote_helpers', 'git_remote_helpers.git', 'git_remote_helpers.cvs'],
+)
diff --git a/git_remote_cvs/util.py b/git_remote_helpers/util.py
similarity index 100%
rename from git_remote_cvs/util.py
rename to git_remote_helpers/util.py
diff --git a/t/test-lib.sh b/t/test-lib.sh
index 77ad23e..c7530aa 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -640,9 +640,9 @@ test -d ../templates/blt || {
 
 if test -z "$GIT_TEST_INSTALLED"
 then
-	GITPYTHONLIB="$(pwd)/../git_remote_cvs/build/lib"
+	GITPYTHONLIB="$(pwd)/../git_remote_helpers/build/lib"
 	export GITPYTHONLIB
-	test -d ../git_remote_cvs/build || {
+	test -d ../git_remote_helpers/build || {
 		error "You haven't built git_remote_cvs yet, have you?"
 	}
 fi
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH 19/19] .gitignore: add git-remote-cvs
  2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
                   ` (17 preceding siblings ...)
  2009-10-29 18:01 ` [PATCH 18/19] Refactor git_remote_cvs to a more generic git_remote_helpers Sverre Rabbelier
@ 2009-10-29 18:01 ` Sverre Rabbelier
  2009-10-29 18:05   ` Shawn O. Pearce
  18 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:01 UTC (permalink / raw)
  To: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland
  Cc: Sverre Rabbelier

Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
---
 .gitignore |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/.gitignore b/.gitignore
index 51a37b1..461f6ee 100644
--- a/.gitignore
+++ b/.gitignore
@@ -105,6 +105,7 @@ git-reflog
 git-relink
 git-remote
 git-remote-curl
+git-remote-cvs
 git-repack
 git-replace
 git-repo-config
-- 
1.6.5.2.291.gf76a3

^ permalink raw reply related	[flat|nested] 43+ messages in thread

* Re: [PATCH 19/19] .gitignore: add git-remote-cvs
  2009-10-29 18:01 ` [PATCH 19/19] .gitignore: add git-remote-cvs Sverre Rabbelier
@ 2009-10-29 18:05   ` Shawn O. Pearce
  2009-10-29 18:08     ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Shawn O. Pearce @ 2009-10-29 18:05 UTC (permalink / raw)
  To: Sverre Rabbelier
  Cc: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland

Sverre Rabbelier <srabbelier@gmail.com> wrote:
> diff --git a/.gitignore b/.gitignore
> index 51a37b1..461f6ee 100644
> --- a/.gitignore
> +++ b/.gitignore
> @@ -105,6 +105,7 @@ git-reflog
>  git-relink
>  git-remote
>  git-remote-curl
> +git-remote-cvs
>  git-repack
>  git-replace
>  git-repo-config

Uh, squash to "[PATCH 14/19] git-remote-cvs: Remote help..." ?

-- 
Shawn.

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 19/19] .gitignore: add git-remote-cvs
  2009-10-29 18:05   ` Shawn O. Pearce
@ 2009-10-29 18:08     ` Sverre Rabbelier
  0 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-29 18:08 UTC (permalink / raw)
  To: Shawn O. Pearce
  Cc: Git List, Johannes Schindelin, Daniel Barkalow, Johan Herland

Heya,

On Thu, Oct 29, 2009 at 11:05, Shawn O. Pearce <spearce@spearce.org> wrote:
> Uh, squash to "[PATCH 14/19] git-remote-cvs: Remote help..." ?

Durr, my bad, done locally, not sending another version yet though :P.

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/19] Allow fetch to modify refs
  2009-10-29 18:01 ` [PATCH 02/19] Allow fetch to modify refs Sverre Rabbelier
@ 2009-10-30  5:56   ` Daniel Barkalow
  2009-10-30 12:22     ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Barkalow @ 2009-10-30  5:56 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Johan Herland

On Thu, 29 Oct 2009, Sverre Rabbelier wrote:

> From: Daniel Barkalow <barkalow@iabervon.org>
> 
> This allows the transport to use the null sha1 for a ref reported to
> be present in the remote repository to indicate that a ref exists but
> its actual value is presently unknown and will be set if the objects
> are fetched.
> 
> Also adds documentation to the API to specify exactly what the methods
> should do and how they should interpret arguments.
> 
> Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
> ---
> 
> 	Signed off by me because I rebased it on master and fixed the
> 	conflicts with nico's patch.
> 
>  builtin-clone.c    |    5 +++--
>  transport-helper.c |    4 ++--
>  transport.c        |   13 +++++++------
>  transport.h        |   41 +++++++++++++++++++++++++++++++++++++++--
>  4 files changed, 51 insertions(+), 12 deletions(-)
> 
> diff --git a/builtin-clone.c b/builtin-clone.c
> index 5762a6f..0042bee 100644
> --- a/builtin-clone.c
> +++ b/builtin-clone.c
> @@ -526,8 +526,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  
>  		refs = transport_get_remote_refs(transport);
>  		if (refs) {
> -			mapped_refs = wanted_peer_refs(refs, refspec);
> -			transport_fetch_refs(transport, mapped_refs);
> +			struct ref *ref_cpy = wanted_peer_refs(refs, refspec);
> +			mapped_refs = ref_cpy;
> +			transport_fetch_refs(transport, ref_cpy);

Just drop this hunk; the change doesn't actually do anything. Otherwise, 
the patch matches what I have.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [UNSTABLE PATCH 03/19] Allow programs to not depend on remotes having urls
  2009-10-29 18:01 ` [UNSTABLE PATCH 03/19] Allow programs to not depend on remotes having urls Sverre Rabbelier
@ 2009-10-30  6:02   ` Daniel Barkalow
  2009-10-30 12:24     ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Barkalow @ 2009-10-30  6:02 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Johan Herland

On Thu, 29 Oct 2009, Sverre Rabbelier wrote:

> From: Daniel Barkalow <barkalow@iabervon.org>
> 
> For fetch and ls-remote, which use the first url of a remote, have
> transport_get() determine this by passing a remote and passing NULL
> for the url. For push, which uses every url of a remote, use each url
> in turn if there are any, and use NULL if there are none.
> 
> This will allow the transport code to do something different if the
> location is not specified with a url.
> 
> Also, have the message for a fetch say "foreign" if there is no url.
> 
> Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
> ---
> 
> 	I rebased this on master and had major conflicts with the
> 	recent 'advice' series, Daniel, please have a look at this to
> 	see whether it is sane at all.

Yup, this is right. Perhaps I should try to get the refactor in 
builtin-push to use push_with_options() without any behavior change into 
master ahead of the rest of this series, to avoid having to deal with this 
sort of conflict every time someone touches this section of code, which 
has happened several times recently now. Anyway, you correctly understood 
the intended change here.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH 06/19] Factor ref updating out of fetch_with_import
  2009-10-29 18:01 ` [RFC PATCH 06/19] Factor ref updating out of fetch_with_import Sverre Rabbelier
@ 2009-10-30  7:10   ` Daniel Barkalow
  2009-10-30 12:57     ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Barkalow @ 2009-10-30  7:10 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Johan Herland

On Thu, 29 Oct 2009, Sverre Rabbelier wrote:

> Also allow the new update_refs to actually update the refs set, this
> way the remote helper can set the value of previously unknown refs.
> 
> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
> ---
> 
> 	Daniel, if we can get wanted_peer_refs to keep HEAD as a
> 	wanted ref somehow this patch could be a lot simpler.

Actually, I think the problem is that builtin-clone will always default to 
setting up a refspec of "refs/heads/*:refs/remotes/origin/*", but that 
doesn't actually make any sense if the source repository isn't presented 
as having refs names like "refs/heads/*". The immediate problem that you 
don't get HEAD because it's a symref to something outside the pattern is 
somewhat secondary to the general problem that you don't get anything at 
all, because everything's outside the pattern.

I don't really think that presenting the real refs in refs/<vcs>/*, and 
having the list give symrefs to them from refs/heads/* and refs/tags/* 
really makes sense; I think it would be better to rework fetch_with_import 
and the list provided by a foreign vcs helper such that it can present 
refs with normal names (refs/heads/*, refs/tags/*, etc) if the foreign vcs 
has standards that correspond to branches and tags. Then "clone" would 
work normally.

Or, perhaps, there should be some other way of allowing git users to take 
advantage of foreign vcs standards; I don't have a good perspective on 
this, since I only presently use git and a foreign vcs without any useful
standards. But I think it should be such that the transport user sees as 
close as possible to a native git layout, and clone continues to rely on 
the layout being normal, instead of presenting a weird layout mapped into 
a normal layout or something like that and working around transport users 
not expecting this.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/19] Allow helpers to request marks for fast-import
  2009-10-29 18:01 ` [PATCH 10/19] Allow helpers to request marks for fast-import Sverre Rabbelier
@ 2009-10-30  8:21   ` Johan Herland
  2009-10-30 12:26     ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Johan Herland @ 2009-10-30  8:21 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Daniel Barkalow

On Thursday 29 October 2009, Sverre Rabbelier wrote:
> From: Johan Herland <johan@herland.net>
> 
> The 'marks' capability is reported by the remote helper if it requires
> the fast-import marks database to loaded/saved by any git-fast-import
> process that is provided by the transport machinery. The feature is
> advertised along with exactly one argument: the location of the file
> containing the marks database.
> 
> Signed-off-by: Johan Herland <johan@herland.net>
> Signed-off-by: Daniel Barkalow <barkalow@iabervon.org>
> ---
> 
> 	Unchanged.

Please drop this patch from the series. The functionality is not needed, 
since we'll use the fast-import "option" command from the sr/gfi-options 
series instead.


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/19] 1/2: Add Python support library for CVS remote helper
  2009-10-29 18:01 ` [PATCH 12/19] 1/2: Add Python support library for CVS remote helper Sverre Rabbelier
@ 2009-10-30  8:33   ` Johan Herland
  2009-10-30 12:27     ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Johan Herland @ 2009-10-30  8:33 UTC (permalink / raw)
  To: Sverre Rabbelier
  Cc: Git List, Johannes Schindelin, Daniel Barkalow, David Aguilar

On Thursday 29 October 2009, Sverre Rabbelier wrote:
> From: Johan Herland <johan@herland.net>
> 
> This patch introduces parts of a Python package called "git_remote_cvs"
> containing the building blocks of the CVS remote helper.
> The CVS remote helper itself is NOT part of this patch.
> 
> This patch has been improved by the following contributions:
> - David Aguilar: Lots of Python coding style fixes
> 
> Cc: David Aguilar <davvid@gmail.com>
> Signed-off-by: Johan Herland <johan@herland.net>
> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
> ---
> 
> 	This has my patch to util.py squashed in.

Why? Or: why that one, and not the others? Also, you might want to mention 
your contribution in the commit message itself.

> diff --git a/git_remote_cvs/util.py b/git_remote_cvs/util.py
> new file mode 100644
> index 0000000..d3ca487
> --- /dev/null
> +++ b/git_remote_cvs/util.py
> @@ -0,0 +1,194 @@
[snip]
> +
> +def notify(msg, *args):
> +	"""Print a message to stderr."""
> +	print >> sys.stderr, msg % args
> +
> +def debug (msg, *args):
> +    """Print a debug message to stderr when DEBUG is enabled."""
> +    if DEBUG:
> +        print >> sys.stderr, msg % args
> +
> +def error (msg, *args):
> +    """Print an error message to stderr."""
> +    print >> sys.stderr, "ERROR:", msg % args
> +
> +def warn(msg, *args):
> +	"""Print a warning message to stderr."""
> +	print >> sys.stderr, "warning:", msg % args
> +
> +def die (msg, *args):
> +    """Print as error message to stderr and exit the program."""
> +    error(msg, *args)
> +    sys.exit(1)
> +
> +

It seems the two functions you add (notify() and warn()) have a different 
indentation than the existing code (which uses 4 spaces). Please fix.

(When I first introduced these Python patches, there was a discussion on the 
differences in indentation/style between the Git C code, and Python code, 
and it was decided to follow the Python conventions, to make the code more 
inviting to the Python community.)


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 18/19] Refactor git_remote_cvs to a more generic git_remote_helpers
  2009-10-29 18:01 ` [PATCH 18/19] Refactor git_remote_cvs to a more generic git_remote_helpers Sverre Rabbelier
@ 2009-10-30  8:42   ` Johan Herland
  2009-10-30 12:29     ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Johan Herland @ 2009-10-30  8:42 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Daniel Barkalow

On Thursday 29 October 2009, Sverre Rabbelier wrote:
> This in an effort to allow future remote helpers written in python to
> re-use the non-cvs-specific code.
> 
> Signed-off-by: Sverre Rabbelier <srabbelier@gmail.com>
> ---

[snip]

> diff --git a/git_remote_helpers/__init__.py
>  b/git_remote_helpers/__init__.py new file mode 100644
> index 0000000..38c7b5f
> --- /dev/null
> +++ b/git_remote_helpers/__init__.py
> @@ -0,0 +1,27 @@
> +#!/usr/bin/env python
> +
> +"""Support library package for git remote helpers.
> +
> +Git remote helpers are helper commands that interfaces with a non-git
> +repository to provide automatic import of non-git history into a Git
> +repository.
> +
> +This package provides the support library needed by these helpers..
> +The following modules are included:
> +
> +- cvs/cvs - Interaction with CVS repositories
> +
> +- cvs/symbol_cache - Local CVS symbol cache
> +
> +- cvs/changeset - Collect individual CVS revisions into commits
> +
> +- cvs/commit_states - Map Git commits to CVS states
> +
> +- cvs/revision_map - Map CVS revisions to various metainformation
> +
> +- git/git - Interaction with Git repositories

Since this is Python documentation within a package, I'd rather refer to the 
python modules as _modules_ and not files. I.e. please use '.' instead of 
'/':

+- cvs.cvs - Interaction with CVS repositories
+
+- cvs.symbol_cache - Local CVS symbol cache
+
+- cvs.changeset - Collect individual CVS revisions into commits
+
+- cvs.commit_states - Map Git commits to CVS states
+
+- cvs.revision_map - Map CVS revisions to various metainformation
+
+- git.git - Interaction with Git repositories


> +
> +- util - General utility functionality use by the other modules in
> +         this package, and also used directly by git-remote-cvs.

Probably you should drop the direct reference to git-remote-cvs.

> diff --git a/git_remote_cvs/util.py b/git_remote_helpers/util.py
> similarity index 100%
> rename from git_remote_cvs/util.py
> rename to git_remote_helpers/util.py
> diff --git a/t/test-lib.sh b/t/test-lib.sh
> index 77ad23e..c7530aa 100644
> --- a/t/test-lib.sh
> +++ b/t/test-lib.sh
> @@ -640,9 +640,9 @@ test -d ../templates/blt || {
> 
>  if test -z "$GIT_TEST_INSTALLED"
>  then
> -	GITPYTHONLIB="$(pwd)/../git_remote_cvs/build/lib"
> +	GITPYTHONLIB="$(pwd)/../git_remote_helpers/build/lib"
>  	export GITPYTHONLIB
> -	test -d ../git_remote_cvs/build || {
> +	test -d ../git_remote_helpers/build || {
>  		error "You haven't built git_remote_cvs yet, have you?"

Please also s/git_remote_cvs/git_remote_helpers/ in the error message.

Otherwise, it all looks good :)


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/19] Allow fetch to modify refs
  2009-10-30  5:56   ` Daniel Barkalow
@ 2009-10-30 12:22     ` Sverre Rabbelier
  2009-10-30 15:16       ` Daniel Barkalow
  0 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-30 12:22 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Git List, Johannes Schindelin, Johan Herland

Heya,

On Thu, Oct 29, 2009 at 22:56, Daniel Barkalow <barkalow@iabervon.org> wrote:
>> +++ b/builtin-clone.c
>> @@ -526,8 +526,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>>
>>               refs = transport_get_remote_refs(transport);
>>               if (refs) {
>> -                     mapped_refs = wanted_peer_refs(refs, refspec);
>> -                     transport_fetch_refs(transport, mapped_refs);
>> +                     struct ref *ref_cpy = wanted_peer_refs(refs, refspec);
>> +                     mapped_refs = ref_cpy;
>> +                     transport_fetch_refs(transport, ref_cpy);
>
> Just drop this hunk; the change doesn't actually do anything. Otherwise,
> the patch matches what I have.

Am I missing something? I thought we need this hunk since mapped_refs
is const, and transport_fetch_refs takes a non-const argument?

builtin-clone.c: In function ‘cmd_clone’:
builtin-clone.c:531: warning: passing argument 2 of
‘transport_fetch_refs’ discards qualifiers from pointer target type

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [UNSTABLE PATCH 03/19] Allow programs to not depend on remotes  having urls
  2009-10-30  6:02   ` Daniel Barkalow
@ 2009-10-30 12:24     ` Sverre Rabbelier
  0 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-30 12:24 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Git List, Johannes Schindelin, Johan Herland

Heya,

On Thu, Oct 29, 2009 at 23:02, Daniel Barkalow <barkalow@iabervon.org> wrote:
> Perhaps I should try to get the refactor in
> builtin-push to use push_with_options() without any behavior change into
> master ahead of the rest of this series.

I think that's a good idea in general :).

> Anyway, you correctly understood the intended change here.

Ok, nice.

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/19] Allow helpers to request marks for fast-import
  2009-10-30  8:21   ` Johan Herland
@ 2009-10-30 12:26     ` Sverre Rabbelier
  2009-10-31 12:04       ` Johan Herland
  0 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-30 12:26 UTC (permalink / raw)
  To: Johan Herland; +Cc: Git List, Johannes Schindelin, Daniel Barkalow

Heya,

On Fri, Oct 30, 2009 at 01:21, Johan Herland <johan@herland.net> wrote:
> Please drop this patch from the series. The functionality is not needed,
> since we'll use the fast-import "option" command from the sr/gfi-options
> series instead.

In that case I will rebase the series on top of sr/gfi-options then as
soon as I reroll that one. Also, do you need to change anything else
in git-remote-cvs to do that?

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 12/19] 1/2: Add Python support library for CVS remote  helper
  2009-10-30  8:33   ` Johan Herland
@ 2009-10-30 12:27     ` Sverre Rabbelier
  0 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-30 12:27 UTC (permalink / raw)
  To: Johan Herland
  Cc: Git List, Johannes Schindelin, Daniel Barkalow, David Aguilar

Heya,

On Fri, Oct 30, 2009 at 01:33, Johan Herland <johan@herland.net> wrote:
> Why? Or: why that one, and not the others? Also, you might want to mention
> your contribution in the commit message itself.

What others? And I thought my contributions were so minor it doesn't
really matter that much :).
> It seems the two functions you add (notify() and warn()) have a different
> indentation than the existing code (which uses 4 spaces). Please fix.

That's weird, will fix.

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 18/19] Refactor git_remote_cvs to a more generic  git_remote_helpers
  2009-10-30  8:42   ` Johan Herland
@ 2009-10-30 12:29     ` Sverre Rabbelier
  0 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-30 12:29 UTC (permalink / raw)
  To: Johan Herland; +Cc: Git List, Johannes Schindelin, Daniel Barkalow

Heya,

On Fri, Oct 30, 2009 at 01:42, Johan Herland <johan@herland.net> wrote:
>> +- git/git - Interaction with Git repositories
>
> Since this is Python documentation within a package, I'd rather refer to the
> python modules as _modules_ and not files. I.e. please use '.' instead of
> '/':

Ok, will do.

>> +- util - General utility functionality use by the other modules in
>> +         this package, and also used directly by git-remote-cvs.
>
> Probably you should drop the direct reference to git-remote-cvs.

Ah, yes, my bad.

>> +     test -d ../git_remote_helpers/build || {
>>               error "You haven't built git_remote_cvs yet, have you?"
>
> Please also s/git_remote_cvs/git_remote_helpers/ in the error message.

That's weird, I thought I did that, ah well, will fix.

> Otherwise, it all looks good :)

Thanks for the review.

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH 06/19] Factor ref updating out of fetch_with_import
  2009-10-30  7:10   ` Daniel Barkalow
@ 2009-10-30 12:57     ` Sverre Rabbelier
  2009-10-30 16:04       ` Daniel Barkalow
  0 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-30 12:57 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Git List, Johannes Schindelin, Johan Herland

On Fri, Oct 30, 2009 at 00:10, Daniel Barkalow <barkalow@iabervon.org> wrote:
> Actually, I think the problem is that builtin-clone will always default to
> setting up a refspec of "refs/heads/*:refs/remotes/origin/*", but that
> doesn't actually make any sense if the source repository isn't presented
> as having refs names like "refs/heads/*". The immediate problem that you
> don't get HEAD because it's a symref to something outside the pattern is
> somewhat secondary to the general problem that you don't get anything at
> all, because everything's outside the pattern.

Ah, I didn't realize that as long as HEAD is a symref to something in
refs/heads/* it would be fetched.

> I don't really think that presenting the real refs in refs/<vcs>/*, and
> having the list give symrefs to them from refs/heads/* and refs/tags/*
> really makes sense; I think it would be better to rework fetch_with_import
> and the list provided by a foreign vcs helper such that it can present
> refs with normal names (refs/heads/*, refs/tags/*, etc) if the foreign vcs
> has standards that correspond to branches and tags. Then "clone" would
> work normally.

Perhaps we could introduce an additional phase before import to set
the default refspec? If we could tell git that we want
"refs/heads/*:refs/hg/origin/*" before line 468 "if (option_bare) {",
that would save us a lot of trouble. Does that sound like a good idea?
It would mean we have to move the transport_get part up a bit, but I
don't think that will be a problem. A svn helper for example might
respond the following to the "refspec" command:
"refs/heads/trunkr:refs/svn/origin/trunk"
"refs/tags/*:refs/svn/origin/tags/*"
"refs/heads/*:refs/svn/origin/branches/*"

How does that sound?

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/19] Allow fetch to modify refs
  2009-10-30 12:22     ` Sverre Rabbelier
@ 2009-10-30 15:16       ` Daniel Barkalow
  2009-10-30 21:24         ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Barkalow @ 2009-10-30 15:16 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Johan Herland

[-- Attachment #1: Type: TEXT/PLAIN, Size: 1314 bytes --]

On Fri, 30 Oct 2009, Sverre Rabbelier wrote:

> Heya,
> 
> On Thu, Oct 29, 2009 at 22:56, Daniel Barkalow <barkalow@iabervon.org> wrote:
> >> +++ b/builtin-clone.c
> >> @@ -526,8 +526,9 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >>
> >>               refs = transport_get_remote_refs(transport);
> >>               if (refs) {
> >> -                     mapped_refs = wanted_peer_refs(refs, refspec);
> >> -                     transport_fetch_refs(transport, mapped_refs);
> >> +                     struct ref *ref_cpy = wanted_peer_refs(refs, refspec);
> >> +                     mapped_refs = ref_cpy;
> >> +                     transport_fetch_refs(transport, ref_cpy);
> >
> > Just drop this hunk; the change doesn't actually do anything. Otherwise,
> > the patch matches what I have.
> 
> Am I missing something? I thought we need this hunk since mapped_refs
> is const, and transport_fetch_refs takes a non-const argument?
> 
> builtin-clone.c: In function ‘cmd_clone’:
> builtin-clone.c:531: warning: passing argument 2 of
> ‘transport_fetch_refs’ discards qualifiers from pointer target type

Ah, actually, mapped_refs should just be made non-const; unlike "refs", 
it's set from wanted_peer_refs(), which returns a non-const value.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH 06/19] Factor ref updating out of fetch_with_import
  2009-10-30 12:57     ` Sverre Rabbelier
@ 2009-10-30 16:04       ` Daniel Barkalow
  2009-11-02  1:33         ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Barkalow @ 2009-10-30 16:04 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Johan Herland

On Fri, 30 Oct 2009, Sverre Rabbelier wrote:

> On Fri, Oct 30, 2009 at 00:10, Daniel Barkalow <barkalow@iabervon.org> wrote:
> > Actually, I think the problem is that builtin-clone will always default to
> > setting up a refspec of "refs/heads/*:refs/remotes/origin/*", but that
> > doesn't actually make any sense if the source repository isn't presented
> > as having refs names like "refs/heads/*". The immediate problem that you
> > don't get HEAD because it's a symref to something outside the pattern is
> > somewhat secondary to the general problem that you don't get anything at
> > all, because everything's outside the pattern.
> 
> Ah, I didn't realize that as long as HEAD is a symref to something in
> refs/heads/* it would be fetched.

Clone only cares about the remote HEAD for the purpose of making 
refs/remotes/origin/HEAD a symref to something. It doesn't really want a 
SHA1 for it (except in order to guess that it's a symref to something that 
the remote has dereferenced in its list); and, since it's a symref, no 
objects have to be fetched for it -- except for the possibility that it's 
known to be a symref to something that wasn't fetched, in which case, we 
know that HEAD -> something, and something's SHA1 is sha1, but we didn't 
fetch something so we don't have sha1. Of course, I think clone then 
writes a dangling symref anyway and forgets about sha1.

> > I don't really think that presenting the real refs in refs/<vcs>/*, and
> > having the list give symrefs to them from refs/heads/* and refs/tags/*
> > really makes sense; I think it would be better to rework fetch_with_import
> > and the list provided by a foreign vcs helper such that it can present
> > refs with normal names (refs/heads/*, refs/tags/*, etc) if the foreign vcs
> > has standards that correspond to branches and tags. Then "clone" would
> > work normally.
> 
> Perhaps we could introduce an additional phase before import to set
> the default refspec? If we could tell git that we want
> "refs/heads/*:refs/hg/origin/*" before line 468 "if (option_bare) {",
> that would save us a lot of trouble. Does that sound like a good idea?
> It would mean we have to move the transport_get part up a bit, but I
> don't think that will be a problem. A svn helper for example might
> respond the following to the "refspec" command:
> "refs/heads/trunkr:refs/svn/origin/trunk"
> "refs/tags/*:refs/svn/origin/tags/*"
> "refs/heads/*:refs/svn/origin/branches/*"
> 
> How does that sound?

The values you've got for refspecs don't make sense as fetch refspecs; 
these would cause refs with names the helper won't supply to be stored in 
the helper's private namespace, which is certainly wrong. (And I think you 
have the sides backwards)

On the other hand, I think it would make sense to use the same style of 
refspec between the helper and transport-helper.c such that the helper 
reports something like:

refs/svn/origin/trunk:refs/heads/trunkr
refs/svn/origin/branches/*:refs/heads/*
refs/svn/origin/tags/*:refs/tags/*

"list" gives:

000000...000 refs/heads/trunkr

"import refs/heads/trunkr" imports the objects, but the refspecs have to 
be consulted by transport-helper.c in order to determine what ref to read 
to get the value of refs/heads/trunkr. Instead of getting the value with 
read_ref("refs/heads/trunkr", ...) like it does now, it would do 
read_ref("refs/svn/origin/trunk", ...). And systems like p4 that don't 
have a useful standard just wouldn't support the "refspec" command and 
people would have to do site-specific configuration to get anything 
useful.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 02/19] Allow fetch to modify refs
  2009-10-30 15:16       ` Daniel Barkalow
@ 2009-10-30 21:24         ` Sverre Rabbelier
  0 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-30 21:24 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Git List, Johannes Schindelin, Johan Herland

Heya,

On Fri, Oct 30, 2009 at 16:16, Daniel Barkalow <barkalow@iabervon.org> wrote:
> Ah, actually, mapped_refs should just be made non-const; unlike "refs",
> it's set from wanted_peer_refs(), which returns a non-const value.

That works too, even better I guess :).

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/19] Allow helpers to request marks for fast-import
  2009-10-30 12:26     ` Sverre Rabbelier
@ 2009-10-31 12:04       ` Johan Herland
  2009-10-31 16:19         ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Johan Herland @ 2009-10-31 12:04 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: git, Johannes Schindelin, Daniel Barkalow

On Friday 30 October 2009, Sverre Rabbelier wrote:
> Heya,
> 
> On Fri, Oct 30, 2009 at 01:21, Johan Herland <johan@herland.net> wrote:
> > Please drop this patch from the series. The functionality is not
> > needed, since we'll use the fast-import "option" command from the
> > sr/gfi-options series instead.
> 
> In that case I will rebase the series on top of sr/gfi-options then as
> soon as I reroll that one.

Good.

> Also, do you need to change anything else in git-remote-cvs to do that?

Yes, the sr/gfi-options series does cause some changes, both in git-remote-
cvs, and in the support libraries (adding a couple of methods to the 
GitFastImport class in git.py).

This conglomeration of patch series is becoming fairly complicated, and it's 
becoming hard to stay in sync. I suggest that you drop the CVS-specific 
parts from this series, and work on stabilizing the common infrastructure. 
Once that has settled, you can send a git-remote-hg series, and I can send a 
rebased and updated git-remote-cvs series.

Feel free to reorganize the patches so that the git_remote_helpers 
infrastructure is created in the correct location (instead of reorganizing 
git_remote_cvs).


...Johan

-- 
Johan Herland, <johan@herland.net>
www.herland.net

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [PATCH 10/19] Allow helpers to request marks for fast-import
  2009-10-31 12:04       ` Johan Herland
@ 2009-10-31 16:19         ` Sverre Rabbelier
  0 siblings, 0 replies; 43+ messages in thread
From: Sverre Rabbelier @ 2009-10-31 16:19 UTC (permalink / raw)
  To: Johan Herland; +Cc: git, Johannes Schindelin, Daniel Barkalow

Heya,

On Sat, Oct 31, 2009 at 13:04, Johan Herland <johan@herland.net> wrote:
> On Friday 30 October 2009, Sverre Rabbelier wrote:
> This conglomeration of patch series is becoming fairly complicated, and it's
> becoming hard to stay in sync. I suggest that you drop the CVS-specific
> parts from this series, and work on stabilizing the common infrastructure.
> Once that has settled, you can send a git-remote-hg series, and I can send a
> rebased and updated git-remote-cvs series.

That sounds like a good idea; I want to use the marks in git-remote-hg as well.

> Feel free to reorganize the patches so that the git_remote_helpers
> infrastructure is created in the correct location (instead of reorganizing
> git_remote_cvs).

Ok, will do.

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH 06/19] Factor ref updating out of fetch_with_import
  2009-10-30 16:04       ` Daniel Barkalow
@ 2009-11-02  1:33         ` Sverre Rabbelier
  2009-11-02  3:16           ` Daniel Barkalow
  0 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-11-02  1:33 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Git List, Johannes Schindelin, Johan Herland

Heya,

On Fri, Oct 30, 2009 at 17:04, Daniel Barkalow <barkalow@iabervon.org> wrote:
> I think you have the sides backwards

Yup, sorry, got them the wrong way around for some reason.

> On the other hand, I think it would make sense to use the same style of
> refspec between the helper and transport-helper.c such that the helper
> reports something like:
>
> refs/svn/origin/trunk:refs/heads/trunkr
> refs/svn/origin/branches/*:refs/heads/*
> refs/svn/origin/tags/*:refs/tags/*
>
> "list" gives:
>
> 000000...000 refs/heads/trunkr
>
> "import refs/heads/trunkr" imports the objects, but the refspecs have to
> be consulted by transport-helper.c in order to determine what ref to read
> to get the value of refs/heads/trunkr. Instead of getting the value with
> read_ref("refs/heads/trunkr", ...) like it does now, it would do
> read_ref("refs/svn/origin/trunk", ...). And systems like p4 that don't
> have a useful standard just wouldn't support the "refspec" command and
> people would have to do site-specific configuration to get anything
> useful.

Yes, that sounds very reasonable, and I think that's the right way to
go. This leaves us with only one thing, we need a remote HEAD for 'git
clone hg::/path/to/repo' to work and have it check out a branch, I
think a seperate 'head' command might be appropriate? If supported it
returns the which local symref (e.g. 'refs/heads/trunkr' in the svn
case) should be pointed at by HEAD. If not supported we can just not
set it and clone will give the default 'no remote HEAD, nothing
checked out' message, which would probably be best for p4?

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH 06/19] Factor ref updating out of fetch_with_import
  2009-11-02  1:33         ` Sverre Rabbelier
@ 2009-11-02  3:16           ` Daniel Barkalow
  2009-11-02 15:12             ` Sverre Rabbelier
  0 siblings, 1 reply; 43+ messages in thread
From: Daniel Barkalow @ 2009-11-02  3:16 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Johan Herland

On Mon, 2 Nov 2009, Sverre Rabbelier wrote:

> Yes, that sounds very reasonable, and I think that's the right way to
> go. This leaves us with only one thing, we need a remote HEAD for 'git
> clone hg::/path/to/repo' to work and have it check out a branch, I
> think a seperate 'head' command might be appropriate? If supported it
> returns the which local symref (e.g. 'refs/heads/trunkr' in the svn
> case) should be pointed at by HEAD. If not supported we can just not
> set it and clone will give the default 'no remote HEAD, nothing
> checked out' message, which would probably be best for p4?

Why not have the regular list report:

@refs/heads/trunkr HEAD

or whatever it is, again like native git? That is, SVN would have an 
interaction like:

> list
< ? refs/heads/trunkr
< ? refs/heads/my-branch
< ? refs/tags/v1.0
< @refs/heads/trunkr HEAD
<

This is similar to the response from the curl helper, except that it uses 
"?" for the sha1s because it doesn't know them until they're imported.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH 06/19] Factor ref updating out of fetch_with_import
  2009-11-02  3:16           ` Daniel Barkalow
@ 2009-11-02 15:12             ` Sverre Rabbelier
  2009-11-02 16:36               ` Daniel Barkalow
  0 siblings, 1 reply; 43+ messages in thread
From: Sverre Rabbelier @ 2009-11-02 15:12 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Git List, Johannes Schindelin, Johan Herland

Heya,

On Mon, Nov 2, 2009 at 04:16, Daniel Barkalow <barkalow@iabervon.org> wrote:
> Why not have the regular list report:
>
> @refs/heads/trunkr HEAD
>
> or whatever it is, again like native git? That is, SVN would have an
> interaction like:

That's fine with me, but earlier you said you didn't like the whole
symlinking idea.

You said in another thread you'll be working on some patches, does
that include this 'refs' command? I want to avoid duplicate work if
possible :).

-- 
Cheers,

Sverre Rabbelier

^ permalink raw reply	[flat|nested] 43+ messages in thread

* Re: [RFC PATCH 06/19] Factor ref updating out of fetch_with_import
  2009-11-02 15:12             ` Sverre Rabbelier
@ 2009-11-02 16:36               ` Daniel Barkalow
  0 siblings, 0 replies; 43+ messages in thread
From: Daniel Barkalow @ 2009-11-02 16:36 UTC (permalink / raw)
  To: Sverre Rabbelier; +Cc: Git List, Johannes Schindelin, Johan Herland

On Mon, 2 Nov 2009, Sverre Rabbelier wrote:

> Heya,
> 
> On Mon, Nov 2, 2009 at 04:16, Daniel Barkalow <barkalow@iabervon.org> wrote:
> > Why not have the regular list report:
> >
> > @refs/heads/trunkr HEAD
> >
> > or whatever it is, again like native git? That is, SVN would have an
> > interaction like:
> 
> That's fine with me, but earlier you said you didn't like the whole
> symlinking idea.

Now that you're not using them for other stuff, they're free to use for 
their normal purpose without confusion. The normal layout for a remote git 
repository's refs has HEAD as a symlink to something in refs/heads/*, so 
it's

> You said in another thread you'll be working on some patches, does
> that include this 'refs' command? I want to avoid duplicate work if
> possible :).

No, I'm just working on getting the handler to work in a context where 
there's no local repository, if possible. This will probably generate a 
bunch of easy conflicts, but not actually overlap with what you're doing 
here.

	-Daniel
*This .sig left intentionally blank*

^ permalink raw reply	[flat|nested] 43+ messages in thread

end of thread, other threads:[~2009-11-02 16:37 UTC | newest]

Thread overview: 43+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-10-29 18:01 [PATCH 0/19] Reroll of the remote-vcs-helper series Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 01/19] Use a function to determine whether a remote is valid Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 02/19] Allow fetch to modify refs Sverre Rabbelier
2009-10-30  5:56   ` Daniel Barkalow
2009-10-30 12:22     ` Sverre Rabbelier
2009-10-30 15:16       ` Daniel Barkalow
2009-10-30 21:24         ` Sverre Rabbelier
2009-10-29 18:01 ` [UNSTABLE PATCH 03/19] Allow programs to not depend on remotes having urls Sverre Rabbelier
2009-10-30  6:02   ` Daniel Barkalow
2009-10-30 12:24     ` Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 04/19] Add a config option for remotes to specify a foreign vcs Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 05/19] Add support for "import" helper command Sverre Rabbelier
2009-10-29 18:01 ` [RFC PATCH 06/19] Factor ref updating out of fetch_with_import Sverre Rabbelier
2009-10-30  7:10   ` Daniel Barkalow
2009-10-30 12:57     ` Sverre Rabbelier
2009-10-30 16:04       ` Daniel Barkalow
2009-11-02  1:33         ` Sverre Rabbelier
2009-11-02  3:16           ` Daniel Barkalow
2009-11-02 15:12             ` Sverre Rabbelier
2009-11-02 16:36               ` Daniel Barkalow
2009-10-29 18:01 ` [PATCH 07/19] Allow helpers to report in "list" command that the ref is unchanged Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 08/19] Fix memory leak in helper method for disconnect Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 09/19] Finally make remote helper support useful Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 10/19] Allow helpers to request marks for fast-import Sverre Rabbelier
2009-10-30  8:21   ` Johan Herland
2009-10-30 12:26     ` Sverre Rabbelier
2009-10-31 12:04       ` Johan Herland
2009-10-31 16:19         ` Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 11/19] Basic build infrastructure for Python scripts Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 12/19] 1/2: Add Python support library for CVS remote helper Sverre Rabbelier
2009-10-30  8:33   ` Johan Herland
2009-10-30 12:27     ` Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 13/19] 2/2: " Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 14/19] git-remote-cvs: Remote helper program for CVS repositories Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 15/19] Add simple selftests of git-remote-cvs functionality Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 16/19] Fix the Makefile-generated path to the git_remote_cvs package in git-remote-cvs Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 17/19] More fixes to the git-remote-cvs installation procedure Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 18/19] Refactor git_remote_cvs to a more generic git_remote_helpers Sverre Rabbelier
2009-10-30  8:42   ` Johan Herland
2009-10-30 12:29     ` Sverre Rabbelier
2009-10-29 18:01 ` [PATCH 19/19] .gitignore: add git-remote-cvs Sverre Rabbelier
2009-10-29 18:05   ` Shawn O. Pearce
2009-10-29 18:08     ` Sverre Rabbelier

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).