All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH/RFC v2 0/9] Subversion dump parsing library
@ 2010-06-24 10:50 Jonathan Nieder
  2010-06-24 10:51 ` [PATCH 1/9] Export parse_date_basic() to convert a date string to timestamp Jonathan Nieder
                   ` (9 more replies)
  0 siblings, 10 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 10:50 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

Hi gitsters,

Ram last sent this series a couple of weeks ago[1], and it was
merged to pu then as rr/svn-export.  Here’s another iteration
of the same for discussion, now including David Barr’s program
that demonstrates the functionality.

Patch 1 is not so closely related; it just modifies
parse_date_toffset() to keep me a little saner while using it.

Patches 2-8 are very similar to the versions Ram sent.  I
expanded the commit messages, took the latest code from
git://github.com/barrbrain/svn-dump-fast-export where possible,
and added a simple example build system so one can see the
result of compiling with

 make vcs-svn/lib.a

Probably more interesting is what the patches do not do:

 - they do not include any tests

 - they do not remove the persistent object pool functionality.
   If you try this code out, be sure to remove all the .bin
   files from the current directory after each run.

 - they are not guaranteed to have fewer bugs than the version
   Ram sent.  In fact, the opposite is more likely, since the
   code is only lightly tested

You can try it out with

 ; cd contrib/svn-fe
 ; wget http://github.com/barrbrain/svn-dump-fast-export/raw/master/test.dump
 ; make svn-fe
 ; ./svn-fe <test.dump

or

 ; make svn-fe.1
 ; man ./svn-fe.1

and go from there.

Any feedback is appreciated, especially on how to make this fit better
with git.  I would be particularly interested in making vcs-svn/lib.a
self-sufficient --- that is, would there be a simple way to pull out
the required code from date.c?

David Barr (5):
  Add memory pool library
  Add string-specific memory pool
  Add stream helper library
  Add infrastructure to write revisions in fast-export format
  Add SVN dump parser

Jason Evans (1):
  Add treap implementation

Jonathan Nieder (3):
  Export parse_date_basic() to convert a date string to timestamp
  Introduce vcs-svn lib
  Add a sample user for the svndump library

 Makefile                  |   12 ++-
 cache.h                   |    1 +
 contrib/svn-fe/.gitignore |    3 +
 contrib/svn-fe/Makefile   |   63 +++++++++
 contrib/svn-fe/svn-fe.c   |   43 ++++++
 contrib/svn-fe/svn-fe.txt |   56 ++++++++
 date.c                    |   14 +-
 vcs-svn/LICENSE           |   33 +++++
 vcs-svn/fast_export.c     |   75 ++++++++++
 vcs-svn/fast_export.h     |   14 ++
 vcs-svn/line_buffer.c     |   93 +++++++++++++
 vcs-svn/line_buffer.h     |   14 ++
 vcs-svn/obj_pool.h        |   80 +++++++++++
 vcs-svn/repo_tree.c       |  335 +++++++++++++++++++++++++++++++++++++++++++++
 vcs-svn/repo_tree.h       |   26 ++++
 vcs-svn/string_pool.c     |  114 +++++++++++++++
 vcs-svn/string_pool.h     |   15 ++
 vcs-svn/svndump.c         |  289 ++++++++++++++++++++++++++++++++++++++
 vcs-svn/svndump.h         |    8 +
 vcs-svn/trp.h             |  220 +++++++++++++++++++++++++++++
 vcs-svn/trp.txt           |   90 ++++++++++++
 21 files changed, 1589 insertions(+), 9 deletions(-)
 create mode 100644 contrib/svn-fe/.gitignore
 create mode 100644 contrib/svn-fe/Makefile
 create mode 100644 contrib/svn-fe/svn-fe.c
 create mode 100644 contrib/svn-fe/svn-fe.txt
 create mode 100644 vcs-svn/LICENSE
 create mode 100644 vcs-svn/fast_export.c
 create mode 100644 vcs-svn/fast_export.h
 create mode 100644 vcs-svn/line_buffer.c
 create mode 100644 vcs-svn/line_buffer.h
 create mode 100644 vcs-svn/obj_pool.h
 create mode 100644 vcs-svn/repo_tree.c
 create mode 100644 vcs-svn/repo_tree.h
 create mode 100644 vcs-svn/string_pool.c
 create mode 100644 vcs-svn/string_pool.h
 create mode 100644 vcs-svn/svndump.c
 create mode 100644 vcs-svn/svndump.h
 create mode 100644 vcs-svn/trp.h
 create mode 100644 vcs-svn/trp.txt

[1] http://thread.gmane.org/gmane.comp.version-control.git/148866

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/9] Export parse_date_basic() to convert a date string to timestamp
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
@ 2010-06-24 10:51 ` Jonathan Nieder
  2010-06-24 18:32   ` Ramkumar Ramachandra
  2010-06-24 10:52 ` [PATCH 2/9] Introduce vcs-svn lib Jonathan Nieder
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 10:51 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

approxidate() is not appropriate for reading machine-written dates
because it guesses instead of erroring out on malformed dates.
parse_date() is less convenient since it returns its output as a
string.  So export the underlying function that writes a timestamp.

While at it, change the return value to match the usual convention:
return 0 for success and -1 for failure.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 cache.h |    1 +
 date.c  |   14 ++++++--------
 2 files changed, 7 insertions(+), 8 deletions(-)

diff --git a/cache.h b/cache.h
index ff4a7c2..4566501 100644
--- a/cache.h
+++ b/cache.h
@@ -800,6 +800,7 @@ const char *show_date_relative(unsigned long time, int tz,
 			       char *timebuf,
 			       size_t timebuf_size);
 int parse_date(const char *date, char *buf, int bufsize);
+int parse_date_basic(const char *date, unsigned long *timestamp, int *offset);
 void datestamp(char *buf, int bufsize);
 #define approxidate(s) approxidate_careful((s), NULL)
 unsigned long approxidate_careful(const char *, int *);
diff --git a/date.c b/date.c
index 68cdcaa..383706d 100644
--- a/date.c
+++ b/date.c
@@ -586,7 +586,7 @@ static int date_string(unsigned long date, int offset, char *buf, int len)
 
 /* Gr. strptime is crap for this; it doesn't have a way to require RFC2822
    (i.e. English) day/month names, and it doesn't work correctly with %z. */
-int parse_date_toffset(const char *date, unsigned long *timestamp, int *offset)
+int parse_date_basic(const char *date, unsigned long *timestamp, int *offset)
 {
 	struct tm tm;
 	int tm_gmt;
@@ -642,17 +642,16 @@ int parse_date_toffset(const char *date, unsigned long *timestamp, int *offset)
 
 	if (!tm_gmt)
 		*timestamp -= *offset * 60;
-	return 1; /* success */
+	return 0; /* success */
 }
 
 int parse_date(const char *date, char *result, int maxlen)
 {
 	unsigned long timestamp;
 	int offset;
-	if (parse_date_toffset(date, &timestamp, &offset) > 0)
-		return date_string(timestamp, offset, result, maxlen);
-	else
+	if (parse_date_basic(date, &timestamp, &offset))
 		return -1;
+	return date_string(timestamp, offset, result, maxlen);
 }
 
 enum date_mode parse_date_format(const char *format)
@@ -1004,9 +1003,8 @@ unsigned long approxidate_relative(const char *date, const struct timeval *tv)
 	int offset;
 	int errors = 0;
 
-	if (parse_date_toffset(date, &timestamp, &offset) > 0)
+	if (!parse_date_basic(date, &timestamp, &offset))
 		return timestamp;
-
 	return approxidate_str(date, tv, &errors);
 }
 
@@ -1019,7 +1017,7 @@ unsigned long approxidate_careful(const char *date, int *error_ret)
 	if (!error_ret)
 		error_ret = &dummy;
 
-	if (parse_date_toffset(date, &timestamp, &offset) > 0) {
+	if (!parse_date_basic(date, &timestamp, &offset)) {
 		*error_ret = 0;
 		return timestamp;
 	}
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/9] Introduce vcs-svn lib
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
  2010-06-24 10:51 ` [PATCH 1/9] Export parse_date_basic() to convert a date string to timestamp Jonathan Nieder
@ 2010-06-24 10:52 ` Jonathan Nieder
  2010-06-24 20:27   ` Ramkumar Ramachandra
  2010-06-24 10:53 ` [PATCH 3/9] Add memory pool library Jonathan Nieder
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 10:52 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

Teach the build system to build a separate library for the
upcoming subversion interop support.

The resulting vcs-svn/lib.a does not contain any code, nor is
it built during a normal build.  This is just scaffolding for
later changes.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Makefile |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/Makefile b/Makefile
index 9aca8a1..6441dcb 100644
--- a/Makefile
+++ b/Makefile
@@ -468,6 +468,7 @@ export PYTHON_PATH
 
 LIB_FILE=libgit.a
 XDIFF_LIB=xdiff/lib.a
+VCSSVN_LIB=vcs-svn/lib.a
 
 LIB_H += advice.h
 LIB_H += archive.h
@@ -1739,7 +1740,8 @@ ifndef NO_CURL
 endif
 XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \
 	xdiff/xmerge.o xdiff/xpatience.o
-OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS)
+VCSSVN_OBJS =
+OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS)
 
 dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
 dep_dirs := $(addsuffix .depend,$(sort $(dir $(OBJECTS))))
@@ -1860,6 +1862,8 @@ http.o http-walker.o http-push.o remote-curl.o: http.h
 xdiff-interface.o $(XDIFF_OBJS): \
 	xdiff/xinclude.h xdiff/xmacros.h xdiff/xdiff.h xdiff/xtypes.h \
 	xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h
+
+$(VCSSVN_OBJS):
 endif
 
 exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
@@ -1908,6 +1912,8 @@ $(LIB_FILE): $(LIB_OBJS)
 $(XDIFF_LIB): $(XDIFF_OBJS)
 	$(QUIET_AR)$(RM) $@ && $(AR) rcs $@ $(XDIFF_OBJS)
 
+$(VCSSVN_LIB): $(VCSSVN_OBJS)
+	$(QUIET_AR)$(RM) $@ && $(AR) rcs $@ $(VCSSVN_OBJS)
 
 doc:
 	$(MAKE) -C Documentation all
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 3/9] Add memory pool library
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
  2010-06-24 10:51 ` [PATCH 1/9] Export parse_date_basic() to convert a date string to timestamp Jonathan Nieder
  2010-06-24 10:52 ` [PATCH 2/9] Introduce vcs-svn lib Jonathan Nieder
@ 2010-06-24 10:53 ` Jonathan Nieder
  2010-06-24 18:43   ` Ramkumar Ramachandra
  2010-06-24 10:57 ` [PATCH 4/9] Add treap implementation Jonathan Nieder
                   ` (6 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 10:53 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

From: David Barr <david.barr@cordelta.com>

Add a memory pool library implemented using C macros. The obj_pool_gen()
macro creates a type-specific memory pool API.

The memory pool library is distinguished from the existing specialized
allocators in alloc.c by using a contiguous block for all allocations.
This means that on one hand, long-lived pointers have to be written as
offsets, since the base address changes as the pool grows, but on the
other hand, the entire pool can be easily written to the file system.
This allows the memory pool to persist between runs of an application.

For the svn importer, such a facility is useful because each svn
revision can copy trees and files from any previous revision.  The
relevant information for all revisions has to persist somehow to
support incremental runs, and for now it is simplest to avoid relying
on the target VCS for that.

obj_pool_gen(pre, obj_t, initial_capability)

	pre: Prefix for generated functions (example: string).
	obj_t: Type for treap data structure (example: char).
	initial_capacity: Initial size of the memory pool (example: 4096).

void pre_init(void);

	Read values from a previous run to initialize the pool.
	If this function is not called, the pool begins valid but empty.

uint32_t pre_alloc(uint32_t nmemb);

	Reserve space for a few objects in the pool and return an
	offset to the first one.

uint32_t pre_free(uint32_t nmemb);

	Unreserve the last few objects reserved.

uint32_t pre_offset(obj_t *pointer);
obj_t *pre_pointer(uint32_t offset);

	Convert between pointers into the in-memory pool and offsets
	from the beginning (or ~0 for the NULL pointer).  Pointers are
	not guaranteed to remain valid after a pre_alloc() operation
	or pre_reset() followed by pre_init(), but offsets are.

void pre_commit(void);

	Write the pool to file.  A pre_reset() followed by pre_init()
	(pehaps with exit() in between) will return the pool to the
	last committed state.

void pre_reset(void);

	Deinitialize the pool, freeing any associated memory and
	file handles.

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Makefile           |    3 +-
 vcs-svn/LICENSE    |   26 +++++++++++++++++
 vcs-svn/obj_pool.h |   80 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 108 insertions(+), 1 deletions(-)
 create mode 100644 vcs-svn/LICENSE
 create mode 100644 vcs-svn/obj_pool.h

diff --git a/Makefile b/Makefile
index 6441dcb..fc31ee0 100644
--- a/Makefile
+++ b/Makefile
@@ -1863,7 +1863,8 @@ xdiff-interface.o $(XDIFF_OBJS): \
 	xdiff/xinclude.h xdiff/xmacros.h xdiff/xdiff.h xdiff/xtypes.h \
 	xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h
 
-$(VCSSVN_OBJS):
+$(VCSSVN_OBJS): \
+	vcs-svn/obj_pool.h
 endif
 
 exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
diff --git a/vcs-svn/LICENSE b/vcs-svn/LICENSE
new file mode 100644
index 0000000..6e52372
--- /dev/null
+++ b/vcs-svn/LICENSE
@@ -0,0 +1,26 @@
+Copyright (C) 2010 David Barr <david.barr@cordelta.com>.
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions
+are met:
+1. Redistributions of source code must retain the above copyright
+   notice(s), this list of conditions and the following disclaimer
+   unmodified other than the allowable addition of one or more
+   copyright notices.
+2. Redistributions in binary form must reproduce the above copyright
+   notice(s), this list of conditions and the following disclaimer in
+   the documentation and/or other materials provided with the
+   distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
+EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE
+LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
diff --git a/vcs-svn/obj_pool.h b/vcs-svn/obj_pool.h
new file mode 100644
index 0000000..f60c872
--- /dev/null
+++ b/vcs-svn/obj_pool.h
@@ -0,0 +1,80 @@
+/*
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#ifndef OBJ_POOL_H_
+#define OBJ_POOL_H_
+
+#include "git-compat-util.h"
+
+#define MAYBE_UNUSED __attribute__((__unused__))
+
+#define obj_pool_gen(pre, obj_t, initial_capacity) \
+static struct { \
+	uint32_t committed; \
+	uint32_t size; \
+	uint32_t capacity; \
+	obj_t *base; \
+	FILE *file; \
+} pre##_pool = { 0, 0, 0, NULL, NULL}; \
+static MAYBE_UNUSED void pre##_init(void) \
+{ \
+	struct stat st; \
+	pre##_pool.file = fopen(#pre ".bin", "a+"); \
+	rewind(pre##_pool.file); \
+	fstat(fileno(pre##_pool.file), &st); \
+	pre##_pool.size = st.st_size / sizeof(obj_t); \
+	pre##_pool.committed = pre##_pool.size; \
+	pre##_pool.capacity = pre##_pool.size * 2; \
+	if (pre##_pool.capacity < initial_capacity) \
+		pre##_pool.capacity = initial_capacity; \
+	pre##_pool.base = malloc(pre##_pool.capacity * sizeof(obj_t)); \
+	fread(pre##_pool.base, sizeof(obj_t), pre##_pool.size, pre##_pool.file); \
+} \
+static MAYBE_UNUSED uint32_t pre##_alloc(uint32_t count) \
+{ \
+	uint32_t offset; \
+	if (pre##_pool.size + count > pre##_pool.capacity) { \
+		while (pre##_pool.size + count > pre##_pool.capacity) \
+			if (pre##_pool.capacity) \
+				pre##_pool.capacity *= 2; \
+			else \
+				pre##_pool.capacity = initial_capacity; \
+		pre##_pool.base = realloc(pre##_pool.base, \
+					pre##_pool.capacity * sizeof(obj_t)); \
+	} \
+	offset = pre##_pool.size; \
+	pre##_pool.size += count; \
+	return offset; \
+} \
+static MAYBE_UNUSED void pre##_free(uint32_t count) \
+{ \
+	pre##_pool.size -= count; \
+} \
+static MAYBE_UNUSED uint32_t pre##_offset(obj_t *obj) \
+{ \
+	return obj == NULL ? ~0 : obj - pre##_pool.base; \
+} \
+static MAYBE_UNUSED obj_t *pre##_pointer(uint32_t offset) \
+{ \
+	return offset >= pre##_pool.size ? NULL : &pre##_pool.base[offset]; \
+} \
+static MAYBE_UNUSED void pre##_commit(void) \
+{ \
+	pre##_pool.committed += fwrite(pre##_pool.base + pre##_pool.committed, \
+		sizeof(obj_t), pre##_pool.size - pre##_pool.committed, \
+		pre##_pool.file); \
+} \
+static MAYBE_UNUSED void pre##_reset(void) \
+{ \
+	free(pre##_pool.base); \
+	if (pre##_pool.file) \
+		fclose(pre##_pool.file); \
+	pre##_pool.base = NULL; \
+	pre##_pool.size = 0; \
+	pre##_pool.capacity = 0; \
+	pre##_pool.file = NULL; \
+}
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 4/9] Add treap implementation
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
                   ` (2 preceding siblings ...)
  2010-06-24 10:53 ` [PATCH 3/9] Add memory pool library Jonathan Nieder
@ 2010-06-24 10:57 ` Jonathan Nieder
  2010-06-24 19:08   ` Ramkumar Ramachandra
  2010-06-24 10:58 ` [PATCH 5/9] Add string-specific memory pool Jonathan Nieder
                   ` (5 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 10:57 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

From: Jason Evans <jasone@canonware.com>

Provide macros to generate a type-specific treap implementation and
various functions to operate on it. It uses obj_pool.h to store memory
nodes in a treap.  Previously committed nodes are never removed from
the pool; after any *_commit operation, it is assumed (correctly, in
the case of svn-fast-export) that someone else must care about them.

Treaps provide a memory-efficient binary search tree structure.
Insertion/deletion/search are about as about as fast in the average
case as red-black trees and the chances of worst-case behavior are
vanishingly small, thanks to (pseudo-)randomness.  The bad worst-case
behavior is a small price to pay, given that treaps are much simpler
to implement.

From http://www.canonware.com/download/trp/trp_hash/trp.h

[db: Altered to reference nodes by offset from a common base pointer]
[db: Bob Jenkins' hashing implementation dropped for Knuth's]
[db: Methods unnecessary for search and insert dropped]

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Makefile        |    2 +-
 vcs-svn/LICENSE |    3 +
 vcs-svn/trp.h   |  220 +++++++++++++++++++++++++++++++++++++++++++++++++++++++
 vcs-svn/trp.txt |   90 ++++++++++++++++++++++
 4 files changed, 314 insertions(+), 1 deletions(-)
 create mode 100644 vcs-svn/trp.h
 create mode 100644 vcs-svn/trp.txt

diff --git a/Makefile b/Makefile
index fc31ee0..663a366 100644
--- a/Makefile
+++ b/Makefile
@@ -1864,7 +1864,7 @@ xdiff-interface.o $(XDIFF_OBJS): \
 	xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h
 
 $(VCSSVN_OBJS): \
-	vcs-svn/obj_pool.h
+	vcs-svn/obj_pool.h vcs-svn/trp.h
 endif
 
 exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
diff --git a/vcs-svn/LICENSE b/vcs-svn/LICENSE
index 6e52372..a3d384c 100644
--- a/vcs-svn/LICENSE
+++ b/vcs-svn/LICENSE
@@ -1,6 +1,9 @@
 Copyright (C) 2010 David Barr <david.barr@cordelta.com>.
 All rights reserved.
 
+Copyright (C) 2008 Jason Evans <jasone@canonware.com>.
+All rights reserved.
+
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
diff --git a/vcs-svn/trp.h b/vcs-svn/trp.h
new file mode 100644
index 0000000..dd7d5ee
--- /dev/null
+++ b/vcs-svn/trp.h
@@ -0,0 +1,220 @@
+/*
+ * C macro implementation of treaps.
+ *
+ * Usage:
+ *   #include <stdint.h>
+ *   #include "trp.h"
+ *   trp_gen(...)
+ *
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#ifndef TRP_H_
+#define TRP_H_
+
+#define MAYBE_UNUSED __attribute__((__unused__))
+
+/* Node structure. */
+struct trp_node {
+	uint32_t trpn_left;
+	uint32_t trpn_right;
+};
+
+/* Root structure. */
+struct trp_root {
+	uint32_t trp_root;
+};
+
+/* Pointer/Offset conversion. */
+#define trpn_pointer(a_base, a_offset) (a_base##_pointer(a_offset))
+#define trpn_offset(a_base, a_pointer) (a_base##_offset(a_pointer))
+#define trpn_modify(a_base, a_offset) \
+	do { \
+		if ((a_offset) < a_base##_pool.committed) { \
+			uint32_t old_offset = (a_offset);\
+			(a_offset) = a_base##_alloc(1); \
+			*trpn_pointer(a_base, a_offset) = \
+				*trpn_pointer(a_base, old_offset); \
+		} \
+	} while (0);
+
+/* Left accessors. */
+#define trp_left_get(a_base, a_field, a_node) \
+	(trpn_pointer(a_base, a_node)->a_field.trpn_left)
+#define trp_left_set(a_base, a_field, a_node, a_left) \
+	do { \
+		trpn_modify(a_base, a_node); \
+		trp_left_get(a_base, a_field, a_node) = (a_left); \
+	} while(0)
+
+/* Right accessors. */
+#define trp_right_get(a_base, a_field, a_node) \
+	(trpn_pointer(a_base, a_node)->a_field.trpn_right)
+#define trp_right_set(a_base, a_field, a_node, a_right) \
+	do { \
+		trpn_modify(a_base, a_node); \
+		trp_right_get(a_base, a_field, a_node) = (a_right); \
+	} while(0)
+
+/*
+ * Fibonacci hash function.
+ * The multiplier is the nearest prime to (2^32 times (√5 - 1)/2).
+ * See Knuth §6.4: volume 3, 3rd ed, p518.
+ */
+#define trpn_hash(a_node) (uint32_t) (2654435761u * (a_node))
+
+/* Priority accessors. */
+#define trp_prio_get(a_node) trpn_hash(a_node)
+
+/* Node initializer. */
+#define trp_node_new(a_base, a_field, a_node) \
+	do { \
+		trp_left_set(a_base, a_field, (a_node), ~0); \
+		trp_right_set(a_base, a_field, (a_node), ~0); \
+	} while(0)
+
+/* Internal utility macros. */
+#define trpn_first(a_base, a_field, a_root, r_node) \
+	do { \
+		(r_node) = (a_root); \
+		if ((r_node) == ~0) \
+			return NULL; \
+		while (~trp_left_get(a_base, a_field, (r_node))) \
+			(r_node) = trp_left_get(a_base, a_field, (r_node)); \
+	} while (0)
+
+#define trpn_rotate_left(a_base, a_field, a_node, r_node) \
+	do { \
+		(r_node) = trp_right_get(a_base, a_field, (a_node)); \
+		trp_right_set(a_base, a_field, (a_node), \
+			trp_left_get(a_base, a_field, (r_node))); \
+		trp_left_set(a_base, a_field, (r_node), (a_node)); \
+	} while(0)
+
+#define trpn_rotate_right(a_base, a_field, a_node, r_node) \
+	do { \
+		(r_node) = trp_left_get(a_base, a_field, (a_node)); \
+		trp_left_set(a_base, a_field, (a_node), \
+			trp_right_get(a_base, a_field, (r_node))); \
+		trp_right_set(a_base, a_field, (r_node), (a_node)); \
+	} while(0)
+
+#define trp_gen(a_attr, a_pre, a_type, a_field, a_base, a_cmp) \
+a_attr a_type MAYBE_UNUSED *a_pre##first(struct trp_root *treap) \
+{ \
+	uint32_t ret; \
+	trpn_first(a_base, a_field, treap->trp_root, ret); \
+	return trpn_pointer(a_base, ret); \
+} \
+a_attr a_type MAYBE_UNUSED *a_pre##next(struct trp_root *treap, a_type *node) \
+{ \
+	uint32_t ret; \
+	uint32_t offset = trpn_offset(a_base, node); \
+	if (~trp_right_get(a_base, a_field, offset)) { \
+		trpn_first(a_base, a_field, \
+			trp_right_get(a_base, a_field, offset), ret); \
+	} else { \
+		uint32_t tnode = treap->trp_root; \
+		ret = ~0; \
+		while (1) { \
+			int cmp = (a_cmp)(trpn_pointer(a_base, offset), \
+				trpn_pointer(a_base, tnode)); \
+			if (cmp < 0) { \
+				ret = tnode; \
+				tnode = trp_left_get(a_base, a_field, tnode); \
+			} else if (cmp > 0) { \
+				tnode = trp_right_get(a_base, a_field, tnode); \
+			} else { \
+				break; \
+			} \
+		} \
+	} \
+	return trpn_pointer(a_base, ret); \
+} \
+a_attr a_type MAYBE_UNUSED *a_pre##search(struct trp_root *treap, a_type *key) \
+{ \
+	int cmp; \
+	uint32_t ret = treap->trp_root; \
+	while (~ret && (cmp = (a_cmp)(key, trpn_pointer(a_base,ret)))) { \
+		if (cmp < 0) \
+			ret = trp_left_get(a_base, a_field, ret); \
+		else \
+			ret = trp_right_get(a_base, a_field, ret); \
+	} \
+	return trpn_pointer(a_base, ret); \
+} \
+a_attr uint32_t MAYBE_UNUSED a_pre##insert_recurse(uint32_t cur_node, uint32_t ins_node) \
+{ \
+	if (cur_node == ~0) { \
+		return (ins_node); \
+	} else { \
+		uint32_t ret; \
+		int cmp = (a_cmp)(trpn_pointer(a_base, ins_node), \
+					trpn_pointer(a_base, cur_node)); \
+		if (cmp < 0) { \
+			uint32_t left = a_pre##insert_recurse( \
+				trp_left_get(a_base, a_field, cur_node), ins_node); \
+			trp_left_set(a_base, a_field, cur_node, left); \
+			if (trp_prio_get(left) < trp_prio_get(cur_node)) \
+				trpn_rotate_right(a_base, a_field, cur_node, ret); \
+			else \
+				ret = cur_node; \
+		} else { \
+			uint32_t right = a_pre##insert_recurse( \
+				trp_right_get(a_base, a_field, cur_node), ins_node); \
+			trp_right_set(a_base, a_field, cur_node, right); \
+			if (trp_prio_get(right) < trp_prio_get(cur_node)) \
+				trpn_rotate_left(a_base, a_field, cur_node, ret); \
+			else \
+				ret = cur_node; \
+		} \
+		return (ret); \
+	} \
+} \
+a_attr void MAYBE_UNUSED a_pre##insert(struct trp_root *treap, a_type *node) \
+{ \
+	uint32_t offset = trpn_offset(a_base, node); \
+	trp_node_new(a_base, a_field, offset); \
+	treap->trp_root = a_pre##insert_recurse(treap->trp_root, offset); \
+} \
+a_attr uint32_t MAYBE_UNUSED a_pre##remove_recurse(uint32_t cur_node, uint32_t rem_node) \
+{ \
+	int cmp = a_cmp(trpn_pointer(a_base, rem_node), \
+			trpn_pointer(a_base, cur_node)); \
+	if (cmp == 0) { \
+		uint32_t ret; \
+		uint32_t left = trp_left_get(a_base, a_field, cur_node); \
+		uint32_t right = trp_right_get(a_base, a_field, cur_node); \
+		if (left == ~0) { \
+			if (right == ~0) \
+				return (~0); \
+		} else if (right == ~0 || trp_prio_get(left) < trp_prio_get(right)) { \
+			trpn_rotate_right(a_base, a_field, cur_node, ret); \
+			right = a_pre##remove_recurse(cur_node, rem_node); \
+			trp_right_set(a_base, a_field, ret, right); \
+			return (ret); \
+		} \
+		trpn_rotate_left(a_base, a_field, cur_node, ret); \
+		left = a_pre##remove_recurse(cur_node, rem_node); \
+		trp_left_set(a_base, a_field, ret, left); \
+		return (ret); \
+	} else if (cmp < 0) { \
+		uint32_t left = a_pre##remove_recurse( \
+			trp_left_get(a_base, a_field, cur_node), rem_node); \
+		trp_left_set(a_base, a_field, cur_node, left); \
+		return (cur_node); \
+	} else { \
+		uint32_t right = a_pre##remove_recurse( \
+			trp_right_get(a_base, a_field, cur_node), rem_node); \
+		trp_right_set(a_base, a_field, cur_node, right); \
+		return (cur_node); \
+	} \
+} \
+a_attr void MAYBE_UNUSED a_pre##remove(struct trp_root *treap, a_type *node) \
+{ \
+	treap->trp_root = a_pre##remove_recurse(treap->trp_root, \
+		trpn_offset(a_base, node)); \
+} \
+
+#endif
diff --git a/vcs-svn/trp.txt b/vcs-svn/trp.txt
new file mode 100644
index 0000000..f387aaa
--- /dev/null
+++ b/vcs-svn/trp.txt
@@ -0,0 +1,90 @@
+treap API
+=========
+
+The trp API generates a data structure and functions to handle a
+large growing set of objects stored in a pool.
+
+The caller:
+
+. Specifies parameters for the generated functions with the
+  trp_gen(static, foo_, ...) macro.
+
+. Allocates and clears a `struct trp_node` variable.
+
+. Adds new items to the set using `foo_insert`.
+
+. Can find a specific item in the set using `foo_search`.
+
+. Can iterate over items in the set using `foo_first` and `foo_next`.
+
+. Can remove an item from the set using `foo_remove`.
+
+. The set is never freed.
+
+Example:
+
+----
+struct ex_node {
+	const char *s;
+	struct trp_node ex_link;
+};
+static struct trp_root ex_base;
+obj_pool_gen(ex, struct ex_node, 4096);
+trp_gen(static, ex_, struct ex_node, ex_link, ex, strcmp)
+struct ex_node *item;
+
+item = ex_pointer(ex_alloc(1));
+item->s = "hello";
+ex_insert(&ex_base, item);
+item = ex_pointer(ex_alloc(1));
+item->s = "goodbye";
+ex_insert(&ex_base, item);
+for (item = ex_first(&ex_base); item; item = ex_next(&ex_base, item))
+	printf("%s\n", item->s);
+----
+
+Functions
+---------
+
+trp_gen(attr, foo_, node_type, link_field, pool, cmp)::
+
+	Generate a type-specific treap implementation.
++
+. The storage class for generated functions will be 'attr' (e.g., `static`).
+. Generated function names are prefixed with 'foo_' (e.g., `treap_`).
+. Treap nodes will be of type 'node_type' (e.g., `struct treap_node`).
+  This type must be a struct with at least one `struct trp_node` field
+  to point to its children.
+. The field used to access child nodes will be 'link_field'.
+. All treap nodes must lie in the 'pool' object pool.
+. Treap nodes must be totally ordered by the 'cmp' relation, with the
+  following prototype:
++
+int (*cmp)(node_type \*a, node_type \*b)
++
+and returning a value less than, equal to, or greater than zero
+according to the result of comparison.
+
+void foo_insert(struct trp_root *treap, node_type \*node)::
+
+	Insert node into treap.  If inserted multiple times,
+	a node will appear in the treap multiple times.
+
+void foo_remove(struct trp_root *treap, node_type \*node)::
+
+	Remove node from treap.  Caller must ensure node is
+	present in treap before using this function.
+
+node_type *foo_search(struct trp_root \*treap, node_type \*key)::
+
+	Search for a node that matches key.  If no match is found,
+	return what would be key's successor, were key in treap
+	(NULL if no successor).
+
+node_type *foo_first(struct trp_root \*treap)::
+
+	Find the first item from the treap, in sorted order.
+
+node_type *foo_next(struct trp_root \*treap, node_type \*node)::
+
+	Find the next item.
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 5/9] Add string-specific memory pool
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
                   ` (3 preceding siblings ...)
  2010-06-24 10:57 ` [PATCH 4/9] Add treap implementation Jonathan Nieder
@ 2010-06-24 10:58 ` Jonathan Nieder
  2010-06-24 19:19   ` Ramkumar Ramachandra
  2010-06-24 11:01 ` [PATCH 6/9] Add stream helper library Jonathan Nieder
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 10:58 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

From: David Barr <david.barr@cordelta.com>

Intern strings so they can be compared by address and stored without
wasting space.

This library uses the macros in the obj_pool.h and trp.h to create a
memory pool for strings and expose an API for handling them.

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Makefile              |    4 +-
 vcs-svn/string_pool.c |  114 +++++++++++++++++++++++++++++++++++++++++++++++++
 vcs-svn/string_pool.h |   15 ++++++
 3 files changed, 131 insertions(+), 2 deletions(-)
 create mode 100644 vcs-svn/string_pool.c
 create mode 100644 vcs-svn/string_pool.h

diff --git a/Makefile b/Makefile
index 663a366..e11e588 100644
--- a/Makefile
+++ b/Makefile
@@ -1740,7 +1740,7 @@ ifndef NO_CURL
 endif
 XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \
 	xdiff/xmerge.o xdiff/xpatience.o
-VCSSVN_OBJS =
+VCSSVN_OBJS = vcs-svn/string_pool.o
 OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS)
 
 dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
@@ -1864,7 +1864,7 @@ xdiff-interface.o $(XDIFF_OBJS): \
 	xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h
 
 $(VCSSVN_OBJS): \
-	vcs-svn/obj_pool.h vcs-svn/trp.h
+	vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h
 endif
 
 exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
diff --git a/vcs-svn/string_pool.c b/vcs-svn/string_pool.c
new file mode 100644
index 0000000..bd5a380
--- /dev/null
+++ b/vcs-svn/string_pool.c
@@ -0,0 +1,114 @@
+/*
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#include "git-compat-util.h"
+#include "trp.h"
+#include "obj_pool.h"
+#include "string_pool.h"
+
+static struct trp_root tree = { ~0 };
+
+struct node {
+	uint32_t offset;
+	struct trp_node children;
+};
+
+/* Two memory pools: one for struct node, and another for strings */
+obj_pool_gen(node, struct node, 4096);
+obj_pool_gen(string, char, 4096);
+
+static char *node_value(struct node *node)
+{
+	return node ? string_pointer(node->offset) : NULL;
+}
+
+static int node_cmp(struct node *a, struct node *b)
+{
+	return strcmp(node_value(a), node_value(b));
+}
+
+/* Build a Treap from the node structure (a trp_node w/ offset) */
+trp_gen(static, tree_, struct node, children, node, node_cmp);
+
+char *pool_fetch(uint32_t entry)
+{
+	return node_value(node_pointer(entry));
+}
+
+uint32_t pool_intern(char *key)
+{
+	/* Canonicalize key */
+	struct node *match = NULL;
+	uint32_t key_len;
+	if (key == NULL)
+		return ~0;
+	key_len = strlen(key) + 1;
+	struct node *node = node_pointer(node_alloc(1));
+	node->offset = string_alloc(key_len);
+	strcpy(node_value(node), key);
+	match = tree_search(&tree, node);
+	if (!match) {
+		tree_insert(&tree, node);
+	} else {
+		node_free(1);
+		string_free(key_len);
+		node = match;
+	}
+	return node_offset(node);
+}
+
+uint32_t pool_tok_r(char *str, const char *delim, char **saveptr)
+{
+	char *token = strtok_r(str, delim, saveptr);
+	return token ? pool_intern(token) : ~0;
+}
+
+void pool_print_seq(uint32_t len, uint32_t *seq, char delim, FILE *stream)
+{
+	uint32_t i;
+	for (i = 0; i < len && ~seq[i]; i++) {
+		fputs(pool_fetch(seq[i]), stream);
+		if (i < len - 1 && ~seq[i + 1])
+			fputc(delim, stream);
+	}
+}
+
+uint32_t pool_tok_seq(uint32_t max, uint32_t *seq, char *delim, char *str)
+{
+	char *context = NULL;
+	uint32_t length = 0, token = str ? pool_tok_r(str, delim, &context) : ~0;
+	while (length < max) {
+		seq[length++] = token;
+		if (token == ~0)
+			break;
+		token = pool_tok_r(NULL, delim, &context);
+	}
+	seq[length ? length - 1 : 0] = ~0;
+	return length;
+}
+
+void pool_init(void)
+{
+	uint32_t node;
+	uint32_t string = 0;
+	string_init();
+	while (string < string_pool.size) {
+		node = node_alloc(1);
+		node_pointer(node)->offset = string;
+		tree_insert(&tree, node_pointer(node));
+		string += strlen(string_pointer(string)) + 1;
+	}
+}
+
+void pool_commit(void)
+{
+	string_commit();
+}
+
+void pool_reset(void)
+{
+	node_reset();
+	string_reset();
+}
diff --git a/vcs-svn/string_pool.h b/vcs-svn/string_pool.h
new file mode 100644
index 0000000..085e6d7
--- /dev/null
+++ b/vcs-svn/string_pool.h
@@ -0,0 +1,15 @@
+#ifndef STRING_POOL_H_
+#define STRING_POOL_H_
+
+#include "git-compat-util.h"
+
+uint32_t pool_intern(char *key);
+char *pool_fetch(uint32_t entry);
+uint32_t pool_tok_r(char *str, const char *delim, char **saveptr);
+void pool_print_seq(uint32_t len, uint32_t *seq, char delim, FILE *stream);
+uint32_t pool_tok_seq(uint32_t max, uint32_t *seq, char *delim, char *str);
+void pool_init(void);
+void pool_commit(void);
+void pool_reset(void);
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 6/9] Add stream helper library
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
                   ` (4 preceding siblings ...)
  2010-06-24 10:58 ` [PATCH 5/9] Add string-specific memory pool Jonathan Nieder
@ 2010-06-24 11:01 ` Jonathan Nieder
  2010-06-24 21:23   ` Ramkumar Ramachandra
  2010-06-24 11:02 ` [PATCH 7/9] Add infrastructure to write revisions in fast-export format Jonathan Nieder
                   ` (3 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 11:01 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

From: David Barr <david.barr@cordelta.com>

This library provides thread-unsafe fgets()- and fread()-like
functions where the caller does not have to supply a buffer.  It
maintains a couple of static buffers and provides an API to use
them.

NEEDSWORK: what should buffer_copy_bytes do on error?

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Makefile              |    5 ++-
 vcs-svn/line_buffer.c |   93 +++++++++++++++++++++++++++++++++++++++++++++++++
 vcs-svn/line_buffer.h |   14 +++++++
 3 files changed, 110 insertions(+), 2 deletions(-)
 create mode 100644 vcs-svn/line_buffer.c
 create mode 100644 vcs-svn/line_buffer.h

diff --git a/Makefile b/Makefile
index e11e588..8223d9b 100644
--- a/Makefile
+++ b/Makefile
@@ -1740,7 +1740,7 @@ ifndef NO_CURL
 endif
 XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \
 	xdiff/xmerge.o xdiff/xpatience.o
-VCSSVN_OBJS = vcs-svn/string_pool.o
+VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o
 OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS)
 
 dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
@@ -1864,7 +1864,8 @@ xdiff-interface.o $(XDIFF_OBJS): \
 	xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h
 
 $(VCSSVN_OBJS): \
-	vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h
+	vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h \
+	vcs-svn/line_buffer.h
 endif
 
 exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
diff --git a/vcs-svn/line_buffer.c b/vcs-svn/line_buffer.c
new file mode 100644
index 0000000..0f83426
--- /dev/null
+++ b/vcs-svn/line_buffer.c
@@ -0,0 +1,93 @@
+/*
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#include "git-compat-util.h"
+
+#include "line_buffer.h"
+#include "obj_pool.h"
+
+#define LINE_BUFFER_LEN 10000
+#define COPY_BUFFER_LEN 4096
+
+/* Create memory pool for char sequence of known length */
+obj_pool_gen(blob, char, 4096);
+
+static char line_buffer[LINE_BUFFER_LEN];
+static char byte_buffer[COPY_BUFFER_LEN];
+static FILE *infile;
+
+int buffer_init(const char *filename)
+{
+	infile = filename ? fopen(filename, "r") : stdin;
+	if (!infile)
+		return -1;
+	return 0;
+}
+
+int buffer_deinit()
+{
+	fclose(infile);
+	return 0;
+}
+
+/* Read a line without trailing newline. */
+char *buffer_read_line(void)
+{
+	char *end;
+	if (!fgets(line_buffer, sizeof(line_buffer), infile))
+		/* Error or data exhausted. */
+		return NULL;
+	end = line_buffer + strlen(line_buffer);
+	if (end[-1] == '\n')
+		end[-1] = '\0';
+	else if (feof(infile))
+		; /* No newline at end of file.  That's fine. */
+	else
+		/*
+		 * Line was too long.
+		 * There is probably a saner way to deal with this,
+		 * but for now let's return an error.
+		 */
+		return NULL;
+	return line_buffer;
+}
+
+char *buffer_read_string(uint32_t len)
+{
+	char *s;
+	blob_free(blob_pool.size);
+	s = blob_pointer(blob_alloc(len + 1));
+	s[fread(s, 1, len, infile)] = '\0';
+	return ferror(infile) ? NULL : s;
+}
+
+void buffer_copy_bytes(uint32_t len)
+{
+	uint32_t in;
+	while (len > 0 && !feof(infile)) {
+		in = len < COPY_BUFFER_LEN ? len : COPY_BUFFER_LEN;
+		in = fread(byte_buffer, 1, in, infile);
+		len -= in;
+		fwrite(byte_buffer, 1, in, stdout);
+		if (ferror(infile) || ferror(stdout))
+			/* NEEDSWORK: handle error. */
+			break;
+	}
+}
+
+void buffer_skip_bytes(uint32_t len)
+{
+	uint32_t in;
+	while (len > 0 && !feof(infile) && !ferror(infile)) {
+		in = len < COPY_BUFFER_LEN ? len : COPY_BUFFER_LEN;
+		in = fread(byte_buffer, 1, in, infile);
+		len -= in;
+	}
+}
+
+void buffer_reset(void)
+{
+	blob_reset();
+}
diff --git a/vcs-svn/line_buffer.h b/vcs-svn/line_buffer.h
new file mode 100644
index 0000000..631d1df
--- /dev/null
+++ b/vcs-svn/line_buffer.h
@@ -0,0 +1,14 @@
+#ifndef LINE_BUFFER_H_
+#define LINE_BUFFER_H_
+
+#include "git-compat-util.h"
+
+int buffer_init(const char *filename);
+int buffer_deinit(void);
+char *buffer_read_line(void);
+char *buffer_read_string(uint32_t len);
+void buffer_copy_bytes(uint32_t len);
+void buffer_skip_bytes(uint32_t len);
+void buffer_reset(void);
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 7/9] Add infrastructure to write revisions in fast-export format
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
                   ` (5 preceding siblings ...)
  2010-06-24 11:01 ` [PATCH 6/9] Add stream helper library Jonathan Nieder
@ 2010-06-24 11:02 ` Jonathan Nieder
  2010-06-24 19:29   ` Ramkumar Ramachandra
  2010-06-24 11:03 ` [PATCH 8/9] Add SVN dump parser Jonathan Nieder
                   ` (2 subsequent siblings)
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 11:02 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

From: David Barr <david.barr@cordelta.com>

repo_tree maintains the exporter's state and provides a facility to to
call fast_export, which writes objects to stdout suitable for
consumption by fast-import.

The exported functions roughly correspond to Subversion FS operations.

 . repo_add adds a file to the current commit.

 . repo_modify adds a replacement for an existing file;
   it is implemented exactly the same way, but a check could be
   added later to distinguish the two cases.

 . repo_copy copies a blob from a previous revision to the current
   commit.

 . repo_replace modifies the content of a file from the current
   commit, if and only if it exists.

 . repo_delete removes a file or directory from the current commit.

 . repo_commit calls out to fast_export to write the current commit to
   the fast-import stream in stdout.

 . repo_diff is used by the fast_export module to write the changes
   for a commit.

 . repo_reset erases the exporter's state, so valgrind can be happy.

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Makefile              |    5 +-
 vcs-svn/fast_export.c |   75 +++++++++++
 vcs-svn/fast_export.h |   14 ++
 vcs-svn/repo_tree.c   |  335 +++++++++++++++++++++++++++++++++++++++++++++++++
 vcs-svn/repo_tree.h   |   26 ++++
 5 files changed, 453 insertions(+), 2 deletions(-)
 create mode 100644 vcs-svn/fast_export.c
 create mode 100644 vcs-svn/fast_export.h
 create mode 100644 vcs-svn/repo_tree.c
 create mode 100644 vcs-svn/repo_tree.h

diff --git a/Makefile b/Makefile
index 8223d9b..7c66dcc 100644
--- a/Makefile
+++ b/Makefile
@@ -1740,7 +1740,8 @@ ifndef NO_CURL
 endif
 XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \
 	xdiff/xmerge.o xdiff/xpatience.o
-VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o
+VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o \
+	vcs-svn/repo_tree.o vcs-svn/fast_export.o
 OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS)
 
 dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
@@ -1865,7 +1866,7 @@ xdiff-interface.o $(XDIFF_OBJS): \
 
 $(VCSSVN_OBJS): \
 	vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h \
-	vcs-svn/line_buffer.h
+	vcs-svn/line_buffer.h vcs-svn/repo_tree.h vcs-svn/fast_export.h
 endif
 
 exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
diff --git a/vcs-svn/fast_export.c b/vcs-svn/fast_export.c
new file mode 100644
index 0000000..7552803
--- /dev/null
+++ b/vcs-svn/fast_export.c
@@ -0,0 +1,75 @@
+/*
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#include "git-compat-util.h"
+
+#include "fast_export.h"
+#include "line_buffer.h"
+#include "repo_tree.h"
+#include "string_pool.h"
+
+#define MAX_GITSVN_LINE_LEN 4096
+
+static uint32_t first_commit_done;
+
+void fast_export_delete(uint32_t depth, uint32_t *path)
+{
+	putchar('D');
+	putchar(' ');
+	pool_print_seq(depth, path, '/', stdout);
+	putchar('\n');
+}
+
+void fast_export_modify(uint32_t depth, uint32_t *path, uint32_t mode,
+                        uint32_t mark)
+{
+	/* Mode must be 100644, 100755, 120000, or 160000. */
+	printf("M %06o :%d ", mode, mark);
+	pool_print_seq(depth, path, '/', stdout);
+	putchar('\n');
+}
+
+static char gitsvnline[MAX_GITSVN_LINE_LEN];
+void fast_export_commit(uint32_t revision, uint32_t author, char *log,
+			uint32_t uuid, uint32_t url,
+			unsigned long timestamp)
+{
+	if (!log)
+		log = "";
+	if (~uuid && ~url) {
+		snprintf(gitsvnline, MAX_GITSVN_LINE_LEN, "\n\ngit-svn-id: %s@%d %s\n",
+				 pool_fetch(url), revision, pool_fetch(uuid));
+	} else {
+		*gitsvnline = '\0';
+	}
+	printf("commit refs/heads/master\n");
+	printf("committer %s <%s@%s> %ld +0000\n",
+		   ~author ? pool_fetch(author) : "nobody",
+		   ~author ? pool_fetch(author) : "nobody",
+		   ~uuid ? pool_fetch(uuid) : "local", timestamp);
+	printf("data %zd\n%s%s\n",
+		   strlen(log) + strlen(gitsvnline), log, gitsvnline);
+	if (!first_commit_done) {
+		if (revision > 1)
+			printf("from refs/heads/master^0\n");
+		first_commit_done = 1;
+	}
+	repo_diff(revision - 1, revision);
+	fputc('\n', stdout);
+
+	printf("progress Imported commit %d.\n\n", revision);
+}
+
+void fast_export_blob(uint32_t mode, uint32_t mark, uint32_t len)
+{
+	if (mode == REPO_MODE_LNK) {
+		/* svn symlink blobs start with "link " */
+		buffer_skip_bytes(5);
+		len -= 5;
+	}
+	printf("blob\nmark :%d\ndata %d\n", mark, len);
+	buffer_copy_bytes(len);
+	fputc('\n', stdout);
+}
diff --git a/vcs-svn/fast_export.h b/vcs-svn/fast_export.h
new file mode 100644
index 0000000..47e8f56
--- /dev/null
+++ b/vcs-svn/fast_export.h
@@ -0,0 +1,14 @@
+#ifndef FAST_EXPORT_H_
+#define FAST_EXPORT_H_
+
+#include <stdint.h>
+#include <time.h>
+
+void fast_export_delete(uint32_t depth, uint32_t *path);
+void fast_export_modify(uint32_t depth, uint32_t *path, uint32_t mode,
+			uint32_t mark);
+void fast_export_commit(uint32_t revision, uint32_t author, char *log,
+			uint32_t uuid, uint32_t url, unsigned long timestamp);
+void fast_export_blob(uint32_t mode, uint32_t mark, uint32_t len);
+
+#endif
diff --git a/vcs-svn/repo_tree.c b/vcs-svn/repo_tree.c
new file mode 100644
index 0000000..59a7434
--- /dev/null
+++ b/vcs-svn/repo_tree.c
@@ -0,0 +1,335 @@
+/*
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#include "git-compat-util.h"
+
+#include "string_pool.h"
+#include "repo_tree.h"
+#include "obj_pool.h"
+#include "fast_export.h"
+
+#include "trp.h"
+
+struct repo_dirent {
+	uint32_t name_offset;
+	struct trp_node children;
+	uint32_t mode;
+	uint32_t content_offset;
+};
+
+struct repo_dir {
+	struct trp_root entries;
+};
+
+struct repo_commit {
+	uint32_t root_dir_offset;
+};
+
+/* Memory pools for commit, dir and dirent */
+obj_pool_gen(commit, struct repo_commit, 4096);
+obj_pool_gen(dir, struct repo_dir, 4096);
+obj_pool_gen(dirent, struct repo_dirent, 4096);
+
+static uint32_t active_commit;
+static uint32_t mark;
+
+static int repo_dirent_name_cmp(const void *a, const void *b);
+
+/* Treap for directory entries */
+trp_gen(static, dirent_, struct repo_dirent, children, dirent, repo_dirent_name_cmp);
+
+uint32_t next_blob_mark(void)
+{
+	return mark++;
+}
+
+static struct repo_dir *repo_commit_root_dir(struct repo_commit *commit)
+{
+	return dir_pointer(commit->root_dir_offset);
+}
+
+static struct repo_dirent *repo_first_dirent(struct repo_dir *dir)
+{
+	return dirent_first(&dir->entries);
+}
+
+static int repo_dirent_name_cmp(const void *a, const void *b)
+{
+	const struct repo_dirent *dirent1 = a, *dirent2 = b;
+	uint32_t a_offset = dirent1->name_offset;
+	uint32_t b_offset = dirent2->name_offset;
+	return (a_offset > b_offset) - (a_offset < b_offset);
+}
+
+static int repo_dirent_is_dir(struct repo_dirent *dirent)
+{
+	return dirent != NULL && dirent->mode == REPO_MODE_DIR;
+}
+
+static struct repo_dir *repo_dir_from_dirent(struct repo_dirent *dirent)
+{
+	if (!repo_dirent_is_dir(dirent))
+		return NULL;
+	return dir_pointer(dirent->content_offset);
+}
+
+static struct repo_dir *repo_clone_dir(struct repo_dir *orig_dir)
+{
+	uint32_t orig_o, new_o;
+	orig_o = dir_offset(orig_dir);
+	if (orig_o >= dir_pool.committed)
+		return orig_dir;
+	new_o = dir_alloc(1);
+	orig_dir = dir_pointer(orig_o);
+	*dir_pointer(new_o) = *orig_dir;
+	return dir_pointer(new_o);
+}
+
+static struct repo_dirent *repo_read_dirent(uint32_t revision, uint32_t *path)
+{
+	uint32_t name = 0;
+	struct repo_dirent *key = dirent_pointer(dirent_alloc(1));
+	struct repo_dir *dir = NULL;
+	struct repo_dirent *dirent = NULL;
+	dir = repo_commit_root_dir(commit_pointer(revision));
+	while (~(name = *path++)) {
+		key->name_offset = name;
+		dirent = dirent_search(&dir->entries, key);
+		if (dirent == NULL || !repo_dirent_is_dir(dirent))
+			break;
+		dir = repo_dir_from_dirent(dirent);
+	}
+	dirent_free(1);
+	return dirent;
+}
+
+static void repo_write_dirent(uint32_t *path, uint32_t mode,
+                              uint32_t content_offset, uint32_t del)
+{
+	uint32_t name, revision, dir_o = ~0, parent_dir_o = ~0;
+	struct repo_dir *dir;
+	struct repo_dirent *key;
+	struct repo_dirent *dirent = NULL;
+	revision = active_commit;
+	dir = repo_commit_root_dir(commit_pointer(revision));
+	dir = repo_clone_dir(dir);
+	commit_pointer(revision)->root_dir_offset = dir_offset(dir);
+	while (~(name = *path++)) {
+		parent_dir_o = dir_offset(dir);
+
+		key = dirent_pointer(dirent_alloc(1));
+		key->name_offset = name;
+
+		dirent = dirent_search(&dir->entries, key);
+		if (dirent == NULL)
+			dirent = key;
+		else
+			dirent_free(1);
+
+		if (dirent == key) {
+			dirent->mode = REPO_MODE_DIR;
+			dirent->content_offset = 0;
+			dirent_insert(&dir->entries, dirent);
+		}
+
+		if (dirent_offset(dirent) < dirent_pool.committed) {
+			dir_o = repo_dirent_is_dir(dirent) ?
+					dirent->content_offset : ~0;
+			dirent_remove(&dir->entries, dirent);
+			dirent = dirent_pointer(dirent_alloc(1));
+			dirent->name_offset = name;
+			dirent->mode = REPO_MODE_DIR;
+			dirent->content_offset = dir_o;
+			dirent_insert(&dir->entries, dirent);
+		}
+
+		dir = repo_dir_from_dirent(dirent);
+		dir = repo_clone_dir(dir);
+		dirent->content_offset = dir_offset(dir);
+	}
+	if (dirent == NULL)
+		return;
+	dirent->mode = mode;
+	dirent->content_offset = content_offset;
+	if (del && ~parent_dir_o)
+		dirent_remove(&dir_pointer(parent_dir_o)->entries, dirent);
+}
+
+uint32_t repo_copy(uint32_t revision, uint32_t *src, uint32_t *dst)
+{
+	uint32_t mode = 0, content_offset = 0;
+	struct repo_dirent *src_dirent;
+	src_dirent = repo_read_dirent(revision, src);
+	if (src_dirent != NULL) {
+		mode = src_dirent->mode;
+		content_offset = src_dirent->content_offset;
+		repo_write_dirent(dst, mode, content_offset, 0);
+	}
+	return mode;
+}
+
+void repo_add(uint32_t *path, uint32_t mode, uint32_t blob_mark)
+{
+	repo_write_dirent(path, mode, blob_mark, 0);
+}
+
+uint32_t repo_replace(uint32_t *path, uint32_t blob_mark)
+{
+	uint32_t mode = 0;
+	struct repo_dirent *src_dirent;
+	src_dirent = repo_read_dirent(active_commit, path);
+	if (src_dirent != NULL) {
+		mode = src_dirent->mode;
+		repo_write_dirent(path, mode, blob_mark, 0);
+	}
+	return mode;
+}
+
+void repo_modify(uint32_t *path, uint32_t mode, uint32_t blob_mark)
+{
+	struct repo_dirent *src_dirent;
+	src_dirent = repo_read_dirent(active_commit, path);
+	if (src_dirent != NULL && blob_mark == 0)
+		blob_mark = src_dirent->content_offset;
+	repo_write_dirent(path, mode, blob_mark, 0);
+}
+
+void repo_delete(uint32_t *path)
+{
+	repo_write_dirent(path, 0, 0, 1);
+}
+
+static void repo_git_add_r(uint32_t depth, uint32_t *path, struct repo_dir *dir);
+
+static void repo_git_add(uint32_t depth, uint32_t *path, struct repo_dirent *dirent)
+{
+	if (repo_dirent_is_dir(dirent))
+		repo_git_add_r(depth, path, repo_dir_from_dirent(dirent));
+	else
+		fast_export_modify(depth, path,
+		                   dirent->mode, dirent->content_offset);
+}
+
+static void repo_git_add_r(uint32_t depth, uint32_t *path, struct repo_dir *dir)
+{
+	struct repo_dirent *de = repo_first_dirent(dir);
+	while (de) {
+		path[depth] = de->name_offset;
+		repo_git_add(depth + 1, path, de);
+		de = dirent_next(&dir->entries, de);
+	}
+}
+
+static void repo_diff_r(uint32_t depth, uint32_t *path, struct repo_dir *dir1,
+                        struct repo_dir *dir2)
+{
+	struct repo_dirent *de1, *de2;
+	de1 = repo_first_dirent(dir1);
+	de2 = repo_first_dirent(dir2);
+
+	while (de1 && de2) {
+		if (de1->name_offset < de2->name_offset) {
+			path[depth] = de1->name_offset;
+			fast_export_delete(depth + 1, path);
+			de1 = dirent_next(&dir1->entries, de1);
+			continue;
+		}
+		if (de1->name_offset > de2->name_offset) {
+			path[depth] = de2->name_offset;
+			repo_git_add(depth + 1, path, de2);
+			de2 = dirent_next(&dir2->entries, de2);
+			continue;
+		}
+		path[depth] = de1->name_offset;
+
+		if (de1->mode == de2->mode &&
+		    de1->content_offset == de2->content_offset) {
+			; /* No change. */
+		} else if (repo_dirent_is_dir(de1) && repo_dirent_is_dir(de2)) {
+			repo_diff_r(depth + 1, path,
+				    repo_dir_from_dirent(de1),
+				    repo_dir_from_dirent(de2));
+		} else if (!repo_dirent_is_dir(de1) && !repo_dirent_is_dir(de2)) {
+			repo_git_add(depth + 1, path, de2);
+		} else {
+			fast_export_delete(depth + 1, path);
+			repo_git_add(depth + 1, path, de2);
+		}
+		de1 = dirent_next(&dir1->entries, de1);
+		de2 = dirent_next(&dir2->entries, de2);
+	}
+	while (de1) {
+		path[depth] = de1->name_offset;
+		fast_export_delete(depth + 1, path);
+		de1 = dirent_next(&dir1->entries, de1);
+	}
+	while (de2) {
+		path[depth] = de2->name_offset;
+		repo_git_add(depth + 1, path, de2);
+		de2 = dirent_next(&dir2->entries, de2);
+	}
+}
+
+static uint32_t path_stack[REPO_MAX_PATH_DEPTH];
+
+void repo_diff(uint32_t r1, uint32_t r2)
+{
+	repo_diff_r(0,
+	            path_stack,
+	            repo_commit_root_dir(commit_pointer(r1)),
+	            repo_commit_root_dir(commit_pointer(r2)));
+}
+
+void repo_commit(uint32_t revision, uint32_t author, char *log, uint32_t uuid,
+                 uint32_t url, unsigned long timestamp)
+{
+	fast_export_commit(revision, author, log, uuid, url, timestamp);
+	pool_commit();
+	dirent_commit();
+	dir_commit();
+	commit_commit();
+	active_commit = commit_alloc(1);
+	commit_pointer(active_commit)->root_dir_offset =
+		commit_pointer(active_commit - 1)->root_dir_offset;
+}
+
+static void mark_init(void)
+{
+	uint32_t i;
+	mark = 0;
+	for (i = 0; i < dirent_pool.size; i++)
+		if (!repo_dirent_is_dir(dirent_pointer(i)) &&
+		    dirent_pointer(i)->content_offset > mark)
+			mark = dirent_pointer(i)->content_offset;
+	mark++;
+}
+
+void repo_init() {
+	pool_init();
+	commit_init();
+	dir_init();
+	dirent_init();
+	mark_init();
+	if (commit_pool.size == 0) {
+		/* Create empty tree for commit 0. */
+		commit_alloc(1);
+		commit_pointer(0)->root_dir_offset = dir_alloc(1);
+		dir_pointer(0)->entries.trp_root = ~0;
+		dir_commit();
+		commit_commit();
+	}
+	/* Preallocate next commit, ready for changes. */
+	active_commit = commit_alloc(1);
+	commit_pointer(active_commit)->root_dir_offset =
+		commit_pointer(active_commit - 1)->root_dir_offset;
+}
+
+void repo_reset(void)
+{
+	pool_reset();
+	commit_reset();
+	dir_reset();
+	dirent_reset();
+}
diff --git a/vcs-svn/repo_tree.h b/vcs-svn/repo_tree.h
new file mode 100644
index 0000000..92a7a7b
--- /dev/null
+++ b/vcs-svn/repo_tree.h
@@ -0,0 +1,26 @@
+#ifndef REPO_TREE_H_
+#define REPO_TREE_H_
+
+#include "git-compat-util.h"
+
+#define REPO_MODE_DIR 0040000
+#define REPO_MODE_BLB 0100644
+#define REPO_MODE_EXE 0100755
+#define REPO_MODE_LNK 0120000
+
+#define REPO_MAX_PATH_LEN 4096
+#define REPO_MAX_PATH_DEPTH 1000
+
+uint32_t next_blob_mark(void);
+uint32_t repo_copy(uint32_t revision, uint32_t *src, uint32_t *dst);
+void repo_add(uint32_t *path, uint32_t mode, uint32_t blob_mark);
+uint32_t repo_replace(uint32_t *path, uint32_t blob_mark);
+void repo_modify(uint32_t *path, uint32_t mode, uint32_t blob_mark);
+void repo_delete(uint32_t *path);
+void repo_commit(uint32_t revision, uint32_t author, char *log, uint32_t uuid,
+                 uint32_t url, long unsigned timestamp);
+void repo_diff(uint32_t r1, uint32_t r2);
+void repo_init(void);
+void repo_reset(void);
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 8/9] Add SVN dump parser
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
                   ` (6 preceding siblings ...)
  2010-06-24 11:02 ` [PATCH 7/9] Add infrastructure to write revisions in fast-export format Jonathan Nieder
@ 2010-06-24 11:03 ` Jonathan Nieder
  2010-06-24 20:33   ` Ramkumar Ramachandra
  2010-06-24 11:07 ` [PATCH 9/9] Add a sample user for the svndump library Jonathan Nieder
  2010-06-24 13:06 ` [PATCH/RFC v2 0/9] Subversion dump parsing library Ramkumar Ramachandra
  9 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 11:03 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

From: David Barr <david.barr@cordelta.com>

svndump parses data that is in SVN dumpfile format produced by
`svnadmin dump` with the help of line_buffer and uses repo_tree and
fast_export to emit a git fast-import stream.

Based roughly on com.hydrografix.svndump 0.92 from the SvnToCCase
project at <http://svn2cc.sarovar.org/>, by Stefan Hegny and
others.

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
 Makefile          |    5 +-
 vcs-svn/LICENSE   |    4 +
 vcs-svn/svndump.c |  289 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 vcs-svn/svndump.h |    8 ++
 4 files changed, 304 insertions(+), 2 deletions(-)
 create mode 100644 vcs-svn/svndump.c
 create mode 100644 vcs-svn/svndump.h

diff --git a/Makefile b/Makefile
index 7c66dcc..e7b37e0 100644
--- a/Makefile
+++ b/Makefile
@@ -1741,7 +1741,7 @@ endif
 XDIFF_OBJS = xdiff/xdiffi.o xdiff/xprepare.o xdiff/xutils.o xdiff/xemit.o \
 	xdiff/xmerge.o xdiff/xpatience.o
 VCSSVN_OBJS = vcs-svn/string_pool.o vcs-svn/line_buffer.o \
-	vcs-svn/repo_tree.o vcs-svn/fast_export.o
+	vcs-svn/repo_tree.o vcs-svn/fast_export.o vcs-svn/svndump.o
 OBJECTS := $(GIT_OBJS) $(XDIFF_OBJS) $(VCSSVN_OBJS)
 
 dep_files := $(foreach f,$(OBJECTS),$(dir $f).depend/$(notdir $f).d)
@@ -1866,7 +1866,8 @@ xdiff-interface.o $(XDIFF_OBJS): \
 
 $(VCSSVN_OBJS): \
 	vcs-svn/obj_pool.h vcs-svn/trp.h vcs-svn/string_pool.h \
-	vcs-svn/line_buffer.h vcs-svn/repo_tree.h vcs-svn/fast_export.h
+	vcs-svn/line_buffer.h vcs-svn/repo_tree.h vcs-svn/fast_export.h \
+	vcs-svn/svndump.h
 endif
 
 exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
diff --git a/vcs-svn/LICENSE b/vcs-svn/LICENSE
index a3d384c..0a5e3c4 100644
--- a/vcs-svn/LICENSE
+++ b/vcs-svn/LICENSE
@@ -4,6 +4,10 @@ All rights reserved.
 Copyright (C) 2008 Jason Evans <jasone@canonware.com>.
 All rights reserved.
 
+Copyright (C) 2005 Stefan Hegny, hydrografix Consulting GmbH,
+Frankfurt/Main, Germany
+and others, see http://svn2cc.sarovar.org
+
 Redistribution and use in source and binary forms, with or without
 modification, are permitted provided that the following conditions
 are met:
diff --git a/vcs-svn/svndump.c b/vcs-svn/svndump.c
new file mode 100644
index 0000000..86714ed
--- /dev/null
+++ b/vcs-svn/svndump.c
@@ -0,0 +1,289 @@
+/*
+ * Parse and rearrange a svnadmin dump.
+ * Create the dump with:
+ * svnadmin dump --incremental -r<startrev>:<endrev> <repository> >outfile
+ *
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#include "cache.h"
+#include "repo_tree.h"
+#include "fast_export.h"
+#include "line_buffer.h"
+#include "obj_pool.h"
+#include "string_pool.h"
+
+#define NODEACT_REPLACE 4
+#define NODEACT_DELETE 3
+#define NODEACT_ADD 2
+#define NODEACT_CHANGE 1
+#define NODEACT_UNKNOWN 0
+
+#define DUMP_CTX 0
+#define REV_CTX  1
+#define NODE_CTX 2
+
+#define LENGTH_UNKNOWN (~0)
+#define DATE_RFC2822_LEN 31
+
+/* Create memory pool for log messages */
+obj_pool_gen(log, char, 4096);
+
+static char* log_copy(uint32_t length, char *log)
+{
+	char *buffer;
+	log_free(log_pool.size);
+	buffer = log_pointer(log_alloc(length));
+	strncpy(buffer, log, length);
+	return buffer;
+}
+
+static struct {
+	uint32_t action, propLength, textLength, srcRev, srcMode, mark, type;
+	uint32_t src[REPO_MAX_PATH_DEPTH], dst[REPO_MAX_PATH_DEPTH];
+} node_ctx;
+
+static struct {
+	uint32_t revision, author;
+	unsigned long timestamp;
+	char *log;
+} rev_ctx;
+
+static struct {
+	uint32_t uuid, url;
+} dump_ctx;
+
+static struct {
+	uint32_t svn_log, svn_author, svn_date, svn_executable, svn_special, uuid,
+		revision_number, node_path, node_kind, node_action,
+		node_copyfrom_path, node_copyfrom_rev, text_content_length,
+		prop_content_length, content_length;
+} keys;
+
+static void reset_node_ctx(char *fname)
+{
+	node_ctx.type = 0;
+	node_ctx.action = NODEACT_UNKNOWN;
+	node_ctx.propLength = LENGTH_UNKNOWN;
+	node_ctx.textLength = LENGTH_UNKNOWN;
+	node_ctx.src[0] = ~0;
+	node_ctx.srcRev = 0;
+	node_ctx.srcMode = 0;
+	pool_tok_seq(REPO_MAX_PATH_DEPTH, node_ctx.dst, "/", fname);
+	node_ctx.mark = 0;
+}
+
+static void reset_rev_ctx(uint32_t revision)
+{
+	rev_ctx.revision = revision;
+	rev_ctx.timestamp = 0;
+	rev_ctx.log = NULL;
+	rev_ctx.author = ~0;
+}
+
+static void reset_dump_ctx(uint32_t url)
+{
+	dump_ctx.url = url;
+	dump_ctx.uuid = ~0;
+}
+
+static void init_keys(void)
+{
+	keys.svn_log = pool_intern("svn:log");
+	keys.svn_author = pool_intern("svn:author");
+	keys.svn_date = pool_intern("svn:date");
+	keys.svn_executable = pool_intern("svn:executable");
+	keys.svn_special = pool_intern("svn:special");
+	keys.uuid = pool_intern("UUID");
+	keys.revision_number = pool_intern("Revision-number");
+	keys.node_path = pool_intern("Node-path");
+	keys.node_kind = pool_intern("Node-kind");
+	keys.node_action = pool_intern("Node-action");
+	keys.node_copyfrom_path = pool_intern("Node-copyfrom-path");
+	keys.node_copyfrom_rev = pool_intern("Node-copyfrom-rev");
+	keys.text_content_length = pool_intern("Text-content-length");
+	keys.prop_content_length = pool_intern("Prop-content-length");
+	keys.content_length = pool_intern("Content-length");
+}
+
+static void read_props(void)
+{
+	uint32_t len;
+	uint32_t key = ~0;
+	char *val = NULL;
+	char *t;
+	while ((t = buffer_read_line()) && strcmp(t, "PROPS-END")) {
+		if (!strncmp(t, "K ", 2)) {
+			len = atoi(&t[2]);
+			key = pool_intern(buffer_read_string(len));
+			buffer_read_line();
+		} else if (!strncmp(t, "V ", 2)) {
+			len = atoi(&t[2]);
+			val = buffer_read_string(len);
+			if (key == keys.svn_log) {
+				/* Value length excludes terminating nul. */
+				rev_ctx.log = log_copy(len + 1, val);
+			} else if (key == keys.svn_author) {
+				rev_ctx.author = pool_intern(val);
+			} else if (key == keys.svn_date) {
+				if (parse_date_basic(val, &rev_ctx.timestamp, NULL))
+					fprintf(stderr, "Invalid timestamp: %s\n", val);
+			} else if (key == keys.svn_executable) {
+				node_ctx.type = REPO_MODE_EXE;
+			} else if (key == keys.svn_special) {
+				node_ctx.type = REPO_MODE_LNK;
+			}
+			key = ~0;
+			buffer_read_line();
+		}
+	}
+}
+
+static void handle_node(void)
+{
+	if (node_ctx.propLength != LENGTH_UNKNOWN && node_ctx.propLength)
+		read_props();
+
+	if (node_ctx.srcRev)
+		node_ctx.srcMode = repo_copy(node_ctx.srcRev, node_ctx.src, node_ctx.dst);
+
+	if (node_ctx.textLength != LENGTH_UNKNOWN &&
+	    node_ctx.type != REPO_MODE_DIR)
+		node_ctx.mark = next_blob_mark();
+
+	if (node_ctx.action == NODEACT_DELETE) {
+		repo_delete(node_ctx.dst);
+	} else if (node_ctx.action == NODEACT_CHANGE ||
+			   node_ctx.action == NODEACT_REPLACE) {
+		if (node_ctx.action == NODEACT_REPLACE &&
+		    node_ctx.type == REPO_MODE_DIR)
+			repo_replace(node_ctx.dst, node_ctx.mark);
+		else if (node_ctx.propLength != LENGTH_UNKNOWN)
+			repo_modify(node_ctx.dst, node_ctx.type, node_ctx.mark);
+		else if (node_ctx.textLength != LENGTH_UNKNOWN)
+			node_ctx.srcMode = repo_replace(node_ctx.dst, node_ctx.mark);
+	} else if (node_ctx.action == NODEACT_ADD) {
+		if (node_ctx.srcRev && node_ctx.propLength != LENGTH_UNKNOWN)
+			repo_modify(node_ctx.dst, node_ctx.type, node_ctx.mark);
+		else if (node_ctx.srcRev && node_ctx.textLength != LENGTH_UNKNOWN)
+			node_ctx.srcMode = repo_replace(node_ctx.dst, node_ctx.mark);
+		else if ((node_ctx.type == REPO_MODE_DIR && !node_ctx.srcRev) ||
+		         node_ctx.textLength != LENGTH_UNKNOWN)
+			repo_add(node_ctx.dst, node_ctx.type, node_ctx.mark);
+	}
+
+	if (node_ctx.propLength == LENGTH_UNKNOWN && node_ctx.srcMode)
+		node_ctx.type = node_ctx.srcMode;
+
+	if (node_ctx.mark)
+		fast_export_blob(node_ctx.type, node_ctx.mark, node_ctx.textLength);
+	else if (node_ctx.textLength != LENGTH_UNKNOWN)
+		buffer_skip_bytes(node_ctx.textLength);
+}
+
+static void handle_revision(void)
+{
+	if (rev_ctx.revision)
+		repo_commit(rev_ctx.revision, rev_ctx.author, rev_ctx.log,
+			dump_ctx.uuid, dump_ctx.url, rev_ctx.timestamp);
+}
+
+void svndump_read(char *url)
+{
+	char *val;
+	char *t;
+	uint32_t active_ctx = DUMP_CTX;
+	uint32_t len;
+	uint32_t key;
+
+	reset_dump_ctx(pool_intern(url));
+	while ((t = buffer_read_line())) {
+		val = strstr(t, ": ");
+		if (!val)
+			continue;
+		*val++ = '\0';
+		*val++ = '\0';
+		key = pool_intern(t);
+
+		if (key == keys.uuid) {
+			dump_ctx.uuid = pool_intern(val);
+		} else if (key == keys.revision_number) {
+			if (active_ctx == NODE_CTX)
+				handle_node();
+			if (active_ctx != DUMP_CTX)
+				handle_revision();
+			active_ctx = REV_CTX;
+			reset_rev_ctx(atoi(val));
+		} else if (key == keys.node_path) {
+			if (active_ctx == NODE_CTX)
+				handle_node();
+			active_ctx = NODE_CTX;
+			reset_node_ctx(val);
+		} else if (key == keys.node_kind) {
+			if (!strcmp(val, "dir"))
+				node_ctx.type = REPO_MODE_DIR;
+			else if (!strcmp(val, "file"))
+				node_ctx.type = REPO_MODE_BLB;
+			else
+				fprintf(stderr, "Unknown node-kind: %s\n", val);
+		} else if (key == keys.node_action) {
+			if (!strcmp(val, "delete")) {
+				node_ctx.action = NODEACT_DELETE;
+			} else if (!strcmp(val, "add")) {
+				node_ctx.action = NODEACT_ADD;
+			} else if (!strcmp(val, "change")) {
+				node_ctx.action = NODEACT_CHANGE;
+			} else if (!strcmp(val, "replace")) {
+				node_ctx.action = NODEACT_REPLACE;
+			} else {
+				fprintf(stderr, "Unknown node-action: %s\n", val);
+				node_ctx.action = NODEACT_UNKNOWN;
+			}
+		} else if (key == keys.node_copyfrom_path) {
+			pool_tok_seq(REPO_MAX_PATH_DEPTH, node_ctx.src, "/", val);
+		} else if (key == keys.node_copyfrom_rev) {
+			node_ctx.srcRev = atoi(val);
+		} else if (key == keys.text_content_length) {
+			node_ctx.textLength = atoi(val);
+		} else if (key == keys.prop_content_length) {
+			node_ctx.propLength = atoi(val);
+		} else if (key == keys.content_length) {
+			len = atoi(val);
+			buffer_read_line();
+			if (active_ctx == REV_CTX) {
+				read_props();
+			} else if (active_ctx == NODE_CTX) {
+				handle_node();
+				active_ctx = REV_CTX;
+			} else {
+				fprintf(stderr, "Unexpected content length header: %d\n", len);
+				buffer_skip_bytes(len);
+			}
+		}
+	}
+	if (active_ctx == NODE_CTX)
+		handle_node();
+	if (active_ctx != DUMP_CTX)
+		handle_revision();
+}
+
+void svndump_init(const char *filename)
+{
+	buffer_init(filename);
+	repo_init();
+	reset_dump_ctx(~0);
+	reset_rev_ctx(0);
+	reset_node_ctx(NULL);
+	init_keys();
+}
+
+void svndump_reset(void)
+{
+	log_reset();
+	buffer_reset();
+	repo_reset();
+	reset_dump_ctx(~0);
+	reset_rev_ctx(0);
+	reset_node_ctx(NULL);
+}
diff --git a/vcs-svn/svndump.h b/vcs-svn/svndump.h
new file mode 100644
index 0000000..38ad544
--- /dev/null
+++ b/vcs-svn/svndump.h
@@ -0,0 +1,8 @@
+#ifndef SVNDUMP_H_
+#define SVNDUMP_H_
+
+void svndump_init(const char *filename);
+void svndump_read(char *url);
+void svndump_reset(void);
+
+#endif
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 9/9] Add a sample user for the svndump library
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
                   ` (7 preceding siblings ...)
  2010-06-24 11:03 ` [PATCH 8/9] Add SVN dump parser Jonathan Nieder
@ 2010-06-24 11:07 ` Jonathan Nieder
  2010-06-24 20:17   ` Ramkumar Ramachandra
  2010-06-30  2:09   ` Sam Vilain
  2010-06-24 13:06 ` [PATCH/RFC v2 0/9] Subversion dump parsing library Ramkumar Ramachandra
  9 siblings, 2 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 11:07 UTC (permalink / raw)
  To: git
  Cc: Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

The svn-fe tool takes a Subversion dump file as input and produces
a fast-import stream as output.  This can be useful as a low-level
tool in building other importers, or for debugging the vcs-svn
library.

 make svn-fe
 make svn-fe.1

to test.

NEEDSWORK: litters cwd with useless .bin files.
But I hope it is enough to show the idea.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Thanks for reading.  Thoughts welcome.

 contrib/svn-fe/.gitignore |    3 ++
 contrib/svn-fe/Makefile   |   63 +++++++++++++++++++++++++++++++++++++++++++++
 contrib/svn-fe/svn-fe.c   |   43 ++++++++++++++++++++++++++++++
 contrib/svn-fe/svn-fe.txt |   56 ++++++++++++++++++++++++++++++++++++++++
 4 files changed, 165 insertions(+), 0 deletions(-)
 create mode 100644 contrib/svn-fe/.gitignore
 create mode 100644 contrib/svn-fe/Makefile
 create mode 100644 contrib/svn-fe/svn-fe.c
 create mode 100644 contrib/svn-fe/svn-fe.txt

diff --git a/contrib/svn-fe/.gitignore b/contrib/svn-fe/.gitignore
new file mode 100644
index 0000000..27a33b6
--- /dev/null
+++ b/contrib/svn-fe/.gitignore
@@ -0,0 +1,3 @@
+/*.xml
+/*.1
+/*.html
diff --git a/contrib/svn-fe/Makefile b/contrib/svn-fe/Makefile
new file mode 100644
index 0000000..4cc8d15
--- /dev/null
+++ b/contrib/svn-fe/Makefile
@@ -0,0 +1,63 @@
+all:: svn-fe$X
+
+CC = gcc
+RM = rm -f
+MV = mv
+
+CFLAGS = -g -O2 -Wall
+LDFLAGS =
+ALL_CFLAGS = $(CFLAGS)
+ALL_LDFLAGS = $(LDFLAGS)
+EXTLIBS =
+
+GIT_LIB = ../../libgit.a
+VCSSVN_LIB = ../../vcs-svn/lib.a
+LIBS = $(VCSSVN_LIB) $(GIT_LIB) $(EXTLIBS)
+
+QUIET_SUBDIR0 = +$(MAKE) -C # space to separate -C and subdir
+QUIET_SUBDIR1 =
+
+ifneq ($(findstring $(MAKEFLAGS),w),w)
+PRINT_DIR = --no-print-directory
+else # "make -w"
+NO_SUBDIR = :
+endif
+
+ifneq ($(findstring $(MAKEFLAGS),s),s)
+ifndef V
+	QUIET_CC      = @echo '   ' CC $@;
+	QUIET_LINK    = @echo '   ' LINK $@;
+	QUIET_SUBDIR0 = +@subdir=
+	QUIET_SUBDIR1 = ;$(NO_SUBDIR) echo '   ' SUBDIR $$subdir; \
+	                $(MAKE) $(PRINT_DIR) -C $$subdir
+endif
+endif
+
+svn-fe$X: svn-fe.o $(VCSSVN_LIB) $(GIT_LIB)
+	$(QUIET_LINK)$(CC) $(ALL_CFLAGS) -o $@ svn-fe.o \
+		$(ALL_LDFLAGS) $(LIBS)
+
+svn-fe.o: svn-fe.c ../../vcs-svn/svndump.h
+	$(QUIET_CC)$(CC) -o $*.o -c $(ALL_CFLAGS) $<
+
+svn-fe.html: svn-fe.txt
+	$(QUIET_SUBDIR0)../../Documentation $(QUIET_SUBDIR1) \
+		MAN_TXT=../contrib/svn-fe/svn-fe.txt \
+		../contrib/svn-fe/$@
+
+svn-fe.1: svn-fe.txt
+	$(QUIET_SUBDIR0)../../Documentation $(QUIET_SUBDIR1) \
+		MAN_TXT=../contrib/svn-fe/svn-fe.txt \
+		../contrib/svn-fe/$@
+	$(MV) ../../Documentation/svn-fe.1 .
+
+../../vcs-svn/lib.a: FORCE
+	$(QUIET_SUBDIR0)../.. $(QUIET_SUBDIR1) vcs-svn/lib.a
+
+../../libgit.a: FORCE
+	$(QUIET_SUBDIR0)../.. $(QUIET_SUBDIR1) libgit.a
+
+clean:
+	$(RM) svn-fe$X svn-fe.o svn-fe.html svn-fe.xml svn-fe.1
+
+.PHONY: all clean FORCE
diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
new file mode 100644
index 0000000..d84dd4f
--- /dev/null
+++ b/contrib/svn-fe/svn-fe.c
@@ -0,0 +1,43 @@
+/*
+ * Parse and rearrange a svnadmin dump.
+ * Create the dump with:
+ * svnadmin dump --incremental -r<startrev>:<endrev> <repository> >outfile
+ *
+ * Copyright (C) 2010 David Barr <david.barr@cordelta.com>.
+ * All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ * 1. Redistributions of source code must retain the above copyright
+ *    notice(s), this list of conditions and the following disclaimer
+ *    unmodified other than the allowable addition of one or more
+ *    copyright notices.
+ * 2. Redistributions in binary form must reproduce the above copyright
+ *    notice(s), this list of conditions and the following disclaimer in
+ *    the documentation and/or other materials provided with the
+ *    distribution.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER(S) ``AS IS'' AND ANY
+ * EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+ * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ * PURPOSE ARE DISCLAIMED.  IN NO EVENT SHALL THE COPYRIGHT HOLDER(S) BE
+ * LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR
+ * CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF
+ * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR
+ * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,
+ * WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE
+ * OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE,
+ * EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#include <stdlib.h>
+#include "../../vcs-svn/svndump.h"
+
+int main(int argc, char **argv)
+{
+	svndump_init(NULL);
+	svndump_read((argc > 1) ? argv[1] : NULL);
+	svndump_reset();
+	return 0;
+}
diff --git a/contrib/svn-fe/svn-fe.txt b/contrib/svn-fe/svn-fe.txt
new file mode 100644
index 0000000..bfd3a68
--- /dev/null
+++ b/contrib/svn-fe/svn-fe.txt
@@ -0,0 +1,56 @@
+svn-fe(1)
+=========
+
+NAME
+----
+svn-fe - convert an SVN "dumpfile" to a fast-import stream
+
+SYNOPSIS
+--------
+svnadmin dump --incremental REPO | svn-fe [url] | git fast-import
+
+DESCRIPTION
+-----------
+Converts a textual representation of a Subversion repository into
+input suitable for git-fast-import(1) and similar importers.
+
+INPUT FORMAT
+------------
+Subversion's repository dump format is documented in full in
+`notes/dump-load-format.txt` from the Subversion source tree.
+Files in this format can be generated using the 'svnadmin dump' or
+'svk admin dump' command.
+
+OUTPUT FORMAT
+-------------
+The fast-import format is documented by the git-fast-import(1)
+manual page.
+
+NOTES
+-----
+Subversion dumps do not record a separate author and committer for
+each revision, nor a separate display name and email address for
+each author.  Like git-svn(1), 'svn-fe' will use the name
+
+---------
+user <user@UUID>
+---------
+
+as committer, where 'user' is the value of the `svn:author` property
+and 'UUID' the repository's identifier.
+
+To support incremental imports, 'svn-fe' will put a `git-svn-id`
+line at the end of each commit log message if passed an url on the
+command line.  This line has the form `git-svn-id: URL@REVNO UUID`.
+
+Empty directories and unknown properties are silently discarded.
+
+The resulting repository will generally require further processing
+to put each project in its own repository and to separate the history
+of each branch.  The 'git filter-branch --subdirectory-filter' command
+may be useful for this purpose.
+
+SEE ALSO
+--------
+git-svn(1), svn2git(1), svk(1), git-filter-branch(1), git-fast-import(1),
+https://svn.apache.org/repos/asf/subversion/trunk/notes/dump-load-format.txt
-- 
1.7.1

^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH/RFC v2 0/9] Subversion dump parsing library
  2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
                   ` (8 preceding siblings ...)
  2010-06-24 11:07 ` [PATCH 9/9] Add a sample user for the svndump library Jonathan Nieder
@ 2010-06-24 13:06 ` Ramkumar Ramachandra
  2010-06-24 18:24   ` Jonathan Nieder
  2010-06-24 21:26   ` Jonathan Nieder
  9 siblings, 2 replies; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 13:06 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Junio C Hamano

(+CC: Junio)

Hi Jonathan,

Jonathan Nieder wrote:
> Ram last sent this series a couple of weeks ago[1], and it was
> merged to pu then as rr/svn-export.  Here’s another iteration
> of the same for discussion, now including David Barr’s program
> that demonstrates the functionality.

There's a lot of work happening in the exporter and the series will
probably need to be re-rolled again: I recommend the following.
1. Review this series thoroughly, but don't actually merge it because
it's going to be re-rolled soon.
2. Split the series in two: The infrastructure part in vcs-svn/ will
be re-rolled later. Another part for contrib/ should be made into a
separate series and merged.
3. As soon as the client is complete, I'll roll a series that puts it
in vcs-svn/ (infrastructure again).
4. Finally, I'll roll out a series for the remote helper itself that
puts it in $GIT_ROOT.

To summarize, there should be four series:
1. David's exporter in vcs-svn/ (yet to be re-rolled): Either Jonathan
or I will handle this.
2. David's independent svn-fe program in contrib/: Jonathan will handle this.
3. The RA client to generate a full-text dumpfile on-the-fly: This
isn't done yet; as soon as it's finished, I'll roll a series.
4. The remote helper: This done, but it doesn't make sense to send
this in before everything else has been merged.

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH/RFC v2 0/9] Subversion dump parsing library
  2010-06-24 13:06 ` [PATCH/RFC v2 0/9] Subversion dump parsing library Ramkumar Ramachandra
@ 2010-06-24 18:24   ` Jonathan Nieder
  2010-06-24 21:26   ` Jonathan Nieder
  1 sibling, 0 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 18:24 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf,
	Junio C Hamano, Eric Wong

Ramkumar Ramachandra wrote:

> There's a lot of work happening in the exporter and the series will
> probably need to be re-rolled again: I recommend the following.
> 1. Review this series thoroughly, but don't actually merge it because
> it's going to be re-rolled soon.

To be more specific: although svndump.c, fast_export.c, and
repo_tree.c may need to be a bit slushy now while we figure out the
incremental import story, the rest of the series is not likely to
change much in broad strokes.  I get the impression that facilities
like strbuf and string_list benefit greatly from early review, so in
particular let me mention the facilities this series adds:

 . obj_pool, an array of fixed-size records that can be written to
   disk.  This is just begging to be implemented with mmap; maybe
   in the future the compat/mmap.c shim can be tweaked to support
   faking it.

 . treap, a multiset datastructure built on top of obj_pool.  I
   suspect API cleanups would be welcome here: it’s a bit more
   unwieldly than string_list at the moment.

 . string_pool, a collection of interned strings built on top of
   treap.

 . line_buffer, a simple fread()/fgets() wrapper with a static buffer.

I would find feedback on these (or patches :)) especially welcome.

Jonathan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/9] Export parse_date_basic() to convert a date string to  timestamp
  2010-06-24 10:51 ` [PATCH 1/9] Export parse_date_basic() to convert a date string to timestamp Jonathan Nieder
@ 2010-06-24 18:32   ` Ramkumar Ramachandra
  0 siblings, 0 replies; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 18:32 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf

Hi Jonathan,

This specific patch is clear. I have some issues with some of the
other parts in the series.

Jonathan Nieder wrote:
> approxidate() is not appropriate for reading machine-written dates
> because it guesses instead of erroring out on malformed dates.
> parse_date() is less convenient since it returns its output as a
> string.  So export the underlying function that writes a timestamp.

Right. I couldn't justify exposing it in the series that's in master now.

> While at it, change the return value to match the usual convention:
> return 0 for success and -1 for failure.

Since I'm to blame for this change,
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] Add memory pool library
  2010-06-24 10:53 ` [PATCH 3/9] Add memory pool library Jonathan Nieder
@ 2010-06-24 18:43   ` Ramkumar Ramachandra
  2010-06-24 18:55     ` Jonathan Nieder
  0 siblings, 1 reply; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 18:43 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf

Hi Jonathan,

Jonathan Nieder wrote:
> From: David Barr <david.barr@cordelta.com>
>
> Add a memory pool library implemented using C macros. The obj_pool_gen()
> macro creates a type-specific memory pool API.

Until some sort of mmap is implemented, I doubt we can do much better
than this. By marking all the generated functions MAYBE_UNUSED, we're
actually suppressing more warnings than intended. Maybe we can avoid
it and somehow find a way to mark only those functions that are really
unused? Perhaps an extra parameter in obj_pool_gen?

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] Add memory pool library
  2010-06-24 18:43   ` Ramkumar Ramachandra
@ 2010-06-24 18:55     ` Jonathan Nieder
  2010-06-24 19:37       ` Ramkumar Ramachandra
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 18:55 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf

Hi Ram,

Ramkumar Ramachandra wrote:

> By marking all the generated functions MAYBE_UNUSED, we're
> actually suppressing more warnings than intended. Maybe we can avoid
> it and somehow find a way to mark only those functions that are really
> unused?

If we used templates instead of macros, a smart compiler would notice
which functions are _never_ used.  But sticking to C, I think it is
fine to rely on humans checking by hand for now.  (FWIW no obj_pool
functions are unused at the moment.)

I tried leaving out the MAYBE_UNUSED for foo_init() before I realized
that calling it is optional.

> Perhaps an extra parameter in obj_pool_gen?

Filling out such a list for each caller sounds to me like more trouble
than it’s worth.

Jonathan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/9] Add treap implementation
  2010-06-24 10:57 ` [PATCH 4/9] Add treap implementation Jonathan Nieder
@ 2010-06-24 19:08   ` Ramkumar Ramachandra
  2010-06-24 19:22     ` Jonathan Nieder
  0 siblings, 1 reply; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 19:08 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Hi again,

Jonathan Nieder wrote:
> From: Jason Evans <jasone@canonware.com>
>
> Provide macros to generate a type-specific treap implementation and
> various functions to operate on it. It uses obj_pool.h to store memory
> nodes in a treap.  Previously committed nodes are never removed from
> the pool; after any *_commit operation, it is assumed (correctly, in
> the case of svn-fast-export) that someone else must care about them.

This is likely to change in a few days. David is currently working on
a Java implementation of a immutable ternary treap and will
re-implement it as C macros. See the ternary-treap branch.

>  $(VCSSVN_OBJS): \
> -       vcs-svn/obj_pool.h
> +       vcs-svn/obj_pool.h vcs-svn/trp.h

Interesting how you've shown this in every patch :)

> +/*
> + * Fibonacci hash function.
> + * The multiplier is the nearest prime to (2^32 times (√5 - 1)/2).
> + * See Knuth §6.4: volume 3, 3rd ed, p518.
> + */

Um, is it alright to put non-ascii characters in a file containing
code? I haven't seen such a thing in any of the other files. Will some
old compilers complain while parsing?

> --- /dev/null
> +++ b/vcs-svn/trp.txt
> @@ -0,0 +1,90 @@
> +treap API

The documentation is good, but I don't see it merged into the tree.
Perhaps send a patch to David? Also, you might want to include the
technical explanation for using treaps from the commit message here?

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/9] Add string-specific memory pool
  2010-06-24 10:58 ` [PATCH 5/9] Add string-specific memory pool Jonathan Nieder
@ 2010-06-24 19:19   ` Ramkumar Ramachandra
  0 siblings, 0 replies; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 19:19 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Hi,

Jonathan Nieder wrote:
> Intern strings so they can be compared by address and stored without
> wasting space.

It's unlikely that this'll change much, if at all. It should go in
without any issues. I'm not sure this is appropriate since I've
already signed off, but anyway:
Reviewed-by: Ramkumar Ramachandra <artagnon@gmail.com>

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/9] Add treap implementation
  2010-06-24 19:08   ` Ramkumar Ramachandra
@ 2010-06-24 19:22     ` Jonathan Nieder
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 19:22 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Ramkumar Ramachandra wrote:

> David is currently working on
> a Java implementation of a immutable ternary treap and will
> re-implement it as C macros. See the ternary-treap branch.

I assume his blog post from a week or so ago[1] is relevant.

The trie-based version should be interesting but less generic (which
is fine, since we only have two users and both use string keys).

>>  $(VCSSVN_OBJS): \
>> -       vcs-svn/obj_pool.h
>> +       vcs-svn/obj_pool.h vcs-svn/trp.h
>
> Interesting how you've shown this in every patch :)

It would have been less messy to use

 $(VCSSVN_OBJS): \
	vcs-svn/obj_pool.h \
	vcs-svn/trp.h \
	... \

or

 $(VCSSVN_OBJS): vcs-svn/obj_pool.h
 $(VCSSVN_OBJS): vcs-svn/trp.h
 ...

I just can’t bring myself to care much. :)

> > +/*
> > + * Fibonacci hash function.
> > + * The multiplier is the nearest prime to (2^32 times (√5 - 1)/2).
> > + * See Knuth §6.4: volume 3, 3rd ed, p518.
> > + */
> 
> Um, is it alright to put non-ascii characters in a file containing
> code?

Yes, UTF-8 is sometimes used in comments for people’s names.  See
builtin/branch.c, for example.

> The documentation is good, but I don't see it merged into the tree.
> Perhaps send a patch to David?

Yes.  I should send some other patches, too, to minimize the delta.

> Also, you might want to include the
> technical explanation for using treaps from the commit message here?

Good idea, thanks!

Jonathan

[1] http://barrbrain.github.com/2010/06/19/relocatable-immutable-randomised-ternary-search-tree-part-2.html

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/9] Add infrastructure to write revisions in fast-export  format
  2010-06-24 11:02 ` [PATCH 7/9] Add infrastructure to write revisions in fast-export format Jonathan Nieder
@ 2010-06-24 19:29   ` Ramkumar Ramachandra
  2010-06-24 19:36     ` Jonathan Nieder
  2010-06-24 19:49     ` Jonathan Nieder
  0 siblings, 2 replies; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 19:29 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Hi again,

Jonathan Nieder wrote:
> repo_tree maintains the exporter's state and provides a facility to to
> call fast_export, which writes objects to stdout suitable for
> consumption by fast-import.

These files will also change significantly in a few days- see the
ternary_treap branch.

> The exported functions roughly correspond to Subversion FS operations.

This description is sufficient for the commit message.

>  . repo_add adds a file to the current commit.
>
>  . repo_modify adds a replacement for an existing file;
>   it is implemented exactly the same way, but a check could be
>   added later to distinguish the two cases.
>
>  . repo_copy copies a blob from a previous revision to the current
>   commit.
>
>  . repo_replace modifies the content of a file from the current
>   commit, if and only if it exists.
>
>  . repo_delete removes a file or directory from the current commit.
>
>  . repo_commit calls out to fast_export to write the current commit to
>   the fast-import stream in stdout.
>
>  . repo_diff is used by the fast_export module to write the changes
>   for a commit.
>
>  . repo_reset erases the exporter's state, so valgrind can be happy.

This is like API documentation- should it go into the commit message?
Maybe put this in a a dedicated repo_tree.txt like trp.h?

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/9] Add infrastructure to write revisions in fast-export format
  2010-06-24 19:29   ` Ramkumar Ramachandra
@ 2010-06-24 19:36     ` Jonathan Nieder
  2010-06-24 19:49     ` Jonathan Nieder
  1 sibling, 0 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 19:36 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Ramkumar Ramachandra wrote:

> This is like API documentation- should it go into the commit message?
> Maybe put this in a a dedicated repo_tree.txt like trp.h?

Good question.  The answer depends on whether we want to keep it
maintained.

Since this is not meant to be a general-purpose API, I thought it
simplest to document it in the commit message and let future commit
messages describe the purpose of future changes.  We are not
approaching any limit on the length of a commit object, after all.

Jonathan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] Add memory pool library
  2010-06-24 18:55     ` Jonathan Nieder
@ 2010-06-24 19:37       ` Ramkumar Ramachandra
  2010-06-24 20:06         ` Jonathan Nieder
  0 siblings, 1 reply; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 19:37 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf

Hi Jonathan,

Jonathan Nieder wrote:
> If we used templates instead of macros, a smart compiler would notice
> which functions are _never_ used.  But sticking to C, I think it is
> fine to rely on humans checking by hand for now.  (FWIW no obj_pool
> functions are unused at the moment.)

> Filling out such a list for each caller sounds to me like more trouble
> than it’s worth.

Okay, I have an idea. We'll document it as a comment so people will be
able to see which exact functions were unused in this import. The
following functions are unused: blob_init, blob_offset, blob_commit,
commit_free, commit_offset, dir_free, node_init, node_commit,
string_offset, tree_first, tree_next, tree_remove, log_init,
log_offset, and log_commit.

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/9] Add infrastructure to write revisions in fast-export format
  2010-06-24 19:29   ` Ramkumar Ramachandra
  2010-06-24 19:36     ` Jonathan Nieder
@ 2010-06-24 19:49     ` Jonathan Nieder
  2010-06-24 21:14       ` Ramkumar Ramachandra
  1 sibling, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 19:49 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

On Thu, Jun 24, 2010 at 09:29:29PM +0200, Ramkumar Ramachandra wrote:
> Jonathan Nieder wrote:

>>  . repo_reset erases the exporter's state, so valgrind can be happy.
> 
> This is like API documentation- should it go into the commit message?

A more terse summary, to save readers time:

 . repo_add, repo_modify, repo_copy, repo_replace, and repo_delete
   update the current commit, based roughly on the corresponding
   Subversion FS operation.

 . repo_commit calls out to fast_export to write the current commit to
   the fast-import stream in stdout.

 . repo_diff is used by the fast_export module to write the changes
   for a commit.

 . repo_reset erases the exporter's state, so valgrind can be happy.

Jonathan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] Add memory pool library
  2010-06-24 19:37       ` Ramkumar Ramachandra
@ 2010-06-24 20:06         ` Jonathan Nieder
  2010-06-24 20:20           ` Ramkumar Ramachandra
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 20:06 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf

Ramkumar Ramachandra wrote:

> The
> following functions are unused: blob_init, blob_offset, blob_commit,
> commit_free, commit_offset, dir_free, node_init, node_commit,
> string_offset, tree_first, tree_next, tree_remove, log_init,
> log_offset, and log_commit.

Thanks for this list.  Do you think it’s worth automating its
production?  i.e., a masochistic person could write a script to
compile with the __attribute__((unused)) suppressed, parse warnings to
find unused functions, and then take an intersection of sets to
confirm that no family of functions is unused.

Jonathan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 9/9] Add a sample user for the svndump library
  2010-06-24 11:07 ` [PATCH 9/9] Add a sample user for the svndump library Jonathan Nieder
@ 2010-06-24 20:17   ` Ramkumar Ramachandra
  2010-06-24 20:30     ` Jonathan Nieder
  2010-06-30  2:09   ` Sam Vilain
  1 sibling, 1 reply; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 20:17 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Hi Jonathan,

Jonathan Nieder wrote:
> NEEDSWORK: litters cwd with useless .bin files.
> But I hope it is enough to show the idea.

How do you propose we solve this? Maybe using a generic
$TEMP_DIRECTORY like /tmp in Unix and then getting rid of the files
after the export is complete?

> +QUIET_SUBDIR0 = +$(MAKE) -C # space to separate -C and subdir
> +QUIET_SUBDIR1 =

> +ifneq ($(findstring $(MAKEFLAGS),s),s)
> +ifndef V
> +       QUIET_CC      = @echo '   ' CC $@;
> +       QUIET_LINK    = @echo '   ' LINK $@;
> +       QUIET_SUBDIR0 = +@subdir=
> +       QUIET_SUBDIR1 = ;$(NO_SUBDIR) echo '   ' SUBDIR $$subdir; \
> +                       $(MAKE) $(PRINT_DIR) -C $$subdir
> +endif
> +endif

I saw this in the Git Makefile too, but I didn't understand the logic
behind it. Could you explain it to me?
Note: I couldn't understand most of the Makefile, so I just skipped it
when I found similar declarations in the Git Makefile.

> diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
> new file mode 100644
> index 0000000..d84dd4f
> --- /dev/null
> +++ b/contrib/svn-fe/svn-fe.c
> @@ -0,0 +1,43 @@
> +/*
> + * Parse and rearrange a svnadmin dump.
> + * Create the dump with:
> + * svnadmin dump --incremental -r<startrev>:<endrev> <repository> >outfile
> + *
> + * Copyright (C) 2010 David Barr <david.barr@cordelta.com>.
> + * All rights reserved.

That's a huge license header applies just to the trivial five-line
program, right? Is it necessary at all?

> +#include <stdlib.h>
> +#include "../../vcs-svn/svndump.h"

Inelegant. Why not include ../../vcs-svn in the path you're searching
for headers?

> +svnadmin dump --incremental REPO | svn-fe [url] | git fast-import

If the user doesn't have a clue about SVN, they won't know what REPO
is here: Without knowing anything about svnadmin, I'd naively try it
with a remote repository. Maybe include a note about having to mirror
a complete repository locally using svnsync (or otherwise) first?

> +Converts a textual representation of a Subversion repository into
> +input suitable for git-fast-import(1) and similar importers.

To be more specific, "Subversion dumpfile (version: 2)" from FILE(1).

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 3/9] Add memory pool library
  2010-06-24 20:06         ` Jonathan Nieder
@ 2010-06-24 20:20           ` Ramkumar Ramachandra
  0 siblings, 0 replies; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 20:20 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf

Hi again,

Jonathan Nieder wrote:
> Thanks for this list.  Do you think it’s worth automating its
> production?  i.e., a masochistic person could write a script to
> compile with the __attribute__((unused)) suppressed, parse warnings to
> find unused functions, and then take an intersection of sets to
> confirm that no family of functions is unused.

Er, I think that's a bit of an overkill :p
Someone editing the code can always suppress MAYBE_UNUSED and check
with our list of unused functions in the comment by hand.

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/9] Introduce vcs-svn lib
  2010-06-24 10:52 ` [PATCH 2/9] Introduce vcs-svn lib Jonathan Nieder
@ 2010-06-24 20:27   ` Ramkumar Ramachandra
  0 siblings, 0 replies; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 20:27 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Hi,

Jonathan Nieder wrote:
> Teach the build system to build a separate library for the
> upcoming subversion interop support.
>
> The resulting vcs-svn/lib.a does not contain any code, nor is
> it built during a normal build.  This is just scaffolding for
> later changes.

This is very elegant indeed!
Acked-by: Ramkumar Ramachandra <artagnon@gmail.com>

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 9/9] Add a sample user for the svndump library
  2010-06-24 20:17   ` Ramkumar Ramachandra
@ 2010-06-24 20:30     ` Jonathan Nieder
  2010-06-24 20:42       ` Ramkumar Ramachandra
  0 siblings, 1 reply; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 20:30 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Ramkumar Ramachandra wrote:
> Jonathan Nieder wrote:

>> NEEDSWORK: litters cwd with useless .bin files.
>> But I hope it is enough to show the idea.
>
> How do you propose we solve this?

Turn off persistence until it is ready.  At that point, we will need
to access the target git repo anyway, so we can keep extra metadata in
the .git directory.

> > +QUIET_SUBDIR0 = +$(MAKE) -C # space to separate -C and subdir
> > +QUIET_SUBDIR1 =
> 
> > +ifneq ($(findstring $(MAKEFLAGS),s),s)
> > +ifndef V
> > +       QUIET_CC      = @echo '   ' CC $@;
> > +       QUIET_LINK    = @echo '   ' LINK $@;
> > +       QUIET_SUBDIR0 = +@subdir=
> > +       QUIET_SUBDIR1 = ;$(NO_SUBDIR) echo '   ' SUBDIR $$subdir; \
> > +                       $(MAKE) $(PRINT_DIR) -C $$subdir
> > +endif
> > +endif
> 
> I saw this in the Git Makefile too, but I didn't understand the logic
> behind it. Could you explain it to me?

See commit 74f2b2a.

Summary: this produces the

    CC foo.o

lines.  The idea is that long command lines distract from what is more
important, which is the compiler output.  The behavior can be turned
off with “make V=1” or “make -s”.

>> diff --git a/contrib/svn-fe/svn-fe.c b/contrib/svn-fe/svn-fe.c
>> new file mode 100644
>> index 0000000..d84dd4f
>> --- /dev/null
>> +++ b/contrib/svn-fe/svn-fe.c
>> @@ -0,0 +1,43 @@
>> +/*
>> + * Parse and rearrange a svnadmin dump.
>> + * Create the dump with:
>> + * svnadmin dump --incremental -r<startrev>:<endrev> <repository> >outfile
>> + *
>> + * Copyright (C) 2010 David Barr <david.barr@cordelta.com>.
>> + * All rights reserved.
>
> That's a huge license header applies just to the trivial five-line
> program, right? Is it necessary at all?

I dunno.  I included the license header instead of refering to LICENSE
because this file tends to be installed in /usr/share/doc/git/contrib
and LICENSE does not.

Maybe the file should get a simpler license?  e.g.:

 This file is in the public domain.
 You may freely use, modify, distribute, and relicense it.

>> +#include <stdlib.h>
>> +#include "../../vcs-svn/svndump.h"
>
> Inelegant. Why not include ../../vcs-svn in the path you're searching
> for headers?

Right, this should be changed to

 #include <stdlib.h>
 #include "vcs-svn/svndump.h"

>> +svnadmin dump --incremental REPO | svn-fe [url] | git fast-import
>
> If the user doesn't have a clue about SVN, they won't know what REPO
> is here: Without knowing anything about svnadmin, I'd naively try it
> with a remote repository. Maybe include a note about having to mirror
> a complete repository locally using svnsync (or otherwise) first?

Sounds reasonable.  Care to suggest wording?

>> +Converts a textual representation of a Subversion repository into
>> +input suitable for git-fast-import(1) and similar importers.
>
> To be more specific, "Subversion dumpfile (version: 2)" from FILE(1).

Do version 3 dumpfiles fail?

Jonathan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 8/9] Add SVN dump parser
  2010-06-24 11:03 ` [PATCH 8/9] Add SVN dump parser Jonathan Nieder
@ 2010-06-24 20:33   ` Ramkumar Ramachandra
  0 siblings, 0 replies; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 20:33 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Hi,

Jonathan Nieder wrote:
> svndump parses data that is in SVN dumpfile format produced by
> `svnadmin dump` with the help of line_buffer and uses repo_tree and
> fast_export to emit a git fast-import stream.

This hasn't changed from last time, and isn't expected to change in
future. Should go in without any issues, although I'd love it if we
could somehow refactor the huge if-else tree. Again, I'm not sure if
this is appropriate since I've already signed off:
Reviewed-by: Ramkumar Ramachandra <artagnon@gmail.com>

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 9/9] Add a sample user for the svndump library
  2010-06-24 20:30     ` Jonathan Nieder
@ 2010-06-24 20:42       ` Ramkumar Ramachandra
  2010-06-24 20:52         ` Jonathan Nieder
  0 siblings, 1 reply; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 20:42 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Hi again,

Jonathan Nieder wrote:
> Turn off persistence until it is ready.  At that point, we will need
> to access the target git repo anyway, so we can keep extra metadata in
> the .git directory.

Unfortunately, turning off persistence isn't so easy now because the
branch has been merged into master now, and it's difficult to sort out
just the commits that correspond to persistence and rebase.
Yes, the remote helper can use the .git directory, but I thought we
wanted to keep this in contrib/ even after the remote helper is
merged?

> See commit 74f2b2a.
>
> Summary: this produces the
>
>    CC foo.o
>
> lines.  The idea is that long command lines distract from what is more
> important, which is the compiler output.  The behavior can be turned
> off with “make V=1” or “make -s”.

Ah. Black magic :)

> Maybe the file should get a simpler license?  e.g.:
>
>  This file is in the public domain.
>  You may freely use, modify, distribute, and relicense it.

Yes, I like this.

> Sounds reasonable.  Care to suggest wording?

Something along "REPO is a path to a Subversion repository mirrored on
the local disk. Remote Subversion repositories can be mirrored on
local disk using the `svnsync` command."

> Do version 3 dumpfiles fail?

Yes, they do. We aren't parsing the extra headers anywhere, and
deltified dumps aren't supported.

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 9/9] Add a sample user for the svndump library
  2010-06-24 20:42       ` Ramkumar Ramachandra
@ 2010-06-24 20:52         ` Jonathan Nieder
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 20:52 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Ramkumar Ramachandra wrote:

> Unfortunately, turning off persistence isn't so easy now

Hmm, couldn’t one just leave out some pool_init() calls?

Of course, that would make pool_init() and related functionality unused,
and we may or may not want to remove it while at it.

> Yes, the remote helper can use the .git directory, but I thought we
> wanted to keep this in contrib/ even after the remote helper is
> merged?

For incremental imports, I suspect the standalone svn-fe is going to
need to invoke ‘git fast-import’ itself, which implies knowledge of
the location of the .git directory.

For undeltified nonincremental imports, there is no problem. :)

> Jonathan Nieder wrote:

>> Maybe the file should get a simpler license?  e.g.:
>>
>>  This file is in the public domain.
>>  You may freely use, modify, distribute, and relicense it.
>
> Yes, I like this.

David, would this be okay? (for the short svn-fe.c file only)

>> Do version 3 dumpfiles fail?
>
> Yes, they do. We aren't parsing the extra headers anywhere, and
> deltified dumps aren't supported.

Worth documenting indeed.

Thanks again.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 7/9] Add infrastructure to write revisions in fast-export  format
  2010-06-24 19:49     ` Jonathan Nieder
@ 2010-06-24 21:14       ` Ramkumar Ramachandra
  0 siblings, 0 replies; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 21:14 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf, Eric Wong

Jonathan Nieder wrote:
> A more terse summary, to save readers time:
>
>  . repo_add, repo_modify, repo_copy, repo_replace, and repo_delete
>   update the current commit, based roughly on the corresponding
>   Subversion FS operation.
>
>  . repo_commit calls out to fast_export to write the current commit to
>   the fast-import stream in stdout.
>
>  . repo_diff is used by the fast_export module to write the changes
>   for a commit.
>
>  . repo_reset erases the exporter's state, so valgrind can be happy.

Looks good.

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 6/9] Add stream helper library
  2010-06-24 11:01 ` [PATCH 6/9] Add stream helper library Jonathan Nieder
@ 2010-06-24 21:23   ` Ramkumar Ramachandra
  2010-06-24 21:29     ` Jonathan Nieder
  0 siblings, 1 reply; 36+ messages in thread
From: Ramkumar Ramachandra @ 2010-06-24 21:23 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf

Jonathan Nieder wrote:
> This library provides thread-unsafe fgets()- and fread()-like
> functions where the caller does not have to supply a buffer.  It
> maintains a couple of static buffers and provides an API to use
> them.

Few (no?) changes since last time.
Just a quick reminder: eventually, we might be able to factor out this
line_buffer thing completely; it's quite non-trivial though.

-- Ram

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH/RFC v2 0/9] Subversion dump parsing library
  2010-06-24 13:06 ` [PATCH/RFC v2 0/9] Subversion dump parsing library Ramkumar Ramachandra
  2010-06-24 18:24   ` Jonathan Nieder
@ 2010-06-24 21:26   ` Jonathan Nieder
  1 sibling, 0 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 21:26 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf,
	Junio C Hamano, Eric Wong

Ramkumar Ramachandra wrote:

> I'm in favor of the split mainly because it doesn't make sense to
> re-roll your contrib/ patch everytime the infrastructure in vcs-svn/
> changes
[...]
> I want to get the contrib/ patch in so people can test
> the infrastructure everytime it's re-rolled and goes to `pu`.

Yes, I agree: ideally this should be two branches:

  rr/svn-export for infrastructure
  xx/svn-fe for the frontend

with the latter based on the former.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 6/9] Add stream helper library
  2010-06-24 21:23   ` Ramkumar Ramachandra
@ 2010-06-24 21:29     ` Jonathan Nieder
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-06-24 21:29 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: git, David Michael Barr, Sverre Rabbelier, Daniel Shahaf

Ramkumar Ramachandra wrote:
> Jonathan Nieder wrote:

>> This library provides thread-unsafe fgets()- and fread()-like
>> functions where the caller does not have to supply a buffer.  It
>> maintains a couple of static buffers and provides an API to use
>> them.
>
> Few (no?) changes since last time.

I simplified it a bit by getting rid of pushback.

 static uint32_t line_buffer_len = 0;
 static uint32_t line_len = 0;

Sorry, I should have mentioned so.

Jonathan

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 9/9] Add a sample user for the svndump library
  2010-06-24 11:07 ` [PATCH 9/9] Add a sample user for the svndump library Jonathan Nieder
  2010-06-24 20:17   ` Ramkumar Ramachandra
@ 2010-06-30  2:09   ` Sam Vilain
  1 sibling, 0 replies; 36+ messages in thread
From: Sam Vilain @ 2010-06-30  2:09 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: git, Ramkumar Ramachandra, David Michael Barr, Sverre Rabbelier,
	Daniel Shahaf

On Thu, 2010-06-24 at 06:07 -0500, Jonathan Nieder wrote:
> +To support incremental imports, 'svn-fe' will put a `git-svn-id`
> +line at the end of each commit log message if passed an url on the
> +command line.  This line has the form `git-svn-id: URL@REVNO UUID`.

If you are importing from an svk mirror or svnsync mirror, it will be
required to rewrite this portion.

> +Empty directories and unknown properties are silently discarded.

Yeah.  These should probably be carried over in this pass.  Revision
properties could possibly be converted to extra RFC822-style headers in
the commit message.  Directory properties can go under $dir/.svnfe-props
(use an empty file to mark empty directories) and $filename.svnfe-props
- it is up to the data mining phase whether it wants to actually do
anything with that data later.

Sam

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 3/9] Add memory pool library
  2010-07-16 10:13 ` Jonathan Nieder
@ 2010-07-16 10:16   ` Jonathan Nieder
  0 siblings, 0 replies; 36+ messages in thread
From: Jonathan Nieder @ 2010-07-16 10:16 UTC (permalink / raw)
  To: Ramkumar Ramachandra
  Cc: Git Mailing List, David Michael Barr, Sverre Rabbelier, Junio C Hamano

From: David Barr <david.barr@cordelta.com>

Add a memory pool library implemented using C macros. The
obj_pool_gen() macro creates a type-specific memory pool.

The memory pool library is distinguished from the existing specialized
allocators in alloc.c by using a contiguous block for all allocations.
This means that on one hand, long-lived pointers have to be written as
offsets, since the base address changes as the pool grows, but on the
other hand, the entire pool can be easily written to the file system.
This could allow the memory pool to persist between runs of an
application.

For the svn importer, such a facility is useful because each svn
revision can copy trees and files from any previous revision.  The
relevant information for all revisions has to persist somehow to
support incremental runs.

Signed-off-by: David Barr <david.barr@cordelta.com>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
---
Stripped out pool_init.  And added tests!  There is not really
much an allocator can do, so it is fun to play around with.

 .gitignore         |    1 +
 Makefile           |    4 +-
 t/t0080-vcs-svn.sh |   79 +++++++++++++++++++++++++++++++++++
 test-obj-pool.c    |  116 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 vcs-svn/obj_pool.h |   61 +++++++++++++++++++++++++++
 5 files changed, 260 insertions(+), 1 deletions(-)
 create mode 100755 t/t0080-vcs-svn.sh
 create mode 100644 test-obj-pool.c
 create mode 100644 vcs-svn/obj_pool.h

diff --git a/.gitignore b/.gitignore
index 14e2b6b..1e64a6a 100644
--- a/.gitignore
+++ b/.gitignore
@@ -167,6 +167,7 @@
 /test-genrandom
 /test-index-version
 /test-match-trees
+/test-obj-pool
 /test-parse-options
 /test-path-utils
 /test-run-command
diff --git a/Makefile b/Makefile
index d6a779b..3b873cd 100644
--- a/Makefile
+++ b/Makefile
@@ -409,6 +409,7 @@ TEST_PROGRAMS_NEED_X += test-delta
 TEST_PROGRAMS_NEED_X += test-dump-cache-tree
 TEST_PROGRAMS_NEED_X += test-genrandom
 TEST_PROGRAMS_NEED_X += test-match-trees
+TEST_PROGRAMS_NEED_X += test-obj-pool
 TEST_PROGRAMS_NEED_X += test-parse-options
 TEST_PROGRAMS_NEED_X += test-path-utils
 TEST_PROGRAMS_NEED_X += test-run-command
@@ -1863,7 +1864,8 @@ xdiff-interface.o $(XDIFF_OBJS): \
 	xdiff/xinclude.h xdiff/xmacros.h xdiff/xdiff.h xdiff/xtypes.h \
 	xdiff/xutils.h xdiff/xprepare.h xdiff/xdiffi.h xdiff/xemit.h
 
-$(VCSSVN_OBJS):
+$(VCSSVN_OBJS): \
+	vcs-svn/obj_pool.h
 endif
 
 exec_cmd.s exec_cmd.o: EXTRA_CPPFLAGS = \
diff --git a/t/t0080-vcs-svn.sh b/t/t0080-vcs-svn.sh
new file mode 100755
index 0000000..3f29496
--- /dev/null
+++ b/t/t0080-vcs-svn.sh
@@ -0,0 +1,79 @@
+#!/bin/sh
+
+test_description='check infrastructure for svn importer'
+
+. ./test-lib.sh
+uint32_max=4294967295
+
+test_expect_success 'obj pool: store data' '
+	cat <<-\EOF >expected &&
+	0
+	1
+	EOF
+
+	test-obj-pool <<-\EOF >actual &&
+	alloc one 16
+	set one 13
+	test one 13
+	reset one
+	EOF
+	test_cmp expected actual
+'
+
+test_expect_success 'obj pool: NULL is offset ~0' '
+	echo "$uint32_max" >expected &&
+	echo null one | test-obj-pool >actual &&
+	test_cmp expected actual
+'
+
+test_expect_success 'obj pool: out-of-bounds access' '
+	cat <<-EOF >expected &&
+	0
+	0
+	$uint32_max
+	$uint32_max
+	16
+	20
+	$uint32_max
+	EOF
+
+	test-obj-pool <<-\EOF >actual &&
+	alloc one 16
+	alloc two 16
+	offset one 20
+	offset two 20
+	alloc one 5
+	offset one 20
+	free one 1
+	offset one 20
+	reset one
+	reset two
+	EOF
+	test_cmp expected actual
+'
+
+test_expect_success 'obj pool: high-water mark' '
+	cat <<-\EOF >expected &&
+	0
+	0
+	10
+	20
+	20
+	20
+	EOF
+
+	test-obj-pool <<-\EOF >actual &&
+	alloc one 10
+	committed one
+	alloc one 10
+	commit one
+	committed one
+	alloc one 10
+	free one 20
+	committed one
+	reset one
+	EOF
+	test_cmp expected actual
+'
+
+test_done
diff --git a/test-obj-pool.c b/test-obj-pool.c
new file mode 100644
index 0000000..5018863
--- /dev/null
+++ b/test-obj-pool.c
@@ -0,0 +1,116 @@
+/*
+ * test-obj-pool.c: code to exercise the svn importer's object pool
+ */
+
+#include "cache.h"
+#include "vcs-svn/obj_pool.h"
+
+enum pool { POOL_ONE, POOL_TWO };
+obj_pool_gen(one, int, 1)
+obj_pool_gen(two, int, 4096)
+
+static uint32_t strtouint32(const char *s)
+{
+	char *end;
+	uintmax_t n = strtoumax(s, &end, 10);
+	if (*s == '\0' || (*end != '\n' && *end != '\0'))
+		die("invalid offset: %s", s);
+	return (uint32_t) n;
+}
+
+static void handle_command(const char *command, enum pool pool, const char *arg)
+{
+	switch (*command) {
+	case 'a':
+		if (!prefixcmp(command, "alloc ")) {
+			uint32_t n = strtouint32(arg);
+			printf("%"PRIu32"\n",
+				pool == POOL_ONE ?
+				one_alloc(n) : two_alloc(n));
+			return;
+		}
+	case 'c':
+		if (!prefixcmp(command, "commit ")) {
+			pool == POOL_ONE ? one_commit() : two_commit();
+			return;
+		}
+		if (!prefixcmp(command, "committed ")) {
+			printf("%"PRIu32"\n",
+				pool == POOL_ONE ?
+				one_pool.committed : two_pool.committed);
+			return;
+		}
+	case 'f':
+		if (!prefixcmp(command, "free ")) {
+			uint32_t n = strtouint32(arg);
+			pool == POOL_ONE ? one_free(n) : two_free(n);
+			return;
+		}
+	case 'n':
+		if (!prefixcmp(command, "null ")) {
+			printf("%"PRIu32"\n",
+				pool == POOL_ONE ?
+				one_offset(NULL) : two_offset(NULL));
+			return;
+		}
+	case 'o':
+		if (!prefixcmp(command, "offset ")) {
+			uint32_t n = strtouint32(arg);
+			printf("%"PRIu32"\n",
+				pool == POOL_ONE ?
+				one_offset(one_pointer(n)) :
+				two_offset(two_pointer(n)));
+			return;
+		}
+	case 'r':
+		if (!prefixcmp(command, "reset ")) {
+			pool == POOL_ONE ? one_reset() : two_reset();
+			return;
+		}
+	case 's':
+		if (!prefixcmp(command, "set ")) {
+			uint32_t n = strtouint32(arg);
+			if (pool == POOL_ONE)
+				*one_pointer(n) = 1;
+			else
+				*two_pointer(n) = 1;
+			return;
+		}
+	case 't':
+		if (!prefixcmp(command, "test ")) {
+			uint32_t n = strtouint32(arg);
+			printf("%d\n", pool == POOL_ONE ?
+				*one_pointer(n) : *two_pointer(n));
+			return;
+		}
+	default:
+		die("unrecognized command: %s", command);
+	}
+}
+
+static void handle_line(const char *line)
+{
+	const char *arg = strchr(line, ' ');
+	enum pool pool;
+
+	if (arg && !prefixcmp(arg + 1, "one"))
+		pool = POOL_ONE;
+	else if (arg && !prefixcmp(arg + 1, "two"))
+		pool = POOL_TWO;
+	else
+		die("no pool specified: %s", line);
+
+	handle_command(line, pool, arg + strlen("one "));
+}
+
+int main(int argc, char *argv[])
+{
+	struct strbuf sb = STRBUF_INIT;
+	if (argc != 1)
+		usage("test-obj-str < script");
+
+	while (strbuf_getline(&sb, stdin, '\n') != EOF)
+		handle_line(sb.buf);
+	strbuf_release(&sb);
+	return 0;
+}
diff --git a/vcs-svn/obj_pool.h b/vcs-svn/obj_pool.h
new file mode 100644
index 0000000..deb6eb8
--- /dev/null
+++ b/vcs-svn/obj_pool.h
@@ -0,0 +1,61 @@
+/*
+ * Licensed under a two-clause BSD-style license.
+ * See LICENSE for details.
+ */
+
+#ifndef OBJ_POOL_H_
+#define OBJ_POOL_H_
+
+#include "git-compat-util.h"
+
+#define MAYBE_UNUSED __attribute__((__unused__))
+
+#define obj_pool_gen(pre, obj_t, initial_capacity) \
+static struct { \
+	uint32_t committed; \
+	uint32_t size; \
+	uint32_t capacity; \
+	obj_t *base; \
+} pre##_pool = {0, 0, 0, NULL}; \
+static MAYBE_UNUSED uint32_t pre##_alloc(uint32_t count) \
+{ \
+	uint32_t offset; \
+	if (pre##_pool.size + count > pre##_pool.capacity) { \
+		while (pre##_pool.size + count > pre##_pool.capacity) \
+			if (pre##_pool.capacity) \
+				pre##_pool.capacity *= 2; \
+			else \
+				pre##_pool.capacity = initial_capacity; \
+		pre##_pool.base = realloc(pre##_pool.base, \
+					pre##_pool.capacity * sizeof(obj_t)); \
+	} \
+	offset = pre##_pool.size; \
+	pre##_pool.size += count; \
+	return offset; \
+} \
+static MAYBE_UNUSED void pre##_free(uint32_t count) \
+{ \
+	pre##_pool.size -= count; \
+} \
+static MAYBE_UNUSED uint32_t pre##_offset(obj_t *obj) \
+{ \
+	return obj == NULL ? ~0 : obj - pre##_pool.base; \
+} \
+static MAYBE_UNUSED obj_t *pre##_pointer(uint32_t offset) \
+{ \
+	return offset >= pre##_pool.size ? NULL : &pre##_pool.base[offset]; \
+} \
+static MAYBE_UNUSED void pre##_commit(void) \
+{ \
+	pre##_pool.committed = pre##_pool.size; \
+} \
+static MAYBE_UNUSED void pre##_reset(void) \
+{ \
+	free(pre##_pool.base); \
+	pre##_pool.base = NULL; \
+	pre##_pool.size = 0; \
+	pre##_pool.capacity = 0; \
+	pre##_pool.committed = 0; \
+}
+
+#endif
-- 
1.7.2.rc2

^ permalink raw reply related	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2010-07-16 10:17 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-24 10:50 [PATCH/RFC v2 0/9] Subversion dump parsing library Jonathan Nieder
2010-06-24 10:51 ` [PATCH 1/9] Export parse_date_basic() to convert a date string to timestamp Jonathan Nieder
2010-06-24 18:32   ` Ramkumar Ramachandra
2010-06-24 10:52 ` [PATCH 2/9] Introduce vcs-svn lib Jonathan Nieder
2010-06-24 20:27   ` Ramkumar Ramachandra
2010-06-24 10:53 ` [PATCH 3/9] Add memory pool library Jonathan Nieder
2010-06-24 18:43   ` Ramkumar Ramachandra
2010-06-24 18:55     ` Jonathan Nieder
2010-06-24 19:37       ` Ramkumar Ramachandra
2010-06-24 20:06         ` Jonathan Nieder
2010-06-24 20:20           ` Ramkumar Ramachandra
2010-06-24 10:57 ` [PATCH 4/9] Add treap implementation Jonathan Nieder
2010-06-24 19:08   ` Ramkumar Ramachandra
2010-06-24 19:22     ` Jonathan Nieder
2010-06-24 10:58 ` [PATCH 5/9] Add string-specific memory pool Jonathan Nieder
2010-06-24 19:19   ` Ramkumar Ramachandra
2010-06-24 11:01 ` [PATCH 6/9] Add stream helper library Jonathan Nieder
2010-06-24 21:23   ` Ramkumar Ramachandra
2010-06-24 21:29     ` Jonathan Nieder
2010-06-24 11:02 ` [PATCH 7/9] Add infrastructure to write revisions in fast-export format Jonathan Nieder
2010-06-24 19:29   ` Ramkumar Ramachandra
2010-06-24 19:36     ` Jonathan Nieder
2010-06-24 19:49     ` Jonathan Nieder
2010-06-24 21:14       ` Ramkumar Ramachandra
2010-06-24 11:03 ` [PATCH 8/9] Add SVN dump parser Jonathan Nieder
2010-06-24 20:33   ` Ramkumar Ramachandra
2010-06-24 11:07 ` [PATCH 9/9] Add a sample user for the svndump library Jonathan Nieder
2010-06-24 20:17   ` Ramkumar Ramachandra
2010-06-24 20:30     ` Jonathan Nieder
2010-06-24 20:42       ` Ramkumar Ramachandra
2010-06-24 20:52         ` Jonathan Nieder
2010-06-30  2:09   ` Sam Vilain
2010-06-24 13:06 ` [PATCH/RFC v2 0/9] Subversion dump parsing library Ramkumar Ramachandra
2010-06-24 18:24   ` Jonathan Nieder
2010-06-24 21:26   ` Jonathan Nieder
2010-07-15 16:22 [PATCH 0/8] Resurrect rr/svn-export Ramkumar Ramachandra
2010-07-16 10:13 ` Jonathan Nieder
2010-07-16 10:16   ` [PATCH 3/9] Add memory pool library Jonathan Nieder

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.