All of lore.kernel.org
 help / color / mirror / Atom feed
* git-svn SVN 1.7 fix, take 2
@ 2012-07-24 21:46 Michael G Schwern
  2012-07-24 21:51 ` Junio C Hamano
  2012-07-24 22:02 ` Jonathan Nieder
  0 siblings, 2 replies; 14+ messages in thread
From: Michael G Schwern @ 2012-07-24 21:46 UTC (permalink / raw)
  To: git, gitster; +Cc: Robin H. Johnson, Eric Wong, Ben Walton, Jonathan Nieder

It's post OSCON so I can take another crack at this again.

I'm struggling with how best to present all this to you folks.  There's
etiquette for how one presents a git pull request... but there's conflicting
etiquette about how one presents patches to a mailing list.  I'm not sure
which bit of which applies when here.  Documentation/SubmittingPatches focuses
on single patches and basic commit etiquette.

A big one is "do not blast 10 emails to a mailing list" but I gather that's ok
here if a submission needs 10 commits to be well expressed and its done via
git-send-email?  And then if patch #3 needs revision I'm to do it in a rebase
and resend the whole 10 commits?  Am I to think of git-send-email less as a
means of sending patches to a mailing list and more as a git transport mechanism?

I'm trying to bust it up into easier to digest pieces.  I came into this cold
without much knowledge of the problem ("something to do with
canonicalization") and no knowledge of the code.  While each commit is sharp,
the work as a whole is mixed up.

Here's the first pieces, as I see them, along with their branches.  The whole
work is in https://github.com/schwern/git/tree/git-svn/fix-canonical

* Change the Makefile.PL so it automatically finds the .pm files.
https://github.com/schwern/git/tree/git-svn/easier_modules
(Going to remove the Error.pm movement as off-topic)

* Extract each of the internal Git::* packages from inside git-svn.
https://github.com/schwern/git/tree/git-svn/extract_classes

There's five classes, and I did each in at least two commits.  First is a
straight cut & paste with no further changes.  Second (or more) fixes it so
things work again.  This is better for review (if it were done in a single
commit the real change would be lost in the cut & paste), but it means you
have a commit that breaks thing which will be a problem for bisecting.  I'm
inclined to stick with two commits and you folks can squash them if you decide
bisecting is more important.

The Git::SVN extraction is more complicated than the rest, so I'll probably do
that separately and bust it up into a few commits.

Next I'm going to...

1) Submit easier_modules.
2) Break up the Git::SVN fix into more commits.
3) Submit the Git::SVN extraction.


-- 
Reality is that which, when you stop believing in it, doesn't go away.
    -- Phillip K. Dick

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 21:46 git-svn SVN 1.7 fix, take 2 Michael G Schwern
@ 2012-07-24 21:51 ` Junio C Hamano
  2012-07-24 23:06   ` Michael G Schwern
  2012-07-24 22:02 ` Jonathan Nieder
  1 sibling, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2012-07-24 21:51 UTC (permalink / raw)
  To: Michael G Schwern
  Cc: git, Robin H. Johnson, Eric Wong, Ben Walton, Jonathan Nieder

Michael G Schwern <schwern@pobox.com> writes:

> A big one is "do not blast 10 emails to a mailing list" but I gather that's ok
> here if a submission needs 10 commits to be well expressed and its done via
> git-send-email?  And then if patch #3 needs revision I'm to do it in a rebase
> and resend the whole 10 commits?  Am I to think of git-send-email less as a
> means of sending patches to a mailing list and more as a git transport mechanism?

Yes, yes and whatever (even though I think send-email is just a
better MUA/MSA when you want to send patches and isn't restricted
for a _git_ transport, I do not think it matters how you look at it).

> I'm trying to bust it up into easier to digest pieces.  I came into this cold
> without much knowledge of the problem ("something to do with
> canonicalization") and no knowledge of the code.

Perhaps it is a good idea to lurk and see how others submit their
topics first?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 21:46 git-svn SVN 1.7 fix, take 2 Michael G Schwern
  2012-07-24 21:51 ` Junio C Hamano
@ 2012-07-24 22:02 ` Jonathan Nieder
  2012-07-24 22:50   ` Michael G Schwern
  1 sibling, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2012-07-24 22:02 UTC (permalink / raw)
  To: Michael G Schwern; +Cc: git, gitster, Robin H. Johnson, Eric Wong, Ben Walton

Hi,

Michael G Schwern wrote:

> I'm trying to bust it up into easier to digest pieces.

I have a crazy idea.  You might not like it, but maybe it's worth giving
it a try.

[...]
> The Git::SVN extraction is more complicated than the rest, so I'll probably do
> that separately and bust it up into a few commits.

All of these changes are supposed to have zero functional effect,
right?

Could you send the first five patches that *are* supposed to have a
functional effect?  I know that they will not apply cleanly to git-svn
from git "master" or on top of each other; that's fine with me.  If
the approach looks right, interested people (read: probably Ben or I :))
can make the corresponding change in the code layout from "master".
Afterwards, we can look into all that refactoring to make later
changes easier.

What do you think?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 22:02 ` Jonathan Nieder
@ 2012-07-24 22:50   ` Michael G Schwern
  2012-07-24 23:03     ` Jonathan Nieder
  2012-07-24 23:31     ` Jonathan Nieder
  0 siblings, 2 replies; 14+ messages in thread
From: Michael G Schwern @ 2012-07-24 22:50 UTC (permalink / raw)
  To: Jonathan Nieder; +Cc: git, gitster, Robin H. Johnson, Eric Wong, Ben Walton

On 2012.7.24 3:02 PM, Jonathan Nieder wrote:
>> The Git::SVN extraction is more complicated than the rest, so I'll probably do
>> that separately and bust it up into a few commits.
> 
> All of these changes are supposed to have zero functional effect,
> right?

Right.  They're just class extraction refactoring.  I don't even need to make
new classes, they already exist.  It's just moving them into their own file.


> Could you send the first five patches that *are* supposed to have a
> functional effect?  I know that they will not apply cleanly to git-svn
> from git "master" or on top of each other; that's fine with me.  If
> the approach looks right, interested people (read: probably Ben or I :))
> can make the corresponding change in the code layout from "master".
> Afterwards, we can look into all that refactoring to make later
> changes easier.
> 
> What do you think?

I think that would be a lot of extra work for me, create a big mess and be
harder to understand. :-/

Since I'm creating new files to store the classes, the functional changes
cannot be applied without the class extractions, so I'd have to rewrite them.
 And they will be harder to review since the main git-svn code and the class
code is mixed up in one file.  And they won't have unit tests, since git-svn
cannot be loaded without it running.

The Git::SVN extraction isn't really complicated, it just requires a handful
more work than the other classes.  Just a few method extractions.  The rest
are essentially cut, paste & fix lexicals.

I'm not sure what you're going for, but you can glance through the existing
changes here
https://github.com/schwern/git/commits/git-svn/fix-canonical

or you can do grab it as a remote and...

  git log -p schwern/git-svn/extract-classes..schwern/git-svn/fix-canonical

That should give you the information you need... if I understand what you
need.  I feel like I don't and we're talking past each other.


-- 
Ahh email, my old friend.  Do you know that revenge is a dish that is best
served cold?  And it is very cold on the Internet!

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 22:50   ` Michael G Schwern
@ 2012-07-24 23:03     ` Jonathan Nieder
  2012-07-24 23:31     ` Jonathan Nieder
  1 sibling, 0 replies; 14+ messages in thread
From: Jonathan Nieder @ 2012-07-24 23:03 UTC (permalink / raw)
  To: Michael G Schwern; +Cc: git, gitster, Robin H. Johnson, Eric Wong, Ben Walton

Hi again,

Michael G Schwern wrote:
> On 2012.7.24 3:02 PM, Jonathan Nieder wrote:

>> Could you send the first five patches that *are* supposed to have a
>> functional effect?  I know that they will not apply cleanly to git-svn
>> from git "master" or on top of each other; that's fine with me.  If
>> the approach looks right, interested people (read: probably Ben or I :))
>> can make the corresponding change in the code layout from "master".
[...]
> I think that would be a lot of extra work for me, create a big mess and be
> harder to understand. :-/
>
> Since I'm creating new files to store the classes, the functional changes
> cannot be applied without the class extractions, so I'd have to rewrite them.

I don't understand.  Didn't I ask to send the changes as-is and say
that *I or someone else interested* would do the work to get them to
apply?

[...]
> That should give you the information you need... if I understand what you
> need.  I feel like I don't and we're talking past each other.

Basically, you are offering a code fix that's at least worth $500.
Lots of people have wanted the bug fixed for a long time.  As long as
it does not involve sacrificing maintainability, we should apply the
fix as soon as possible!  It's great that you've done this work.

Meanwhile that change is being held hostage by lots of cleanups that
are unquestionably going in the right direction but are going to be a
pain in the neck to safely apply.  And no one has reviewed your fix
that comes _after_ the cleanups.  Maybe the fix goes in a wrong
direction --- we don't know yet.  Maybe once we understand the fix
we'll have ideas that obsolete the previous cleanups and can more
simply accomplish the same thing by organizing the code a different
way.

You are saying that, to make your life easier, we should take your
cleanups and fixes as-is, all in one big pull.  Maybe you're right!
But it will take a lot longer this way than applying a smaller set of
changes that just fixes the bug.

I am saying that that, before anything else, it would be helpful for
us to *see* the relevant patches and understand your fix.  You are
more knowledgeable than anyone else about your code, so presumably it
should be straightforward to pick out which patches are the important
ones.  Using "git format-patch -1 <commit>" you can get a patch for
each.  Then you can use your favorite text editor to edit their
subject lines and change descriptions to describe what they do and
where they fall in the series of patches you are sending.  Finally you
can use your favorite mail user agent (e.g., "git send-email") to send
out the patches.

Jonathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 21:51 ` Junio C Hamano
@ 2012-07-24 23:06   ` Michael G Schwern
  2012-07-24 23:12     ` Jonathan Nieder
  0 siblings, 1 reply; 14+ messages in thread
From: Michael G Schwern @ 2012-07-24 23:06 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: git, Robin H. Johnson, Eric Wong, Ben Walton, Jonathan Nieder

On 2012.7.24 2:51 PM, Junio C Hamano wrote:
> Michael G Schwern <schwern@pobox.com> writes:
> 
>> A big one is "do not blast 10 emails to a mailing list" but I gather that's ok
>> here if a submission needs 10 commits to be well expressed and its done via
>> git-send-email?  And then if patch #3 needs revision I'm to do it in a rebase
>> and resend the whole 10 commits?  Am I to think of git-send-email less as a
>> means of sending patches to a mailing list and more as a git transport mechanism?
> 
> Yes, yes and whatever (even though I think send-email is just a
> better MUA/MSA when you want to send patches and isn't restricted
> for a _git_ transport, I do not think it matters how you look at it).

#3 was not intended as a dig.  If I can think about git-send-email like a
funny way to do a git-push then that fits better in my head.  I worry about
sending too many emails to a list at once.  I don't worry about sending too
many commits in one push.


>> I'm trying to bust it up into easier to digest pieces.  I came into this cold
>> without much knowledge of the problem ("something to do with
>> canonicalization") and no knowledge of the code.
> 
> Perhaps it is a good idea to lurk and see how others submit their
> topics first?

While I use git heavily I'm not invested in working on it.  I work on a lot of
projects.  I'd like to be able to do the work, submit it, work through review,
and get out without joining another mailing list and studying their culture.

Is there a document I could look at for submitting a large body of work, or
could I help improve SubmittingPatches to document the process better?


-- 
I do have a cause though. It's obscenity. I'm for it.
    - Tom Lehrer

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 23:06   ` Michael G Schwern
@ 2012-07-24 23:12     ` Jonathan Nieder
  0 siblings, 0 replies; 14+ messages in thread
From: Jonathan Nieder @ 2012-07-24 23:12 UTC (permalink / raw)
  To: Michael G Schwern
  Cc: Junio C Hamano, git, Robin H. Johnson, Eric Wong, Ben Walton

Michael G Schwern wrote:

> While I use git heavily I'm not invested in working on it.  I work on a lot of
> projects.  I'd like to be able to do the work, submit it, work through review,
> and get out without joining another mailing list and studying their culture.

An alternative approach would be to _just_ explain how your patch
series works, and then say

	"Ok, here is the git branch.  Have fun with it.  If someone wants to
	incorporate it, that's great.  Let me know if you have any questions."

That's totally fine.  Then someone else can submit the patches and
help to massage the patches into a form that is ready for application
on your behalf.

> Is there a document I could look at for submitting a large body of work, or
> could I help improve SubmittingPatches to document the process better?

I thought SubmittingPatches did describe how to send patch series.
Could you point to the section that was lacking?  Or are you just
still in disbelief that people review each patch in a thread, one by
one, sending emails to each other?

Grateful but impatient,
Jonathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 22:50   ` Michael G Schwern
  2012-07-24 23:03     ` Jonathan Nieder
@ 2012-07-24 23:31     ` Jonathan Nieder
  2012-07-24 23:45       ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2012-07-24 23:31 UTC (permalink / raw)
  To: Michael G Schwern; +Cc: git, gitster, Robin H. Johnson, Eric Wong, Ben Walton

Michael G Schwern wrote:

>   git log -p schwern/git-svn/extract-classes..schwern/git-svn/fix-canonical
>
> That should give you the information you need...

I guess so.  May we have your sign-off on these changes?  (A simple
reply of "yes" is enough, no need to resend patches to do this.)

Here it is in patch form for reviewers.  If I understand correctly,
the idea is to replace accesses to $gs->{path} with calls to a
$gs->path function that canonicalizes (and likewise for s/path/url/).

There are probably other subtleties, but that seems to be the gist.
---
 git-svn.perl                          |   96 +++++++--------------
 perl/Git/SVN.pm                       |  148 +++++++++++++++++++++------------
 perl/Git/SVN/Fetcher.pm               |    2 +-
 perl/Git/SVN/Migration.pm             |    6 +-
 perl/Git/SVN/Ra.pm                    |   96 ++++++++++++---------
 perl/Git/SVN/Utils.pm                 |  123 ++++++++++++++++++++++++++-
 t/Git-SVN/Ra/fix_dir.t                |   24 ++++++
 t/lib-git-svn.sh                      |    2 +
 t/t9118-git-svn-funky-branch-names.sh |    6 +-
 t/t9119-git-svn-info.sh               |    7 +-
 10 files changed, 341 insertions(+), 169 deletions(-)

diff --git a/git-svn.perl b/git-svn.perl
index 7b54f43f..74aeb4df 100755
--- a/git-svn.perl
+++ b/git-svn.perl
@@ -12,7 +12,12 @@ $VERSION = '@@GIT_VERSION@@';
 
 use Git::SVN;
 
-use Git::SVN::Utils qw(fatal can_compress);
+use Git::SVN::Utils qw(
+    fatal
+    can_compress
+    canonicalize_path
+    canonicalize_url
+);
 
 # From which subdir have we been invoked?
 my $cmd_dir_prefix = eval {
@@ -100,8 +105,6 @@ BEGIN {
 	Memoize::memoize 'Git::config_bool';
 }
 
-my ($SVN);
-
 $sha1 = qr/[a-f\d]{40}/;
 $sha1_short = qr/[a-f\d]{4,40}/;
 my ($_stdin, $_help, $_edit,
@@ -518,7 +521,6 @@ sub cmd_init {
 	}
 	my $url = shift or die "SVN repository location required ",
 	                       "as a command-line argument\n";
-	$url = canonicalize_url($url);
 	init_subdir(@_);
 	do_git_init_db();
 
@@ -1199,7 +1201,7 @@ sub cmd_show_ignore {
 	my ($url, $rev, $uuid, $gs) = working_head_info('HEAD');
 	$gs ||= Git::SVN->new;
 	my $r = (defined $_revision ? $_revision : $gs->ra->get_latest_revnum);
-	$gs->prop_walk($gs->{path}, $r, sub {
+	$gs->prop_walk($gs->path, $r, sub {
 		my ($gs, $path, $props) = @_;
 		print STDOUT "\n# $path\n";
 		my $s = $props->{'svn:ignore'} or return;
@@ -1215,7 +1217,7 @@ sub cmd_show_externals {
 	my ($url, $rev, $uuid, $gs) = working_head_info('HEAD');
 	$gs ||= Git::SVN->new;
 	my $r = (defined $_revision ? $_revision : $gs->ra->get_latest_revnum);
-	$gs->prop_walk($gs->{path}, $r, sub {
+	$gs->prop_walk($gs->path, $r, sub {
 		my ($gs, $path, $props) = @_;
 		print STDOUT "\n# $path\n";
 		my $s = $props->{'svn:externals'} or return;
@@ -1230,7 +1232,7 @@ sub cmd_create_ignore {
 	my ($url, $rev, $uuid, $gs) = working_head_info('HEAD');
 	$gs ||= Git::SVN->new;
 	my $r = (defined $_revision ? $_revision : $gs->ra->get_latest_revnum);
-	$gs->prop_walk($gs->{path}, $r, sub {
+	$gs->prop_walk($gs->path, $r, sub {
 		my ($gs, $path, $props) = @_;
 		# $path is of the form /path/to/dir/
 		$path = '.' . $path;
@@ -1260,36 +1262,12 @@ sub cmd_mkdirs {
 	$gs->mkemptydirs($_revision);
 }
 
-sub canonicalize_path {
-	my ($path) = @_;
-	my $dot_slash_added = 0;
-	if (substr($path, 0, 1) ne "/") {
-		$path = "./" . $path;
-		$dot_slash_added = 1;
-	}
-	# File::Spec->canonpath doesn't collapse x/../y into y (for a
-	# good reason), so let's do this manually.
-	$path =~ s#/+#/#g;
-	$path =~ s#/\.(?:/|$)#/#g;
-	$path =~ s#/[^/]+/\.\.##g;
-	$path =~ s#/$##g;
-	$path =~ s#^\./## if $dot_slash_added;
-	$path =~ s#^/##;
-	$path =~ s#^\.$##;
-	return $path;
-}
-
-sub canonicalize_url {
-	my ($url) = @_;
-	$url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
-	return $url;
-}
-
 # get_svnprops(PATH)
 # ------------------
 # Helper for cmd_propget and cmd_proplist below.
 sub get_svnprops {
 	my $path = shift;
+
 	my ($url, $rev, $uuid, $gs) = working_head_info('HEAD');
 	$gs ||= Git::SVN->new;
 
@@ -1298,7 +1276,8 @@ sub get_svnprops {
 	$path = $cmd_dir_prefix . $path;
 	fatal("No such file or directory: $path") unless -e $path;
 	my $is_dir = -d $path ? 1 : 0;
-	$path = $gs->{path} . '/' . $path;
+
+	$path = $gs->path . '/' . $path if $gs->path;
 
 	# canonicalize the path (otherwise libsvn will abort or fail to
 	# find the file)
@@ -1399,8 +1378,8 @@ sub cmd_commit_diff {
 			fatal("Needed URL or usable git-svn --id in ",
 			      "the command-line\n", $usage);
 		}
-		$url = $gs->{url};
-		$svn_path = $gs->{path};
+		$url = $gs->url;
+		$svn_path = $gs->path;
 	}
 	unless (defined $_revision) {
 		fatal("-r|--revision is a required argument\n", $usage);
@@ -1434,25 +1413,6 @@ sub cmd_commit_diff {
 	}
 }
 
-sub escape_uri_only {
-	my ($uri) = @_;
-	my @tmp;
-	foreach (split m{/}, $uri) {
-		s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
-		push @tmp, $_;
-	}
-	join('/', @tmp);
-}
-
-sub escape_url {
-	my ($url) = @_;
-	if ($url =~ m#^([^:]+)://([^/]*)(.*)$#) {
-		my ($scheme, $domain, $uri) = ($1, $2, escape_uri_only($3));
-		$url = "$scheme://$domain$uri";
-	}
-	$url;
-}
-
 sub cmd_info {
 	my $path = canonicalize_path(defined($_[0]) ? $_[0] : ".");
 	my $fullpath = canonicalize_path($cmd_dir_prefix . $path);
@@ -1476,21 +1436,21 @@ sub cmd_info {
 	# canonicalize_path() will return "" to make libsvn 1.5.x happy,
 	$path = "." if $path eq "";
 
-	my $full_url = $url . ($fullpath eq "" ? "" : "/$fullpath");
+	my $full_url = canonicalize_url( $url . ($fullpath eq "" ? "" : "/$fullpath") );
 
 	if ($_url) {
-		print escape_url($full_url), "\n";
+		print $full_url, "\n";
 		return;
 	}
 
 	my $result = "Path: $path\n";
 	$result .= "Name: " . basename($path) . "\n" if $file_type ne "dir";
-	$result .= "URL: " . escape_url($full_url) . "\n";
+	$result .= "URL: $full_url\n";
 
 	eval {
 		my $repos_root = $gs->repos_root;
 		Git::SVN::remove_username($repos_root);
-		$result .= "Repository Root: " . escape_url($repos_root) . "\n";
+		$result .= "Repository Root: " . canonicalize_url($repos_root) . "\n";
 	};
 	if ($@) {
 		$result .= "Repository Root: (offline)\n";
@@ -1638,7 +1598,10 @@ sub post_fetch_checkout {
 
 sub complete_svn_url {
 	my ($url, $path) = @_;
-	$path =~ s#/+$##;
+
+	$path = canonicalize_path($path);
+
+	# The path is not a URL
 	if ($path !~ m#^[a-z\+]+://#) {
 		if (!defined $url || $url !~ m#^[a-z\+]+://#) {
 			fatal("E: '$path' is not a complete URL ",
@@ -1655,7 +1618,8 @@ sub complete_url_ls_init {
 		print STDERR "W: $switch not specified\n";
 		return;
 	}
-	$repo_path =~ s#/+$##;
+
+	$repo_path = canonicalize_path($repo_path);
 	if ($repo_path =~ m#^[a-z\+]+://#) {
 		$ra = Git::SVN::Ra->new($repo_path);
 		$repo_path = '';
@@ -1666,18 +1630,18 @@ sub complete_url_ls_init {
 			      "and a separate URL is not specified");
 		}
 	}
-	my $url = $ra->{url};
+	my $url = $ra->url;
 	my $gs = Git::SVN->init($url, undef, undef, undef, 1);
 	my $k = "svn-remote.$gs->{repo_id}.url";
 	my $orig_url = eval { command_oneline(qw/config --get/, $k) };
-	if ($orig_url && ($orig_url ne $gs->{url})) {
+	if ($orig_url && ($orig_url ne $gs->url)) {
 		die "$k already set: $orig_url\n",
-		    "wanted to set to: $gs->{url}\n";
+		    "wanted to set to: $gs->url\n";
 	}
-	command_oneline('config', $k, $gs->{url}) unless $orig_url;
-	my $remote_path = "$gs->{path}/$repo_path";
+	command_oneline('config', $k, $gs->url) unless $orig_url;
+
+	my $remote_path = canonicalize_path($gs->path . "/$repo_path");
 	$remote_path =~ s{%([0-9A-F]{2})}{chr hex($1)}ieg;
-	$remote_path =~ s#/+#/#g;
 	$remote_path =~ s#^/##g;
 	$remote_path .= "/*" if $remote_path !~ /\*/;
 	my ($n) = ($switch =~ /^--(\w+)/);
diff --git a/perl/Git/SVN.pm b/perl/Git/SVN.pm
index 247ee1d0..ebfa8da5 100644
--- a/perl/Git/SVN.pm
+++ b/perl/Git/SVN.pm
@@ -23,7 +23,12 @@ BEGIN {
 }
 
 use Git qw(command command_oneline command_noisy command_output_pipe command_close_pipe);
-use Git::SVN::Utils qw(fatal can_compress);
+use Git::SVN::Utils qw(
+    fatal
+    can_compress
+    canonicalize_url
+    canonicalize_path
+);
 
 our $_follow_parent  = 1;
 our $_minimize_url   = 'unset';
@@ -310,12 +315,12 @@ sub init_remote_config {
 				print STDERR "Using higher level of URL: ",
 					     "$url => $min_url\n";
 			}
-			my $old_path = $self->{path};
-			$self->{path} = $url;
-			$self->{path} =~ s!^\Q$min_url\E(/|$)!!;
+			my $old_path = $self->path;
+			$url =~ s!^\Q$min_url\E(/|$)!!;
 			if (length $old_path) {
-				$self->{path} .= "/$old_path";
+				$url .= "/$old_path";
 			}
+			$self->path($url);
 			$url = $min_url;
 		}
 	}
@@ -339,13 +344,15 @@ sub init_remote_config {
 	unless ($no_write) {
 		command_noisy('config',
 			      "svn-remote.$self->{repo_id}.url", $url);
-		$self->{path} =~ s{^/}{};
-		$self->{path} =~ s{%([0-9A-F]{2})}{chr hex($1)}ieg;
+		my $path = $self->path;
+		$path =~ s{^/}{};
+		$path =~ s{%([0-9A-F]{2})}{chr hex($1)}ieg;
+		$self->path($path);
 		command_noisy('config', '--add',
 			      "svn-remote.$self->{repo_id}.fetch",
-			      "$self->{path}:".$self->refname);
+			      $self->path.":".$self->refname);
 	}
-	$self->{url} = $url;
+	$self->url($url);
 }
 
 sub find_by_url { # repos_root and, path are optional
@@ -431,20 +438,19 @@ sub new {
 		}
 	}
 	my $self = _new($class, $repo_id, $ref_id, $path);
-	if (!defined $self->{path} || !length $self->{path}) {
+	if (!defined $self->path || !length $self->path) {
 		my $fetch = command_oneline('config', '--get',
 		                            "svn-remote.$repo_id.fetch",
 		                            ":$ref_id\$") or
 		     die "Failed to read \"svn-remote.$repo_id.fetch\" ",
 		         "\":$ref_id\$\" in config\n";
-		($self->{path}, undef) = split(/\s*:\s*/, $fetch);
+		my($path) = split(/\s*:\s*/, $fetch);
+		$self->path($path);
 	}
-	$self->{path} =~ s{/+}{/}g;
-	$self->{path} =~ s{\A/}{};
-	$self->{path} =~ s{/\z}{};
-	$self->{url} = command_oneline('config', '--get',
-	                               "svn-remote.$repo_id.url") or
+	my $url = command_oneline('config', '--get',
+	                          "svn-remote.$repo_id.url") or
                   die "Failed to read \"svn-remote.$repo_id.url\" in config\n";
+	$self->url($url);
 	$self->{pushurl} = eval { command_oneline('config', '--get',
 	                          "svn-remote.$repo_id.pushurl") };
 	$self->rebuild;
@@ -548,7 +554,7 @@ sub _set_svm_vars {
 		# username is of no interest
 		$src =~ s{(^[a-z\+]*://)[^/@]*@}{$1};
 
-		my $replace = $ra->{url};
+		my $replace = $ra->url;
 		$replace .= "/$path" if length $path;
 
 		my $section = "svn-remote.$self->{repo_id}";
@@ -563,20 +569,21 @@ sub _set_svm_vars {
 	}
 
 	my $r = $ra->get_latest_revnum;
-	my $path = $self->{path};
+	my $path = $self->path;
 	my %tried;
 	while (length $path) {
-		unless ($tried{"$self->{url}/$path"}) {
+		my $try = $self->url . "/$path";
+		unless ($tried{$try}) {
 			return $ra if $self->read_svm_props($ra, $path, $r);
-			$tried{"$self->{url}/$path"} = 1;
+			$tried{$try} = 1;
 		}
 		$path =~ s#/?[^/]+$##;
 	}
 	die "Path: '$path' should be ''\n" if $path ne '';
 	return $ra if $self->read_svm_props($ra, $path, $r);
-	$tried{"$self->{url}/$path"} = 1;
+	$tried{$self->url."/$path"} = 1;
 
-	if ($ra->{repos_root} eq $self->{url}) {
+	if ($ra->{repos_root} eq $self->url) {
 		die @err, (map { "  $_\n" } keys %tried), "\n";
 	}
 
@@ -586,20 +593,21 @@ sub _set_svm_vars {
 	$path = $ra->{svn_path};
 	$ra = Git::SVN::Ra->new($ra->{repos_root});
 	while (length $path) {
-		unless ($tried{"$ra->{url}/$path"}) {
+		my $try = $ra->url ."/$path";
+		unless ($tried{$try}) {
 			$ok = $self->read_svm_props($ra, $path, $r);
 			last if $ok;
-			$tried{"$ra->{url}/$path"} = 1;
+			$tried{$try} = 1;
 		}
 		$path =~ s#/?[^/]+$##;
 	}
 	die "Path: '$path' should be ''\n" if $path ne '';
 	$ok ||= $self->read_svm_props($ra, $path, $r);
-	$tried{"$ra->{url}/$path"} = 1;
+	$tried{$ra->url ."/$path"} = 1;
 	if (!$ok) {
 		die @err, (map { "  $_\n" } keys %tried), "\n";
 	}
-	Git::SVN::Ra->new($self->{url});
+	Git::SVN::Ra->new($self->url);
 }
 
 sub svnsync {
@@ -666,7 +674,7 @@ sub ra_uuid {
 		if (!$@ && $uuid && $uuid =~ /^([a-f\d\-]{30,})$/i) {
 			$self->{ra_uuid} = $uuid;
 		} else {
-			die "ra_uuid called without URL\n" unless $self->{url};
+			die "ra_uuid called without URL\n" unless $self->url;
 			$self->{ra_uuid} = $self->ra->get_uuid;
 			tmp_config('--add', $key, $self->{ra_uuid});
 		}
@@ -690,7 +698,7 @@ sub repos_root {
 
 sub ra {
 	my ($self) = shift;
-	my $ra = Git::SVN::Ra->new($self->{url});
+	my $ra = Git::SVN::Ra->new($self->url);
 	$self->_set_repos_root($ra->{repos_root});
 	if ($self->use_svm_props && !$self->{svm}) {
 		if ($self->no_metadata) {
@@ -724,7 +732,7 @@ sub prop_walk {
 	$path =~ s#^/*#/#g;
 	my $p = $path;
 	# Strip the irrelevant part of the path.
-	$p =~ s#^/+\Q$self->{path}\E(/|$)#/#;
+	$p =~ s#^/+\Q@{[$self->path]}\E(/|$)#/#;
 	# Ensure the path is terminated by a `/'.
 	$p =~ s#/*$#/#;
 
@@ -745,7 +753,7 @@ sub prop_walk {
 
 	foreach (sort keys %$dirent) {
 		next if $dirent->{$_}->{kind} != $SVN::Node::dir;
-		$self->prop_walk($self->{path} . $p . $_, $rev, $sub);
+		$self->prop_walk($self->path . $p . $_, $rev, $sub);
 	}
 }
 
@@ -915,20 +923,20 @@ sub rewrite_uuid {
 
 sub metadata_url {
 	my ($self) = @_;
-	($self->rewrite_root || $self->{url}) .
-	   (length $self->{path} ? '/' . $self->{path} : '');
+	($self->rewrite_root || $self->url) .
+	   (length $self->path ? '/' . $self->path : '');
 }
 
 sub full_url {
 	my ($self) = @_;
-	$self->{url} . (length $self->{path} ? '/' . $self->{path} : '');
+	return canonicalize_url( $self->url . (length $self->path ? '/' . $self->path : '') );
 }
 
 sub full_pushurl {
 	my ($self) = @_;
 	if ($self->{pushurl}) {
-		return $self->{pushurl} . (length $self->{path} ? '/' .
-		       $self->{path} : '');
+		return $self->{pushurl} . (length $self->path ? '/' .
+		       $self->path : '');
 	} else {
 		return $self->full_url;
 	}
@@ -1044,20 +1052,20 @@ sub do_git_commit {
 
 sub match_paths {
 	my ($self, $paths, $r) = @_;
-	return 1 if $self->{path} eq '';
-	if (my $path = $paths->{"/$self->{path}"}) {
+	return 1 if $self->path eq '';
+	if (my $path = $paths->{"/".$self->path}) {
 		return ($path->{action} eq 'D') ? 0 : 1;
 	}
-	$self->{path_regex} ||= qr/^\/\Q$self->{path}\E\//;
+	$self->{path_regex} ||= qr{^/\Q@{[$self->path]}\E/};
 	if (grep /$self->{path_regex}/, keys %$paths) {
 		return 1;
 	}
 	my $c = '';
-	foreach (split m#/#, $self->{path}) {
+	foreach (split m#/#, $self->path) {
 		$c .= "/$_";
 		next unless ($paths->{$c} &&
 		             ($paths->{$c}->{action} =~ /^[AR]$/));
-		if ($self->ra->check_path($self->{path}, $r) ==
+		if ($self->ra->check_path($self->path, $r) ==
 		    $SVN::Node::dir) {
 			return 1;
 		}
@@ -1071,14 +1079,14 @@ sub find_parent_branch {
 	unless (defined $paths) {
 		my $err_handler = $SVN::Error::handler;
 		$SVN::Error::handler = \&Git::SVN::Ra::skip_unknown_revs;
-		$self->ra->get_log([$self->{path}], $rev, $rev, 0, 1, 1,
+		$self->ra->get_log([$self->path], $rev, $rev, 0, 1, 1,
 				   sub { $paths = $_[0] });
 		$SVN::Error::handler = $err_handler;
 	}
 	return undef unless defined $paths;
 
 	# look for a parent from another branch:
-	my @b_path_components = split m#/#, $self->{path};
+	my @b_path_components = split m#/#, $self->path;
 	my @a_path_components;
 	my $i;
 	while (@b_path_components) {
@@ -1089,14 +1097,12 @@ sub find_parent_branch {
 	return undef unless defined $i && defined $i->{copyfrom_path};
 	my $branch_from = $i->{copyfrom_path};
 	if (@a_path_components) {
-		print STDERR "branch_from: $branch_from => ";
 		$branch_from .= '/'.join('/', @a_path_components);
-		print STDERR $branch_from, "\n";
 	}
 	my $r = $i->{copyfrom_rev};
 	my $repos_root = $self->ra->{repos_root};
-	my $url = $self->ra->{url};
-	my $new_url = $url . $branch_from;
+	my $url = $self->ra->url;
+	my $new_url = canonicalize_url($url . $branch_from);
 	print STDERR  "Found possible branch point: ",
 	              "$new_url => ", $self->full_url, ", $r\n"
 	              unless $::_q > 1;
@@ -1110,7 +1116,7 @@ sub find_parent_branch {
 			($base, $head) = parse_revision_argument(0, $r);
 		} else {
 			if ($r0 < $r) {
-				$gs->ra->get_log([$gs->{path}], $r0 + 1, $r, 1,
+				$gs->ra->get_log([$gs->path], $r0 + 1, $r, 1,
 					0, 1, sub { $base = $_[1] - 1 });
 			}
 		}
@@ -1132,7 +1138,7 @@ sub find_parent_branch {
 			# at the moment), so we can't rely on it
 			$self->{last_rev} = $r0;
 			$self->{last_commit} = $parent;
-			$ed = Git::SVN::Fetcher->new($self, $gs->{path});
+			$ed = Git::SVN::Fetcher->new($self, $gs->path);
 			$gs->ra->gs_do_switch($r0, $rev, $gs,
 					      $self->full_url, $ed)
 			  or die "SVN connection failed somewhere...\n";
@@ -1231,7 +1237,7 @@ sub mkemptydirs {
 		close $fh;
 	}
 
-	my $strip = qr/\A\Q$self->{path}\E(?:\/|$)/;
+	my $strip = qr/\A\Q@{[$self->path]}\E(?:\/|$)/;
 	foreach my $d (sort keys %empty_dirs) {
 		$d = uri_decode($d);
 		$d =~ s/$strip//;
@@ -1425,7 +1431,7 @@ sub find_extra_svk_parents {
 	for my $ticket ( @tickets ) {
 		my ($uuid, $path, $rev) = split /:/, $ticket;
 		if ( $uuid eq $self->ra_uuid ) {
-			my $url = $self->{url};
+			my $url = $self->url;
 			my $repos_root = $url;
 			my $branch_from = $path;
 			$branch_from =~ s{^/}{};
@@ -1671,7 +1677,7 @@ sub find_extra_svn_parents {
 	# are now marked as merge, we can add the tip as a parent.
 	my @merges = split "\n", $mergeinfo;
 	my @merge_tips;
-	my $url = $self->{url};
+	my $url = $self->url;
 	my $uuid = $self->ra_uuid;
 	my %ranges;
 	for my $merge ( @merges ) {
@@ -1854,7 +1860,7 @@ sub make_log_entry {
 		$commit_email ||= "$author\@$uuid";
 	} elsif ($self->use_svnsync_props) {
 		my $full_url = $self->svnsync->{url};
-		$full_url .= "/$self->{path}" if length $self->{path};
+		$full_url .= "/".$self->path if length $self->path;
 		remove_username($full_url);
 		my $uuid = $self->svnsync->{uuid};
 		$log_entry{metadata} = "$full_url\@$rev $uuid";
@@ -1901,7 +1907,7 @@ sub set_tree {
 	                tree_b => $tree,
 	                editor_cb => sub {
 			       $self->set_tree_cb($log_entry, $tree, @_) },
-	                svn_path => $self->{path} );
+	                svn_path => $self->path );
 	if (!Git::SVN::Editor->new(\%ed_opts)->apply_diff) {
 		print "No changes\nr$self->{last_rev} = $tree\n";
 	}
@@ -2272,12 +2278,44 @@ sub _new {
 
 	$_[3] = $path = '' unless (defined $path);
 	mkpath([$dir]);
-	bless {
+	my $obj = bless {
 		ref_id => $ref_id, dir => $dir, index => "$dir/index",
-	        path => $path, config => "$ENV{GIT_DIR}/svn/config",
+	        config => "$ENV{GIT_DIR}/svn/config",
 	        map_root => "$dir/.rev_map", repo_id => $repo_id }, $class;
+
+	# Ensure it gets canonicalized
+	$obj->path($path);
+
+	return $obj;
+}
+
+
+sub path {
+    my $self = shift;
+
+    if( @_ ) {
+        my $path = shift;
+        $self->{path} = canonicalize_path($path);
+        return;
+    }
+
+    return $self->{path};
 }
 
+
+sub url {
+    my $self = shift;
+
+    if( @_ ) {
+        my $url = shift;
+        $self->{url} = canonicalize_url($url);
+        return;
+    }
+
+    return $self->{url};
+}
+
+
 # for read-only access of old .rev_db formats
 sub unlink_rev_db_symlink {
 	my ($self) = @_;
diff --git a/perl/Git/SVN/Fetcher.pm b/perl/Git/SVN/Fetcher.pm
index 69486efe..b606fe5a 100644
--- a/perl/Git/SVN/Fetcher.pm
+++ b/perl/Git/SVN/Fetcher.pm
@@ -84,7 +84,7 @@ sub _mark_empty_symlinks {
 	chomp(my $empty_blob = `git hash-object -t blob --stdin < /dev/null`);
 	my ($ls, $ctx) = command_output_pipe(qw/ls-tree -r -z/, $cmt);
 	local $/ = "\0";
-	my $pfx = defined($switch_path) ? $switch_path : $git_svn->{path};
+	my $pfx = defined($switch_path) ? $switch_path : $git_svn->path;
 	$pfx .= '/' if length($pfx);
 	while (<$ls>) {
 		chomp;
diff --git a/perl/Git/SVN/Migration.pm b/perl/Git/SVN/Migration.pm
index b17fe003..5a9d3c71 100644
--- a/perl/Git/SVN/Migration.pm
+++ b/perl/Git/SVN/Migration.pm
@@ -180,14 +180,14 @@ sub minimize_connections {
 		my $ra = Git::SVN::Ra->new($url);
 
 		# skip existing cases where we already connect to the root
-		if (($ra->{url} eq $ra->{repos_root}) ||
+		if (($ra->url eq $ra->{repos_root}) ||
 		    ($ra->{repos_root} eq $repo_id)) {
-			$root_repos->{$ra->{url}} = $repo_id;
+			$root_repos->{$ra->url} = $repo_id;
 			next;
 		}
 
 		my $root_ra = Git::SVN::Ra->new($ra->{repos_root});
-		my $root_path = $ra->{url};
+		my $root_path = $ra->url;
 		$root_path =~ s#^\Q$ra->{repos_root}\E(/|$)##;
 		foreach my $path (keys %$fetch) {
 			my $ref_id = $fetch->{$path};
diff --git a/perl/Git/SVN/Ra.pm b/perl/Git/SVN/Ra.pm
index 23ff43e8..7bd6db6f 100644
--- a/perl/Git/SVN/Ra.pm
+++ b/perl/Git/SVN/Ra.pm
@@ -8,6 +8,11 @@ BEGIN {
 	@ISA = qw(SVN::Ra);
 }
 
+use Git::SVN::Utils qw(
+    canonicalize_url
+    canonicalize_path
+);
+
 my ($ra_invalid, $can_do_switch, %ignored_err, $RA);
 
 BEGIN {
@@ -62,29 +67,10 @@ sub _auth_providers () {
 	\@rv;
 }
 
-sub escape_uri_only {
-	my ($uri) = @_;
-	my @tmp;
-	foreach (split m{/}, $uri) {
-		s/([^~\w.%+-]|%(?![a-fA-F0-9]{2}))/sprintf("%%%02X",ord($1))/eg;
-		push @tmp, $_;
-	}
-	join('/', @tmp);
-}
-
-sub escape_url {
-	my ($url) = @_;
-	if ($url =~ m#^(https?)://([^/]+)(.*)$#) {
-		my ($scheme, $domain, $uri) = ($1, $2, escape_uri_only($3));
-		$url = "$scheme://$domain$uri";
-	}
-	$url;
-}
-
 sub new {
 	my ($class, $url) = @_;
 	$url =~ s!/+$!!;
-	return $RA if ($RA && $RA->{url} eq $url);
+	return $RA if ($RA && $RA->url eq $url);
 
 	::_req_svn();
 
@@ -115,19 +101,37 @@ sub new {
 			$Git::SVN::Prompt::_no_auth_cache = 1;
 		}
 	} # no warnings 'once'
-	my $self = SVN::Ra->new(url => escape_url($url), auth => $baton,
+	my $self = SVN::Ra->new(url => canonicalize_url($url), auth => $baton,
 	                      config => $config,
 			      pool => SVN::Pool->new,
 	                      auth_provider_callbacks => $callbacks);
-	$self->{url} = $url;
+	$RA = bless $self, $class;
+
+	# Make sure its canonicalized
+	$self->url($url);
 	$self->{svn_path} = $url;
 	$self->{repos_root} = $self->get_repos_root;
 	$self->{svn_path} =~ s#^\Q$self->{repos_root}\E(/|$)##;
 	$self->{cache} = { check_path => { r => 0, data => {} },
 	                   get_dir => { r => 0, data => {} } };
-	$RA = bless $self, $class;
+
+	return $RA;
 }
 
+
+sub url {
+    my $self = shift;
+
+    if( @_ ) {
+        my $url = shift;
+        $self->{url} = canonicalize_url($url);
+        return;
+    }
+
+    return $self->{url};
+}
+
+
 sub check_path {
 	my ($self, $path, $r) = @_;
 	my $cache = $self->{cache}->{check_path};
@@ -146,6 +150,9 @@ sub check_path {
 
 sub get_dir {
 	my ($self, $dir, $r) = @_;
+
+        $dir = $self->fix_dir($dir);
+
 	my $cache = $self->{cache}->{get_dir};
 	if ($r == $cache->{r}) {
 		if (my $x = $cache->{data}->{$dir}) {
@@ -164,6 +171,20 @@ sub get_dir {
 	wantarray ? (\%dirents, $r, $props) : \%dirents;
 }
 
+sub fix_dir {
+    my $self = shift;
+    my $dir = shift;
+
+    # SVN::Ra->get_dir will fail if given foo/.. and the
+    # canonicalize functions will not fix that.
+    $dir =~ s{[^/]+/\.\./?}{}g;
+
+    # It also doesn't like a single dot.
+    $dir =~ s{^\.$}{};
+
+    return $dir;
+}
+
 sub DESTROY {
 	# do not call the real DESTROY since we store ourselves in $RA
 }
@@ -195,6 +216,7 @@ sub get_log {
 				qw/copyfrom_path copyfrom_rev action/;
 			if ($s{'copyfrom_path'}) {
 				$s{'copyfrom_path'} =~ s/$prefix_regex//;
+				$s{'copyfrom_path'} = canonicalize_path($s{'copyfrom_path'});
 			}
 			$_[0]{$p} = \%s;
 		}
@@ -246,7 +268,7 @@ sub get_commit_editor {
 sub gs_do_update {
 	my ($self, $rev_a, $rev_b, $gs, $editor) = @_;
 	my $new = ($rev_a == $rev_b);
-	my $path = $gs->{path};
+	my $path = $gs->path;
 
 	if ($new && -e $gs->{index}) {
 		unlink $gs->{index} or die
@@ -282,17 +304,17 @@ sub gs_do_update {
 # svn_ra_reparent didn't work before 1.4)
 sub gs_do_switch {
 	my ($self, $rev_a, $rev_b, $gs, $url_b, $editor) = @_;
-	my $path = $gs->{path};
+	my $path = $gs->path;
 	my $pool = SVN::Pool->new;
 
-	my $full_url = $self->{url};
+	my $full_url = $self->url;
 	my $old_url = $full_url;
-	$full_url .= '/' . $path if length $path;
+	$full_url = canonicalize_url( $full_url . '/' . $path ) if length $path;
 	my ($ra, $reparented);
 
 	if ($old_url =~ m#^svn(\+ssh)?://# ||
 	    ($full_url =~ m#^https?://# &&
-	     escape_url($full_url) ne $full_url)) {
+	     $full_url ne $full_url)) {
 		$_[0] = undef;
 		$self = undef;
 		$RA = undef;
@@ -300,12 +322,12 @@ sub gs_do_switch {
 		$ra_invalid = 1;
 	} elsif ($old_url ne $full_url) {
 		SVN::_Ra::svn_ra_reparent($self->{session}, $full_url, $pool);
-		$self->{url} = $full_url;
+		$self->url($full_url);
 		$reparented = 1;
 	}
 
 	$ra ||= $self;
-	$url_b = escape_url($url_b);
+	$url_b = canonicalize_url($url_b);
 	my $reporter = $ra->do_switch($rev_b, '', 1, $url_b, $editor, $pool);
 	my @lock = (::compare_svn_version('1.2.0') >= 0) ? (undef) : ();
 	$reporter->set_path('', $rev_a, 0, @lock, $pool);
@@ -313,7 +335,7 @@ sub gs_do_switch {
 
 	if ($reparented) {
 		SVN::_Ra::svn_ra_reparent($self->{session}, $old_url, $pool);
-		$self->{url} = $old_url;
+		$self->url($old_url);
 	}
 
 	$pool->clear;
@@ -326,7 +348,7 @@ sub longest_common_path {
 	my $common_max = scalar @$gsv;
 
 	foreach my $gs (@$gsv) {
-		my @tmp = split m#/#, $gs->{path};
+		my @tmp = split m#/#, $gs->path;
 		my $p = '';
 		foreach (@tmp) {
 			$p .= length($p) ? "/$_" : $_;
@@ -362,7 +384,7 @@ sub gs_fetch_loop_common {
 	my $inc = $_log_window_size;
 	my ($min, $max) = ($base, $head < $base + $inc ? $head : $base + $inc);
 	my $longest_path = longest_common_path($gsv, $globs);
-	my $ra_url = $self->{url};
+	my $ra_url = $self->url;
 	my $find_trailing_edge;
 	while (1) {
 		my %revs;
@@ -508,7 +530,7 @@ sub match_globs {
 				 ($self->check_path($p, $r) !=
 				  $SVN::Node::dir));
 			next unless $p =~ /$g->{path}->{regex}/;
-			$exists->{$p} = Git::SVN->init($self->{url}, $p, undef,
+			$exists->{$p} = Git::SVN->init($self->url, $p, undef,
 					 $g->{ref}->full_path($de), 1);
 		}
 	}
@@ -532,7 +554,7 @@ sub match_globs {
 			next if ($self->check_path($pathname, $r) !=
 			         $SVN::Node::dir);
 			$exists->{$pathname} = Git::SVN->init(
-			                      $self->{url}, $pathname, undef,
+			                      $self->url, $pathname, undef,
 			                      $g->{ref}->full_path($p), 1);
 		}
 		my $c = '';
@@ -548,7 +570,7 @@ sub match_globs {
 
 sub minimize_url {
 	my ($self) = @_;
-	return $self->{url} if ($self->{url} eq $self->{repos_root});
+	return $self->url if ($self->url eq $self->{repos_root});
 	my $url = $self->{repos_root};
 	my @components = split(m!/!, $self->{svn_path});
 	my $c = '';
@@ -568,7 +590,7 @@ sub can_do_switch {
 	unless (defined $can_do_switch) {
 		my $pool = SVN::Pool->new;
 		my $rep = eval {
-			$self->do_switch(1, '', 0, $self->{url},
+			$self->do_switch(1, '', 0, $self->url,
 			                 SVN::Delta::Editor->new, $pool);
 		};
 		if ($@) {
diff --git a/perl/Git/SVN/Utils.pm b/perl/Git/SVN/Utils.pm
index d8f63e68..e7087435 100644
--- a/perl/Git/SVN/Utils.pm
+++ b/perl/Git/SVN/Utils.pm
@@ -5,10 +5,54 @@ use warnings;
 
 use base qw(Exporter);
 
-our @EXPORT_OK = qw(fatal can_compress);
+our @EXPORT_OK = qw(
+    fatal
+    can_compress
+    canonicalize_path
+    canonicalize_url
+);
 
+
+=head1 NAME
+
+Git::SVN::Utils - utility functions used across Git::SVN
+
+=head1 SYNOPSIS
+
+    use Git::SVN::Utils qw(functions to import);
+
+=head1 DESCRIPTION
+
+This module contains functions which are useful across many different
+parts of Git::SVN.  Mostly it's a place to put utility functions
+rather than duplicate the code or have classes grabbing at other
+classes.
+
+=head1 FUNCTIONS
+
+All functions can be imported only on request.
+
+=head3 fatal
+
+    fatal(@message);
+
+Display a message and exit with a fatal error code.
+
+=cut
+
+# Note: not certain why this is in use instead of die.  Probably because
+# the exit code of die is 255?  Doesn't appear to be used consistently.
 sub fatal (@) { print STDERR "@_\n"; exit 1 }
 
+
+=head3 can_compress
+
+    my $can_compress = can_compress;
+
+Returns true if Compress::Zlib is available, false otherwise.
+
+=cut
+
 my $can_compress;
 sub can_compress {
     return $can_compress if defined $can_compress;
@@ -16,4 +60,81 @@ sub can_compress {
     return $can_compress = eval { require Compress::Zlib; } ? 1 : 0;
 }
 
+
+=head3 canonicalize_path
+
+    my $canonical_path = canonicalize_path($path);
+
+Cleans up the $path so its acceptable to the SVN API's strict idea of
+what a path should look like.
+
+=cut
+
+# SVN 1.7 changed its canonicalization API, so decide on the best way to handle the problem.
+sub canonicalize_path {
+    my $path = shift;
+
+    # The 1.7 way to do it
+    if( defined &SVN::_Core::svn_dirent_canonicalize ) {
+        return SVN::_Core::svn_dirent_canonicalize($path);
+    }
+    # The 1.6 way to do it
+    elsif( defined &SVN::_Core::svn_path_canonicalize ) {
+        return SVN::_Core::svn_path_canonicalize($path);
+    }
+    # No SVN API canonicalization is available, do it ourselves
+    else {
+        return _canonicalize_path_ourselves($path);
+    }
+}
+
+
+# A pure Perl implementation of path canonicalization in case the SVN
+# API ones are unavailable.
+sub _canonicalize_path_ourselves {
+	my ($path) = @_;
+	my $dot_slash_added = 0;
+	if (substr($path, 0, 1) ne "/") {
+		$path = "./" . $path;
+		$dot_slash_added = 1;
+	}
+	# File::Spec->canonpath doesn't collapse x/../y into y (for a
+	# good reason), so let's do this manually.
+	$path =~ s#/+#/#g;
+	$path =~ s#/\.(?:/|$)#/#g;
+	$path =~ s#/[^/]+/\.\.##g;
+	$path =~ s#/$##g;
+	$path =~ s#^\./## if $dot_slash_added;
+	$path =~ s#^/##;
+	$path =~ s#^\.$##;
+	return $path;
+}
+
+
+=head3 canonicalize_url
+
+    my $canonical_url = canonicalize_url($url);
+
+Cleans up the $url so its acceptable to the SVN API's strict idea of
+what a URL should look like.
+
+=cut
+
+# SVN 1.7 changed its canonicalization API, so decide on the best way to handle the problem.
+sub canonicalize_url {
+    my $url = shift;
+
+    # The 1.7 way to do it
+    if( defined &SVN::_Core::svn_uri_canonicalize ) {
+        $url = SVN::_Core::svn_uri_canonicalize($url);
+    }
+    # There wasn't a 1.6 way to do it, so we do it ourself.
+    else {
+        $url =~ s#^([^:]+://[^/]*/)(.*)$#$1 . canonicalize_path($2)#e;
+    }
+
+    return $url;
+}
+
+
 1;
diff --git a/t/Git-SVN/Ra/fix_dir.t b/t/Git-SVN/Ra/fix_dir.t
new file mode 100644
index 00000000..ec0c858a
--- /dev/null
+++ b/t/Git-SVN/Ra/fix_dir.t
@@ -0,0 +1,24 @@
+#!/usr/bin/env perl
+
+use strict;
+use warnings;
+
+use Test::More 'no_plan';
+
+my $CLASS = 'Git::SVN::Ra';
+require_ok $CLASS;
+
+my %tests = (
+    "foo/.."            => "",
+    "/foo/.."           => "/",
+    "."                 => "",
+    "foo/../"           => "",
+    "foo/bar/baz"       => "foo/bar/baz",
+    "foo/bar/../baz"    => "foo/baz",
+);
+
+for my $have (keys %tests) {
+    my $want = $tests{$have};
+
+    is $CLASS->fix_dir($have), $want, "fix_dir $have => $want";
+}
diff --git a/t/lib-git-svn.sh b/t/lib-git-svn.sh
index 199f22c2..c6911cdf 100644
--- a/t/lib-git-svn.sh
+++ b/t/lib-git-svn.sh
@@ -68,6 +68,8 @@ svn_cmd () {
 	svn "$orig_svncmd" --config-dir "$svnconf" "$@"
 }
 
+SVN_VERSION=`svn_cmd --version | sed -n -e 's/^svn, version \(1\.[0-9]*\.[0-9]*\).*$/\1/p'`
+
 prepare_httpd () {
 	for d in \
 		"$SVN_HTTPD_PATH" \
diff --git a/t/t9118-git-svn-funky-branch-names.sh b/t/t9118-git-svn-funky-branch-names.sh
index 63fc982c..66787acc 100755
--- a/t/t9118-git-svn-funky-branch-names.sh
+++ b/t/t9118-git-svn-funky-branch-names.sh
@@ -32,6 +32,10 @@ test_expect_success 'setup svnrepo' '
 	start_httpd
 	'
 
+# SVN 1.7 will truncate "not-a%40{0]" to just "not-a".
+# Look at what SVN wound up naming the branch and use that.
+non_reflog=`svn_cmd ls "$svnrepo/pr ject/branches" | grep not-a | sed 's/\///'`
+
 test_expect_success 'test clone with funky branch names' '
 	git svn clone -s "$svnrepo/pr ject" project &&
 	(
@@ -42,7 +46,7 @@ test_expect_success 'test clone with funky branch names' '
 		git rev-parse "refs/remotes/%2Eleading_dot" &&
 		git rev-parse "refs/remotes/trailing_dot%2E" &&
 		git rev-parse "refs/remotes/trailing_dotlock%2Elock" &&
-		git rev-parse "refs/remotes/not-a%40{0}reflog"
+		git rev-parse "refs/remotes/$non_reflog"
 	)
 	'
 
diff --git a/t/t9119-git-svn-info.sh b/t/t9119-git-svn-info.sh
index ff19695e..9b6075c6 100755
--- a/t/t9119-git-svn-info.sh
+++ b/t/t9119-git-svn-info.sh
@@ -6,14 +6,11 @@ test_description='git svn info'
 
 . ./lib-git-svn.sh
 
-# Tested with: svn, version 1.4.4 (r25188)
-# Tested with: svn, version 1.6.[12345689]
-v=`svn_cmd --version | sed -n -e 's/^svn, version \(1\.[0-9]*\.[0-9]*\).*$/\1/p'`
-case $v in
+case $SVN_VERSION in
 1.[456].*)
 	;;
 *)
-	skip_all="skipping svn-info test (SVN version: $v not supported)"
+	skip_all="skipping svn-info test (SVN version: $SVN_VERSION not supported)"
 	test_done
 	;;
 esac

^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 23:31     ` Jonathan Nieder
@ 2012-07-24 23:45       ` Junio C Hamano
  2012-07-25  1:00         ` Michael G Schwern
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2012-07-24 23:45 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Michael G Schwern, git, Robin H. Johnson, Eric Wong, Ben Walton

Jonathan Nieder <jrnieder@gmail.com> writes:

> Michael G Schwern wrote:
>
>>   git log -p schwern/git-svn/extract-classes..schwern/git-svn/fix-canonical
>>
>> That should give you the information you need...
>
> I guess so.  May we have your sign-off on these changes?  (A simple
> reply of "yes" is enough, no need to resend patches to do this.)
>
> Here it is in patch form for reviewers.  If I understand correctly,
> the idea is to replace accesses to $gs->{path} with calls to a
> $gs->path function that canonicalizes (and likewise for s/path/url/).
>
> There are probably other subtleties, but that seems to be the gist.

The impression I am getting is that the updated code wants to handle
URL and paths without any funny encoding, but it is unclear from my
cursory read (e.g. what goes on with escape_url?).

>  	if ($old_url =~ m#^svn(\+ssh)?://# ||
>  	    ($full_url =~ m#^https?://# &&
> -	     escape_url($full_url) ne $full_url)) {
> +	     $full_url ne $full_url)) {

How can the latter part of this conditional be true?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-24 23:45       ` Junio C Hamano
@ 2012-07-25  1:00         ` Michael G Schwern
  2012-07-25  4:53           ` Jonathan Nieder
  0 siblings, 1 reply; 14+ messages in thread
From: Michael G Schwern @ 2012-07-25  1:00 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, git, Robin H. Johnson, Eric Wong, Ben Walton

On 2012.7.24 4:45 PM, Junio C Hamano wrote:
>>>   git log -p schwern/git-svn/extract-classes..schwern/git-svn/fix-canonical
>>>
>>> That should give you the information you need...
>>
>> I guess so.  May we have your sign-off on these changes?  (A simple
>> reply of "yes" is enough, no need to resend patches to do this.)
>>
>> Here it is in patch form for reviewers.  If I understand correctly,
>> the idea is to replace accesses to $gs->{path} with calls to a
>> $gs->path function that canonicalizes (and likewise for s/path/url/).
>>
>> There are probably other subtleties, but that seems to be the gist.
> 
> The impression I am getting is that the updated code wants to handle
> URL and paths without any funny encoding, but it is unclear from my
> cursory read (e.g. what goes on with escape_url?).

No, now it's just canonicalizing as early as possible.  Preferably within the
object accessor rather than at the point of use.  So in the code below,
$full_url is already escaped/canonicalized.

In general this blob patch isn't going to make a lot of overall sense.  I'm
working with Jonathan to get it submitted in manageable pieces.


>>  	if ($old_url =~ m#^svn(\+ssh)?://# ||
>>  	    ($full_url =~ m#^https?://# &&
>> -	     escape_url($full_url) ne $full_url)) {
>> +	     $full_url ne $full_url)) {
> 
> How can the latter part of this conditional be true?

Good point.  More importantly, what was it trying to accomplish before and
does it need to be preserved?

"If the URL is svn OR its http and needs to be escaped... do something
special".  I don't really understand what the special stuff in the following
block is.  Anything that undef's the invocant (ie. $self) is probably broken.

a51cdb0c0420ee3bef26bbd1a9aa75e1d464e5b7 and
2a679c7a3148978a3f58f1c12100383638e744c5 shed some light.  2a679 looks like it
specifically holds off on escaping $full_url.  It would be very nice if that
was not necessary.  It would be helpful if the bug mentioned in 2a679 could be
reproduced to see if it still applies or can be dealt with in another way.


-- 
185. My name is not a killing word.
    -- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
           http://skippyslist.com/list/

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-25  1:00         ` Michael G Schwern
@ 2012-07-25  4:53           ` Jonathan Nieder
  2012-07-25  5:36             ` Michael G Schwern
  0 siblings, 1 reply; 14+ messages in thread
From: Jonathan Nieder @ 2012-07-25  4:53 UTC (permalink / raw)
  To: Michael G Schwern
  Cc: Junio C Hamano, git, Robin H. Johnson, Eric Wong, Ben Walton

Michael G Schwern wrote:

> No, now it's just canonicalizing as early as possible.  Preferably within the
> object accessor rather than at the point of use.  So in the code below,
> $full_url is already escaped/canonicalized.

Let's start with this.

Is svn_path_canonicalize() idempotent?  What does it do when it
encounters a percent-sign?

Thanks,
Jonathan

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-25  4:53           ` Jonathan Nieder
@ 2012-07-25  5:36             ` Michael G Schwern
  2012-07-25  7:14               ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Michael G Schwern @ 2012-07-25  5:36 UTC (permalink / raw)
  To: Jonathan Nieder
  Cc: Junio C Hamano, git, Robin H. Johnson, Eric Wong, Ben Walton

On 2012.7.24 9:53 PM, Jonathan Nieder wrote:
> Michael G Schwern wrote:
> 
>> No, now it's just canonicalizing as early as possible.  Preferably within the
>> object accessor rather than at the point of use.  So in the code below,
>> $full_url is already escaped/canonicalized.
> 
> Let's start with this.
> 
> Is svn_path_canonicalize() idempotent?  What does it do when it
> encounters a percent-sign?

Nothing, because paths are not URI escaped. :)

You probably meant svn_uri_canonicalize().  And no, it does not double escape,
so its safe to escape as early as possible.

   use SVN::Core;

   my $uri = "http://www.example.com/ foo";

    print SVN::_Core::svn_uri_canonicalize(
        SVN::_Core::svn_uri_canonicalize($uri)
    );

That produces "http://www.example.com/%20foo".

The API docs don't say it specifically, but if it were otherwise it would be
impossible to use.  You'd have to check first if anything were escaped before
canonicalizing.  And a user couldn't pass in an escaped URL without risking it
being double escaped.

http://subversion.apache.org/docs/api/latest/svn__dirent__uri_8h.html#a8bae33a2fbf86857869f7b0e39a553e7


-- 
'All anyone gets in a mirror is themselves,' she said. 'But what you
gets in a good gumbo is everything.'
    -- "Witches Abroad" by Terry Prachett

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-25  5:36             ` Michael G Schwern
@ 2012-07-25  7:14               ` Junio C Hamano
  2012-07-25  9:53                 ` Michael G Schwern
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2012-07-25  7:14 UTC (permalink / raw)
  To: Michael G Schwern
  Cc: Jonathan Nieder, git, Robin H. Johnson, Eric Wong, Ben Walton

Michael G Schwern <schwern@pobox.com> writes:

> On 2012.7.24 9:53 PM, Jonathan Nieder wrote:
>> Michael G Schwern wrote:
>> 
>>> No, now it's just canonicalizing as early as possible.  Preferably within the
>>> object accessor rather than at the point of use.  So in the code below,
>>> $full_url is already escaped/canonicalized.
>> 
>> Let's start with this.
>> 
>> Is svn_path_canonicalize() idempotent?  What does it do when it
>> encounters a percent-sign?
>
> Nothing, because paths are not URI escaped. :)
>
> You probably meant svn_uri_canonicalize().  And no, it does not double escape,
> so its safe to escape as early as possible.

Are you saying that the function assumes that a local pathname would
not have '%' in it, returns its input as-is when it sees one, and if
the caller really needs to express a path with '%' in it, it is the
responsibility of the caller to escape it?

That makes it even more confusing....

>    my $uri = "http://www.example.com/ foo";
>
>     print SVN::_Core::svn_uri_canonicalize(
>         SVN::_Core::svn_uri_canonicalize($uri)
>     );
>
> That produces "http://www.example.com/%20foo".

In other words, if your DocumentRoot was /var/www and you have a
directory /var/www/per%cent you want to expose to the outside world,
you have to say "http://www.example.com/per%25cent" yourself and the
"canonicalize" function will be an identity function?

I have this vague suspicion that Jonathan was asking about what your
Git::SVN::Utils::canonicalize_path() sub does, so all of the above
might be moot, though...

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: git-svn SVN 1.7 fix, take 2
  2012-07-25  7:14               ` Junio C Hamano
@ 2012-07-25  9:53                 ` Michael G Schwern
  0 siblings, 0 replies; 14+ messages in thread
From: Michael G Schwern @ 2012-07-25  9:53 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Jonathan Nieder, git, Robin H. Johnson, Eric Wong, Ben Walton

On 2012.7.25 12:14 AM, Junio C Hamano wrote:
>> Nothing, because paths are not URI escaped. :)
>>
>> You probably meant svn_uri_canonicalize().  And no, it does not double escape,
>> so its safe to escape as early as possible.
> 
> Are you saying that the function assumes that a local pathname would
                                                        ^^^^^^^^

URI and path canonicalization are done differently and by different functions.
 svn_uri_canonicalize() vs svn_dirent_canonicalize().  Or maybe you're
referring to the path portion of the URL?  I don't think that makes a
difference for what you're asking, but its important to keep in mind.


> not have '%' in it, returns its input as-is when it sees one, and if
> the caller really needs to express a path with '%' in it, it is the
> responsibility of the caller to escape it?

It appears that if the % is followed by hex it assumes its an escape.
Otherwise it escapes it.  Thus...

   http://www.google.com/per%%nt -> http://www.google.com/per%25%25nt
   http://www.google.com/per%ant -> http://www.google.com/per%25ant
   http://www.google.com/per%cent -> http://www.google.com/per%CEnt

Which makes sense if the idea is to not double escape.


> That makes it even more confusing....

Straight out of the RFC.
http://pretty-rfc.herokuapp.com/RFC3986#when-to-percent-encode

    Implementations must not percent-encode or decode the same string more
    than once, as decoding an already decoded string might lead to
    misinterpreting a percent data octet as the beginning of a percent-
    encoding, or vice versa in the case of percent-encoding an already
    percent-encoded string.

It makes it far simpler to use.  You can't read the mind of the user, but its
a fair guess that they're not really thinking too deeply about how escaping
works.  It makes URI and path canonicalization safer and simpler.  Otherwise
you'd need to keep track of whether a thing was already escaped or not!  Just
begging for loads of bugs.  (If SVN were using URI and path objects they'd
just take care of it and none of this would be a problem in the first place).

This way you have no double escaping concerns.  No need to track if a thing is
already canonicalized.  Do it as often and as early as you like.  Making a
corner case a little harder is a small price to pay for making the common case
much, much easier.

This also appears to be what Firefox does.


>>    my $uri = "http://www.example.com/ foo";
>>
>>     print SVN::_Core::svn_uri_canonicalize(
>>         SVN::_Core::svn_uri_canonicalize($uri)
>>     );
>>
>> That produces "http://www.example.com/%20foo".
> 
> In other words, if your DocumentRoot was /var/www and you have a
> directory /var/www/per%cent you want to expose to the outside world,
> you have to say "http://www.example.com/per%25cent" yourself and the
> "canonicalize" function will be an identity function?

Yes.  It can be made to work better.

There's a number of places in the code which effectively do this:

    my $full_url = $url . '/' . $path;

And I was canonicalizing them like this:

    my $full_url = canonicalize_url($url . '/' . $path);

I'd been pondering whether it would be worthwhile to have a function which
added a path to a base URL and canonicalized.  Now I see that yes, it would be
to deal with this corner case.

    my $full_url = append_path_to_url($url, $path);

That would properly URI encode any % in $path before appending and then
canonicalizing the whole thing as a URI.

I'm pretty sure the code in master doesn't handle this at all.


> I have this vague suspicion that Jonathan was asking about what your
> Git::SVN::Utils::canonicalize_path() sub does, so all of the above
> might be moot, though...

Its just a pass through to the SVN API.


-- 
44. I am not the atheist chaplain.
    -- The 213 Things Skippy Is No Longer Allowed To Do In The U.S. Army
           http://skippyslist.com/list/

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2012-07-25  9:58 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-07-24 21:46 git-svn SVN 1.7 fix, take 2 Michael G Schwern
2012-07-24 21:51 ` Junio C Hamano
2012-07-24 23:06   ` Michael G Schwern
2012-07-24 23:12     ` Jonathan Nieder
2012-07-24 22:02 ` Jonathan Nieder
2012-07-24 22:50   ` Michael G Schwern
2012-07-24 23:03     ` Jonathan Nieder
2012-07-24 23:31     ` Jonathan Nieder
2012-07-24 23:45       ` Junio C Hamano
2012-07-25  1:00         ` Michael G Schwern
2012-07-25  4:53           ` Jonathan Nieder
2012-07-25  5:36             ` Michael G Schwern
2012-07-25  7:14               ` Junio C Hamano
2012-07-25  9:53                 ` Michael G Schwern

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.