All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHv2] submodule: Port resolve_relative_url from shell to C
@ 2016-01-12 23:35 Stefan Beller
  2016-01-13  0:10 ` [PATCHv3] " Stefan Beller
  0 siblings, 1 reply; 4+ messages in thread
From: Stefan Beller @ 2016-01-12 23:35 UTC (permalink / raw)
  To: gitster, git; +Cc: peff, j6t, jens.lehmann, Stefan Beller

Later on we want to deprecate the `git submodule init` command and make
it implicit in other submodule commands. As these other commands are
written in C already, we'd need the init functionality in C, too.
The `resolve_relative_url` function is a major part of that init
functionality, so start by porting this function to C.

As I was porting the functionality I noticed some odds with the inputs.
To fully understand the situation I added some logging to the function
temporarily to capture all calls to the function throughout the test
suite. Duplicates have been removed and all unique testing inputs have
been recorded into t0060.

Signed-off-by: Stefan Beller <sbeller@google.com>
---

I tried considering an alternative implementation for `relative_url`,
which is not a direct translation of the shell code. There are some
advanced path/url functions, such as `normalize_path_copy`, however
using that function is not straightforward as it seems. The idea would
have been to use that on a concatenation of remoteurl and url, however
there are cases like ("foo/." "../.") to result in "foo/.", so we really
need to count the slashes ourselves.

This applies on top of current master (2.7.0).

Thanks,
Stefan

---
 builtin/submodule--helper.c | 191 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  81 +------------------
 t/t0060-path-utils.sh       |  41 ++++++++++
 3 files changed, 236 insertions(+), 77 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..db9d627 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -9,6 +9,195 @@
 #include "submodule-config.h"
 #include "string-list.h"
 #include "run-command.h"
+#include "remote.h"
+#include "refs.h"
+#include "connect.h"
+
+static char *get_default_remote(void)
+{
+	char *dest = NULL, *ret;
+	unsigned char sha1[20];
+	int flag;
+	struct strbuf sb = STRBUF_INIT;
+	const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
+
+	if (!refname)
+		die("No such ref: HEAD");
+
+	refname = shorten_unambiguous_ref(refname, 0);
+	strbuf_addf(&sb, "branch.%s.remote", refname);
+	if (git_config_get_string(sb.buf, &dest))
+		ret = xstrdup("origin");
+	else
+		ret = xstrdup(dest);
+
+	strbuf_release(&sb);
+	return ret;
+}
+
+static int starts_with_dot_slash(const char *str)
+{
+	return str[0] == '.' && is_dir_sep(str[1]);
+}
+
+static int starts_with_dot_dot_slash(const char *str)
+{
+	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
+}
+
+static char *last_dir_separator(char *str)
+{
+	char* p = str + strlen(str);
+	while (p-- != str)
+		if (is_dir_sep(*p))
+			return p;
+	return NULL;
+}
+
+/*
+ * The `url` argument is the URL that navigates to the submodule origin
+ * repo. When relative, this URL is relative to the superproject origin
+ * URL repo. The `up_path` argument, if specified, is the relative
+ * path that navigates from the submodule working tree to the superproject
+ * working tree. Returns the origin URL of the submodule.
+ *
+ * Return either an absolute URL or filesystem path (if the superproject
+ * origin URL is an absolute URL or filesystem path, respectively) or a
+ * relative file system path (if the superproject origin URL is a relative
+ * file system path).
+ *
+ * When the output is a relative file system path, the path is either
+ * relative to the submodule working tree, if up_path is specified, or to
+ * the superproject working tree otherwise.
+ */
+static char *relative_url(const char *remote_url,
+				const char *url,
+				const char *up_path)
+{
+	int is_relative = 0;
+	int colonsep = 0;
+	char *out;
+	char *remoteurl = xstrdup(remote_url);
+	struct strbuf sb = STRBUF_INIT;
+	size_t len;
+
+	len = strlen(remoteurl);
+	if (is_dir_sep(remoteurl[len]))
+		remoteurl[len] = '\0';
+
+	if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
+		is_relative = 0;
+	else {
+		is_relative = 1;
+
+		/* prepend a './' to ensure all relative remoteurls start with
+		 * './' or '../' */
+		if (!starts_with_dot_slash(remoteurl) &&
+		    !starts_with_dot_dot_slash(remoteurl)) {
+			strbuf_reset(&sb);
+			strbuf_addf(&sb, "./%s", remoteurl);
+			free(remoteurl);
+			remoteurl = strbuf_detach(&sb, NULL);
+		}
+	}
+	/* When the url starts with '../', remove that and the last directory
+	 * in remoteurl */
+	while (url) {
+		if (starts_with_dot_dot_slash(url)) {
+			char *rfind;
+			url += 3;
+
+			rfind = last_dir_separator(remoteurl);
+			if (rfind)
+				*rfind = '\0';
+			else {
+				rfind = strrchr(remoteurl, ':');
+				if (rfind) {
+					*rfind = '\0';
+					colonsep = 1;
+				} else {
+					if (is_relative || !strcmp(".", remoteurl))
+						die(N_("cannot strip one component off url '%s'"), remoteurl);
+					else
+						remoteurl = ".";
+				}
+			}
+		} else if (starts_with_dot_slash(url)) {
+			url += 2;
+		} else
+			break;
+	}
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
+
+	if (starts_with_dot_slash(sb.buf))
+		out = xstrdup(sb.buf + 2);
+	else
+		out = xstrdup(sb.buf);
+	strbuf_reset(&sb);
+
+	free(remoteurl);
+	if (!up_path || !is_relative)
+		return out;
+	else {
+		strbuf_addf(&sb, "%s%s", up_path, out);
+
+		free(out);
+		return strbuf_detach(&sb, NULL);
+	}
+}
+
+static int resolve_relative_url(int argc, const char **argv, const char *prefix)
+{
+	char *remoteurl = NULL;
+	char *remote = get_default_remote();
+	const char *up_path = NULL;
+	char *res;
+	const char *url;
+	struct strbuf sb = STRBUF_INIT;
+
+	if (argc != 2 && argc != 3)
+		die("BUG: resolve_relative_url only accepts one or two arguments");
+
+	url = argv[1];
+	strbuf_addf(&sb, "remote.%s.url", remote);
+	free(remote);
+
+	if (git_config_get_string(sb.buf, &remoteurl))
+		/* the repository is its own authoritative upstream */
+		remoteurl = xgetcwd();
+
+	if (argc == 3)
+		up_path = argv[2];
+
+	res = relative_url(remoteurl, url, up_path);
+	printf("%s\n", res);
+
+	free(res);
+	return 0;
+}
+
+static int resolve_relative_url_test(int argc, const char **argv, const char *prefix)
+{
+	char *remoteurl, *res;
+	const char *up_path, *url;
+
+	if (argc != 4)
+		die("BUG: resolve_relative_url only accepts three arguments: <up_path> <remoteurl> <url>");
+
+	up_path = argv[1];
+	remoteurl = xstrdup(argv[2]);
+	url = argv[3];
+
+	if (!strcmp(up_path, "(null)"))
+		up_path = NULL;
+
+	res = relative_url(remoteurl, url, up_path);
+	printf("%s\n", res);
+
+	free(res);
+	return 0;
+}
 
 struct module_list {
 	const struct cache_entry **entries;
@@ -264,6 +453,8 @@ static struct cmd_struct commands[] = {
 	{"list", module_list},
 	{"name", module_name},
 	{"clone", module_clone},
+	{"resolve-relative-url", resolve_relative_url},
+	{"resolve-relative-url-test", resolve_relative_url_test},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 9bc5c5f..3e409af 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -46,79 +46,6 @@ prefix=
 custom_name=
 depth=
 
-# The function takes at most 2 arguments. The first argument is the
-# URL that navigates to the submodule origin repo. When relative, this URL
-# is relative to the superproject origin URL repo. The second up_path
-# argument, if specified, is the relative path that navigates
-# from the submodule working tree to the superproject working tree.
-#
-# The output of the function is the origin URL of the submodule.
-#
-# The output will either be an absolute URL or filesystem path (if the
-# superproject origin URL is an absolute URL or filesystem path,
-# respectively) or a relative file system path (if the superproject
-# origin URL is a relative file system path).
-#
-# When the output is a relative file system path, the path is either
-# relative to the submodule working tree, if up_path is specified, or to
-# the superproject working tree otherwise.
-resolve_relative_url ()
-{
-	remote=$(get_default_remote)
-	remoteurl=$(git config "remote.$remote.url") ||
-		remoteurl=$(pwd) # the repository is its own authoritative upstream
-	url="$1"
-	remoteurl=${remoteurl%/}
-	sep=/
-	up_path="$2"
-
-	case "$remoteurl" in
-	*:*|/*)
-		is_relative=
-		;;
-	./*|../*)
-		is_relative=t
-		;;
-	*)
-		is_relative=t
-		remoteurl="./$remoteurl"
-		;;
-	esac
-
-	while test -n "$url"
-	do
-		case "$url" in
-		../*)
-			url="${url#../}"
-			case "$remoteurl" in
-			*/*)
-				remoteurl="${remoteurl%/*}"
-				;;
-			*:*)
-				remoteurl="${remoteurl%:*}"
-				sep=:
-				;;
-			*)
-				if test -z "$is_relative" || test "." = "$remoteurl"
-				then
-					die "$(eval_gettext "cannot strip one component off url '\$remoteurl'")"
-				else
-					remoteurl=.
-				fi
-				;;
-			esac
-			;;
-		./*)
-			url="${url#./}"
-			;;
-		*)
-			break;;
-		esac
-	done
-	remoteurl="$remoteurl$sep${url%/}"
-	echo "${is_relative:+${up_path}}${remoteurl#./}"
-}
-
 # Resolve a path to be relative to another path.  This is intended for
 # converting submodule paths when git-submodule is run in a subdirectory
 # and only handles paths where the directory separator is '/'.
@@ -281,7 +208,7 @@ cmd_add()
 		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
 
 		# dereference source url relative to parent's url
-		realrepo=$(resolve_relative_url "$repo") || exit
+		realrepo=$(git submodule--helper resolve-relative-url "$repo") || exit
 		;;
 	*:*|/*)
 		# absolute url
@@ -485,7 +412,7 @@ cmd_init()
 			# Possibly a url relative to parent
 			case "$url" in
 			./*|../*)
-				url=$(resolve_relative_url "$url") || exit
+				url=$(git submodule--helper resolve-relative-url "$url") || exit
 				;;
 			esac
 			git config submodule."$name".url "$url" ||
@@ -1190,9 +1117,9 @@ cmd_sync()
 			# guarantee a trailing /
 			up_path=${up_path%/}/ &&
 			# path from submodule work tree to submodule origin repo
-			sub_origin_url=$(resolve_relative_url "$url" "$up_path") &&
+			sub_origin_url=$(git submodule--helper resolve-relative-url "$url" "$up_path") &&
 			# path from superproject work tree to submodule origin repo
-			super_config_url=$(resolve_relative_url "$url") || exit
+			super_config_url=$(git submodule--helper resolve-relative-url "$url") || exit
 			;;
 		*)
 			sub_origin_url="$url"
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 627ef85..2ae1bbd 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -19,6 +19,12 @@ relative_path() {
 	"test \"\$(test-path-utils relative_path '$1' '$2')\" = '$expected'"
 }
 
+test_submodule_relative_url() {
+	expected="$4"
+	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" \
+	"test \"\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3')\" = '$expected'"
+}
+
 test_git_path() {
 	test_expect_success "git-path $1 $2 => $3" "
 		$1 git rev-parse --git-path $2 >actual &&
@@ -286,4 +292,39 @@ test_git_path GIT_COMMON_DIR=bar config                   bar/config
 test_git_path GIT_COMMON_DIR=bar packed-refs              bar/packed-refs
 test_git_path GIT_COMMON_DIR=bar shallow                  bar/shallow
 
+test_submodule_relative_url "(null)" "../foo/bar" "../bar/a/b/c" "../foo/bar/a/b/c"
+test_submodule_relative_url "../../../" "../foo/bar" "../bar/a/b/c" "../../../../foo/bar/a/b/c"
+test_submodule_relative_url "(null)" "../foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "../" "../foo/bar" "../submodule" "../../foo/submodule"
+test_submodule_relative_url "(null)" "../foo/submodule" "../submodule" "../foo/submodule"
+test_submodule_relative_url "../" "../foo/submodule" "../submodule" "../../foo/submodule"
+test_submodule_relative_url "(null)" "../foo" "../submodule" "../submodule"
+test_submodule_relative_url "../" "../foo" "../submodule" "../../submodule"
+test_submodule_relative_url "(null)" "./foo/bar" "../submodule" "foo/submodule"
+test_submodule_relative_url "../" "./foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "(null)" "./foo" "../submodule" "submodule"
+test_submodule_relative_url "../" "./foo" "../submodule" "../submodule"
+test_submodule_relative_url "(null)" "//somewhere else/repo" "../subrepo" "//somewhere else/subrepo"
+test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/subsuper_update_r" "../subsubsuper_update_r" "/u//trash directory.t7406-submodule-update/subsubsuper_update_r"
+test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/super_update_r2" "../subsuper_update_r" "/u//trash directory.t7406-submodule-update/subsuper_update_r"
+test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm/." "../." "/u/trash directory.t3600-rm/."
+test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm" "./." "/u/trash directory.t3600-rm/."
+test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
+test_submodule_relative_url "../" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
+test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic" "./å äö" "/u/trash directory.t7400-submodule-basic/å äö"
+test_submodule_relative_url "(null)" "/u/trash directory.t7403-submodule-sync/." "../submodule" "/u/trash directory.t7403-submodule-sync/submodule"
+test_submodule_relative_url "(null)" "/u/trash directory.t7407-submodule-foreach/submodule" "../submodule" "/u/trash directory.t7407-submodule-foreach/submodule"
+test_submodule_relative_url "(null)" "/u/trash directory.t7409-submodule-detached-worktree/home2/../remote" "../bundle1" "/u/trash directory.t7409-submodule-detached-worktree/home2/../bundle1"
+test_submodule_relative_url "(null)" "/u/trash directory.t7613-merge-submodule/submodule_update_repo" "./." "/u/trash directory.t7613-merge-submodule/submodule_update_repo/."
+test_submodule_relative_url "(null)" "file:///tmp/repo" "../subrepo" "file:///tmp/subrepo"
+test_submodule_relative_url "(null)" "foo/bar" "../submodule" "foo/submodule"
+test_submodule_relative_url "../" "foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "(null)" "foo" "../submodule" "submodule"
+test_submodule_relative_url "../" "foo" "../submodule" "../submodule"
+test_submodule_relative_url "(null)" "helper:://hostname/repo" "../subrepo" "helper:://hostname/subrepo"
+test_submodule_relative_url "(null)" "ssh://hostname/repo" "../subrepo" "ssh://hostname/subrepo"
+test_submodule_relative_url "(null)" "ssh://hostname:22/repo" "../subrepo" "ssh://hostname:22/subrepo"
+test_submodule_relative_url "(null)" "user@host:path/to/repo" "../subrepo" "user@host:path/to/subrepo"
+test_submodule_relative_url "(null)" "user@host:repo" "../subrepo" "user@host:subrepo"
+
 test_done
-- 
2.7.0.9.g9d9d16f.dirty

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCHv3] submodule: Port resolve_relative_url from shell to C
  2016-01-12 23:35 [PATCHv2] submodule: Port resolve_relative_url from shell to C Stefan Beller
@ 2016-01-13  0:10 ` Stefan Beller
  2016-01-13  7:42   ` Johannes Sixt
  0 siblings, 1 reply; 4+ messages in thread
From: Stefan Beller @ 2016-01-13  0:10 UTC (permalink / raw)
  To: gitster, git; +Cc: peff, j6t, jens.lehmann, Stefan Beller

Later on we want to deprecate the `git submodule init` command and make
it implicit in other submodule commands. As these other commands are
written in C already, we'd need the init functionality in C, too.
The `resolve_relative_url` function is a major part of that init
functionality, so start by porting this function to C.

As I was porting the functionality I noticed some odds with the inputs.
To fully understand the situation I added some logging to the function
temporarily to capture all calls to the function throughout the test
suite. Duplicates have been removed and all unique testing inputs have
been recorded into t0060.

Signed-off-by: Stefan Beller <sbeller@google.com>
---
I tried considering an alternative implementation for `relative_url`,
which is not a direct translation of the shell code. There are some
advanced path/url functions, such as `normalize_path_copy`, however
using that function is not straightforward as it seems. The idea would
have been to use that on a concatenation of remoteurl and url, however
there are cases like ("foo/." "../.") to result in "foo/.", so we really
need to count the slashes ourselves.

This applies on top of current master (2.7.0).

Thanks,
Stefan

Interdiff to v2 (sent half an hour ago):

        diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
        index db9d627..3dd8008 100644
        --- a/builtin/submodule--helper.c
        +++ b/builtin/submodule--helper.c
        @@ -90,8 +90,8 @@ static char *relative_url(const char *remote_url,
                else {
                        is_relative = 1;
         
        -               /* prepend a './' to ensure all relative remoteurls start with
        -                * './' or '../' */
        +               /* Prepend a './' to ensure all relative remoteurls start
        +                * with './' or '../'. */
                        if (!starts_with_dot_slash(remoteurl) &&
                            !starts_with_dot_dot_slash(remoteurl)) {
                                strbuf_reset(&sb);
        @@ -100,8 +100,8 @@ static char *relative_url(const char *remote_url,
                                remoteurl = strbuf_detach(&sb, NULL);
                        }
                }
        -       /* When the url starts with '../', remove that and the last directory
        -        * in remoteurl */
        +       /* When the url starts with '../', remove that and the
        +        * last directory in remoteurl. */
                while (url) {
                        if (starts_with_dot_dot_slash(url)) {
                                char *rfind;

---
 builtin/submodule--helper.c | 191 ++++++++++++++++++++++++++++++++++++++++++++
 git-submodule.sh            |  81 +------------------
 t/t0060-path-utils.sh       |  41 ++++++++++
 3 files changed, 236 insertions(+), 77 deletions(-)

diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index f4c3eff..3dd8008 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -9,6 +9,195 @@
 #include "submodule-config.h"
 #include "string-list.h"
 #include "run-command.h"
+#include "remote.h"
+#include "refs.h"
+#include "connect.h"
+
+static char *get_default_remote(void)
+{
+	char *dest = NULL, *ret;
+	unsigned char sha1[20];
+	int flag;
+	struct strbuf sb = STRBUF_INIT;
+	const char *refname = resolve_ref_unsafe("HEAD", 0, sha1, &flag);
+
+	if (!refname)
+		die("No such ref: HEAD");
+
+	refname = shorten_unambiguous_ref(refname, 0);
+	strbuf_addf(&sb, "branch.%s.remote", refname);
+	if (git_config_get_string(sb.buf, &dest))
+		ret = xstrdup("origin");
+	else
+		ret = xstrdup(dest);
+
+	strbuf_release(&sb);
+	return ret;
+}
+
+static int starts_with_dot_slash(const char *str)
+{
+	return str[0] == '.' && is_dir_sep(str[1]);
+}
+
+static int starts_with_dot_dot_slash(const char *str)
+{
+	return str[0] == '.' && str[1] == '.' && is_dir_sep(str[2]);
+}
+
+static char *last_dir_separator(char *str)
+{
+	char* p = str + strlen(str);
+	while (p-- != str)
+		if (is_dir_sep(*p))
+			return p;
+	return NULL;
+}
+
+/*
+ * The `url` argument is the URL that navigates to the submodule origin
+ * repo. When relative, this URL is relative to the superproject origin
+ * URL repo. The `up_path` argument, if specified, is the relative
+ * path that navigates from the submodule working tree to the superproject
+ * working tree. Returns the origin URL of the submodule.
+ *
+ * Return either an absolute URL or filesystem path (if the superproject
+ * origin URL is an absolute URL or filesystem path, respectively) or a
+ * relative file system path (if the superproject origin URL is a relative
+ * file system path).
+ *
+ * When the output is a relative file system path, the path is either
+ * relative to the submodule working tree, if up_path is specified, or to
+ * the superproject working tree otherwise.
+ */
+static char *relative_url(const char *remote_url,
+				const char *url,
+				const char *up_path)
+{
+	int is_relative = 0;
+	int colonsep = 0;
+	char *out;
+	char *remoteurl = xstrdup(remote_url);
+	struct strbuf sb = STRBUF_INIT;
+	size_t len;
+
+	len = strlen(remoteurl);
+	if (is_dir_sep(remoteurl[len]))
+		remoteurl[len] = '\0';
+
+	if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
+		is_relative = 0;
+	else {
+		is_relative = 1;
+
+		/* Prepend a './' to ensure all relative remoteurls start
+		 * with './' or '../'. */
+		if (!starts_with_dot_slash(remoteurl) &&
+		    !starts_with_dot_dot_slash(remoteurl)) {
+			strbuf_reset(&sb);
+			strbuf_addf(&sb, "./%s", remoteurl);
+			free(remoteurl);
+			remoteurl = strbuf_detach(&sb, NULL);
+		}
+	}
+	/* When the url starts with '../', remove that and the
+	 * last directory in remoteurl. */
+	while (url) {
+		if (starts_with_dot_dot_slash(url)) {
+			char *rfind;
+			url += 3;
+
+			rfind = last_dir_separator(remoteurl);
+			if (rfind)
+				*rfind = '\0';
+			else {
+				rfind = strrchr(remoteurl, ':');
+				if (rfind) {
+					*rfind = '\0';
+					colonsep = 1;
+				} else {
+					if (is_relative || !strcmp(".", remoteurl))
+						die(N_("cannot strip one component off url '%s'"), remoteurl);
+					else
+						remoteurl = ".";
+				}
+			}
+		} else if (starts_with_dot_slash(url)) {
+			url += 2;
+		} else
+			break;
+	}
+	strbuf_reset(&sb);
+	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
+
+	if (starts_with_dot_slash(sb.buf))
+		out = xstrdup(sb.buf + 2);
+	else
+		out = xstrdup(sb.buf);
+	strbuf_reset(&sb);
+
+	free(remoteurl);
+	if (!up_path || !is_relative)
+		return out;
+	else {
+		strbuf_addf(&sb, "%s%s", up_path, out);
+
+		free(out);
+		return strbuf_detach(&sb, NULL);
+	}
+}
+
+static int resolve_relative_url(int argc, const char **argv, const char *prefix)
+{
+	char *remoteurl = NULL;
+	char *remote = get_default_remote();
+	const char *up_path = NULL;
+	char *res;
+	const char *url;
+	struct strbuf sb = STRBUF_INIT;
+
+	if (argc != 2 && argc != 3)
+		die("BUG: resolve_relative_url only accepts one or two arguments");
+
+	url = argv[1];
+	strbuf_addf(&sb, "remote.%s.url", remote);
+	free(remote);
+
+	if (git_config_get_string(sb.buf, &remoteurl))
+		/* the repository is its own authoritative upstream */
+		remoteurl = xgetcwd();
+
+	if (argc == 3)
+		up_path = argv[2];
+
+	res = relative_url(remoteurl, url, up_path);
+	printf("%s\n", res);
+
+	free(res);
+	return 0;
+}
+
+static int resolve_relative_url_test(int argc, const char **argv, const char *prefix)
+{
+	char *remoteurl, *res;
+	const char *up_path, *url;
+
+	if (argc != 4)
+		die("BUG: resolve_relative_url only accepts three arguments: <up_path> <remoteurl> <url>");
+
+	up_path = argv[1];
+	remoteurl = xstrdup(argv[2]);
+	url = argv[3];
+
+	if (!strcmp(up_path, "(null)"))
+		up_path = NULL;
+
+	res = relative_url(remoteurl, url, up_path);
+	printf("%s\n", res);
+
+	free(res);
+	return 0;
+}
 
 struct module_list {
 	const struct cache_entry **entries;
@@ -264,6 +453,8 @@ static struct cmd_struct commands[] = {
 	{"list", module_list},
 	{"name", module_name},
 	{"clone", module_clone},
+	{"resolve-relative-url", resolve_relative_url},
+	{"resolve-relative-url-test", resolve_relative_url_test},
 };
 
 int cmd_submodule__helper(int argc, const char **argv, const char *prefix)
diff --git a/git-submodule.sh b/git-submodule.sh
index 9bc5c5f..3e409af 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -46,79 +46,6 @@ prefix=
 custom_name=
 depth=
 
-# The function takes at most 2 arguments. The first argument is the
-# URL that navigates to the submodule origin repo. When relative, this URL
-# is relative to the superproject origin URL repo. The second up_path
-# argument, if specified, is the relative path that navigates
-# from the submodule working tree to the superproject working tree.
-#
-# The output of the function is the origin URL of the submodule.
-#
-# The output will either be an absolute URL or filesystem path (if the
-# superproject origin URL is an absolute URL or filesystem path,
-# respectively) or a relative file system path (if the superproject
-# origin URL is a relative file system path).
-#
-# When the output is a relative file system path, the path is either
-# relative to the submodule working tree, if up_path is specified, or to
-# the superproject working tree otherwise.
-resolve_relative_url ()
-{
-	remote=$(get_default_remote)
-	remoteurl=$(git config "remote.$remote.url") ||
-		remoteurl=$(pwd) # the repository is its own authoritative upstream
-	url="$1"
-	remoteurl=${remoteurl%/}
-	sep=/
-	up_path="$2"
-
-	case "$remoteurl" in
-	*:*|/*)
-		is_relative=
-		;;
-	./*|../*)
-		is_relative=t
-		;;
-	*)
-		is_relative=t
-		remoteurl="./$remoteurl"
-		;;
-	esac
-
-	while test -n "$url"
-	do
-		case "$url" in
-		../*)
-			url="${url#../}"
-			case "$remoteurl" in
-			*/*)
-				remoteurl="${remoteurl%/*}"
-				;;
-			*:*)
-				remoteurl="${remoteurl%:*}"
-				sep=:
-				;;
-			*)
-				if test -z "$is_relative" || test "." = "$remoteurl"
-				then
-					die "$(eval_gettext "cannot strip one component off url '\$remoteurl'")"
-				else
-					remoteurl=.
-				fi
-				;;
-			esac
-			;;
-		./*)
-			url="${url#./}"
-			;;
-		*)
-			break;;
-		esac
-	done
-	remoteurl="$remoteurl$sep${url%/}"
-	echo "${is_relative:+${up_path}}${remoteurl#./}"
-}
-
 # Resolve a path to be relative to another path.  This is intended for
 # converting submodule paths when git-submodule is run in a subdirectory
 # and only handles paths where the directory separator is '/'.
@@ -281,7 +208,7 @@ cmd_add()
 		die "$(gettext "Relative path can only be used from the toplevel of the working tree")"
 
 		# dereference source url relative to parent's url
-		realrepo=$(resolve_relative_url "$repo") || exit
+		realrepo=$(git submodule--helper resolve-relative-url "$repo") || exit
 		;;
 	*:*|/*)
 		# absolute url
@@ -485,7 +412,7 @@ cmd_init()
 			# Possibly a url relative to parent
 			case "$url" in
 			./*|../*)
-				url=$(resolve_relative_url "$url") || exit
+				url=$(git submodule--helper resolve-relative-url "$url") || exit
 				;;
 			esac
 			git config submodule."$name".url "$url" ||
@@ -1190,9 +1117,9 @@ cmd_sync()
 			# guarantee a trailing /
 			up_path=${up_path%/}/ &&
 			# path from submodule work tree to submodule origin repo
-			sub_origin_url=$(resolve_relative_url "$url" "$up_path") &&
+			sub_origin_url=$(git submodule--helper resolve-relative-url "$url" "$up_path") &&
 			# path from superproject work tree to submodule origin repo
-			super_config_url=$(resolve_relative_url "$url") || exit
+			super_config_url=$(git submodule--helper resolve-relative-url "$url") || exit
 			;;
 		*)
 			sub_origin_url="$url"
diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
index 627ef85..2ae1bbd 100755
--- a/t/t0060-path-utils.sh
+++ b/t/t0060-path-utils.sh
@@ -19,6 +19,12 @@ relative_path() {
 	"test \"\$(test-path-utils relative_path '$1' '$2')\" = '$expected'"
 }
 
+test_submodule_relative_url() {
+	expected="$4"
+	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" \
+	"test \"\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3')\" = '$expected'"
+}
+
 test_git_path() {
 	test_expect_success "git-path $1 $2 => $3" "
 		$1 git rev-parse --git-path $2 >actual &&
@@ -286,4 +292,39 @@ test_git_path GIT_COMMON_DIR=bar config                   bar/config
 test_git_path GIT_COMMON_DIR=bar packed-refs              bar/packed-refs
 test_git_path GIT_COMMON_DIR=bar shallow                  bar/shallow
 
+test_submodule_relative_url "(null)" "../foo/bar" "../bar/a/b/c" "../foo/bar/a/b/c"
+test_submodule_relative_url "../../../" "../foo/bar" "../bar/a/b/c" "../../../../foo/bar/a/b/c"
+test_submodule_relative_url "(null)" "../foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "../" "../foo/bar" "../submodule" "../../foo/submodule"
+test_submodule_relative_url "(null)" "../foo/submodule" "../submodule" "../foo/submodule"
+test_submodule_relative_url "../" "../foo/submodule" "../submodule" "../../foo/submodule"
+test_submodule_relative_url "(null)" "../foo" "../submodule" "../submodule"
+test_submodule_relative_url "../" "../foo" "../submodule" "../../submodule"
+test_submodule_relative_url "(null)" "./foo/bar" "../submodule" "foo/submodule"
+test_submodule_relative_url "../" "./foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "(null)" "./foo" "../submodule" "submodule"
+test_submodule_relative_url "../" "./foo" "../submodule" "../submodule"
+test_submodule_relative_url "(null)" "//somewhere else/repo" "../subrepo" "//somewhere else/subrepo"
+test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/subsuper_update_r" "../subsubsuper_update_r" "/u//trash directory.t7406-submodule-update/subsubsuper_update_r"
+test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/super_update_r2" "../subsuper_update_r" "/u//trash directory.t7406-submodule-update/subsuper_update_r"
+test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm/." "../." "/u/trash directory.t3600-rm/."
+test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm" "./." "/u/trash directory.t3600-rm/."
+test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
+test_submodule_relative_url "../" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
+test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic" "./å äö" "/u/trash directory.t7400-submodule-basic/å äö"
+test_submodule_relative_url "(null)" "/u/trash directory.t7403-submodule-sync/." "../submodule" "/u/trash directory.t7403-submodule-sync/submodule"
+test_submodule_relative_url "(null)" "/u/trash directory.t7407-submodule-foreach/submodule" "../submodule" "/u/trash directory.t7407-submodule-foreach/submodule"
+test_submodule_relative_url "(null)" "/u/trash directory.t7409-submodule-detached-worktree/home2/../remote" "../bundle1" "/u/trash directory.t7409-submodule-detached-worktree/home2/../bundle1"
+test_submodule_relative_url "(null)" "/u/trash directory.t7613-merge-submodule/submodule_update_repo" "./." "/u/trash directory.t7613-merge-submodule/submodule_update_repo/."
+test_submodule_relative_url "(null)" "file:///tmp/repo" "../subrepo" "file:///tmp/subrepo"
+test_submodule_relative_url "(null)" "foo/bar" "../submodule" "foo/submodule"
+test_submodule_relative_url "../" "foo/bar" "../submodule" "../foo/submodule"
+test_submodule_relative_url "(null)" "foo" "../submodule" "submodule"
+test_submodule_relative_url "../" "foo" "../submodule" "../submodule"
+test_submodule_relative_url "(null)" "helper:://hostname/repo" "../subrepo" "helper:://hostname/subrepo"
+test_submodule_relative_url "(null)" "ssh://hostname/repo" "../subrepo" "ssh://hostname/subrepo"
+test_submodule_relative_url "(null)" "ssh://hostname:22/repo" "../subrepo" "ssh://hostname:22/subrepo"
+test_submodule_relative_url "(null)" "user@host:path/to/repo" "../subrepo" "user@host:path/to/subrepo"
+test_submodule_relative_url "(null)" "user@host:repo" "../subrepo" "user@host:subrepo"
+
 test_done
-- 
2.7.0.9.g9d9d16f.dirty

^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCHv3] submodule: Port resolve_relative_url from shell to C
  2016-01-13  0:10 ` [PATCHv3] " Stefan Beller
@ 2016-01-13  7:42   ` Johannes Sixt
  2016-01-13 16:58     ` Junio C Hamano
  0 siblings, 1 reply; 4+ messages in thread
From: Johannes Sixt @ 2016-01-13  7:42 UTC (permalink / raw)
  To: Stefan Beller; +Cc: gitster, git, peff, jens.lehmann

Am 13.01.2016 um 01:10 schrieb Stefan Beller:
> Later on we want to deprecate the `git submodule init` command and make
> it implicit in other submodule commands. As these other commands are
> written in C already, we'd need the init functionality in C, too.
> The `resolve_relative_url` function is a major part of that init
> functionality, so start by porting this function to C.

Maybe tone down the word "major" to "a large and non-trivial function"?
Otherwise, the lack of proof for the claim is irritating. (As we saw,
the savings with the port to C are not breath-taking. But we have to
start somewhere.)

> 
> As I was porting the functionality I noticed some odds with the inputs.
> To fully understand the situation I added some logging to the function
> temporarily to capture all calls to the function throughout the test
> suite. Duplicates have been removed and all unique testing inputs have
> been recorded into t0060.
> 
> Signed-off-by: Stefan Beller <sbeller@google.com>
> ---
...
> +/*
> + * The `url` argument is the URL that navigates to the submodule origin
> + * repo. When relative, this URL is relative to the superproject origin
> + * URL repo. The `up_path` argument, if specified, is the relative
> + * path that navigates from the submodule working tree to the superproject
> + * working tree. Returns the origin URL of the submodule.
> + *
> + * Return either an absolute URL or filesystem path (if the superproject
> + * origin URL is an absolute URL or filesystem path, respectively) or a
> + * relative file system path (if the superproject origin URL is a relative
> + * file system path).
> + *
> + * When the output is a relative file system path, the path is either
> + * relative to the submodule working tree, if up_path is specified, or to
> + * the superproject working tree otherwise.
> + */
> +static char *relative_url(const char *remote_url,
> +				const char *url,
> +				const char *up_path)
> +{
> +	int is_relative = 0;
> +	int colonsep = 0;
> +	char *out;
> +	char *remoteurl = xstrdup(remote_url);
> +	struct strbuf sb = STRBUF_INIT;
> +	size_t len;
> +
> +	len = strlen(remoteurl);
> +	if (is_dir_sep(remoteurl[len]))
> +		remoteurl[len] = '\0';
> +
> +	if (!url_is_local_not_ssh(remoteurl) || is_absolute_path(remoteurl))
> +		is_relative = 0;
> +	else {
> +		is_relative = 1;
> +
> +		/* Prepend a './' to ensure all relative remoteurls start
> +		 * with './' or '../'. */
> +		if (!starts_with_dot_slash(remoteurl) &&
> +		    !starts_with_dot_dot_slash(remoteurl)) {
> +			strbuf_reset(&sb);
> +			strbuf_addf(&sb, "./%s", remoteurl);
> +			free(remoteurl);
> +			remoteurl = strbuf_detach(&sb, NULL);
> +		}
> +	}
> +	/* When the url starts with '../', remove that and the
> +	 * last directory in remoteurl. */
> +	while (url) {
> +		if (starts_with_dot_dot_slash(url)) {
> +			char *rfind;
> +			url += 3;
> +
> +			rfind = last_dir_separator(remoteurl);
> +			if (rfind)
> +				*rfind = '\0';
> +			else {
> +				rfind = strrchr(remoteurl, ':');
> +				if (rfind) {
> +					*rfind = '\0';
> +					colonsep = 1;
> +				} else {
> +					if (is_relative || !strcmp(".", remoteurl))
> +						die(N_("cannot strip one component off url '%s'"), remoteurl);

This must be _("..."), not N_("..."), otherwise no translation
happens at run-time.

> +					else
> +						remoteurl = ".";

This must be xstrdup(".") because remoteurl is free()d below.

> +				}
> +			}
> +		} else if (starts_with_dot_slash(url)) {
> +			url += 2;
> +		} else
> +			break;
> +	}
> +	strbuf_reset(&sb);
> +	strbuf_addf(&sb, "%s%s%s", remoteurl, colonsep ? ":" : "/", url);
> +
> +	if (starts_with_dot_slash(sb.buf))
> +		out = xstrdup(sb.buf + 2);
> +	else
> +		out = xstrdup(sb.buf);
> +	strbuf_reset(&sb);
> +
> +	free(remoteurl);
> +	if (!up_path || !is_relative)
> +		return out;

Nit: This conditional clearly feels like (and is) an early function
exit...

> +	else {
> +		strbuf_addf(&sb, "%s%s", up_path, out);
> +
> +		free(out);
> +		return strbuf_detach(&sb, NULL);
> +	}

... hence, you don't need to place the rest of the function into the
else branch.

> +}

up_path seems to be ignored when remoteurl is absolute. Is that
combination an invalid use case?

I think that you strike a good balance between a direct rewrite
of the shell function and possible optimizations. Therefore,
further improvements should go into separate patches.

> diff --git a/t/t0060-path-utils.sh b/t/t0060-path-utils.sh
> index 627ef85..2ae1bbd 100755
> --- a/t/t0060-path-utils.sh
> +++ b/t/t0060-path-utils.sh
> @@ -19,6 +19,12 @@ relative_path() {
>   	"test \"\$(test-path-utils relative_path '$1' '$2')\" = '$expected'"
>   }
>   
> +test_submodule_relative_url() {
> +	expected="$4"
> +	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" \
> +	"test \"\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3')\" = '$expected'"
> +}

relative_url() has an error exit mode that is not unlikely to trigger,
but the way this test is written such a failure would be detected
only indirectly as a mismatch between actual and expected result. The
reason is that a failed command substitution does not cause the
surrounding command to fail unless it is in the last variable
assignment. Therefore, I suggest to write the helper function like this:

test_submodule_relative_url() {
	test_expect_success "test_submodule_relative_url: $1 $2 $3 => $4" "
		actual=\$(git submodule--helper resolve-relative-url-test '$1' '$2' '$3') &&
		"test \"\$actual\" = '$4'
	"
}

> +
>   test_git_path() {
>   	test_expect_success "git-path $1 $2 => $3" "
>   		$1 git rev-parse --git-path $2 >actual &&
> @@ -286,4 +292,39 @@ test_git_path GIT_COMMON_DIR=bar config                   bar/config
>   test_git_path GIT_COMMON_DIR=bar packed-refs              bar/packed-refs
>   test_git_path GIT_COMMON_DIR=bar shallow                  bar/shallow
>   
> +test_submodule_relative_url "(null)" "../foo/bar" "../bar/a/b/c" "../foo/bar/a/b/c"
> +test_submodule_relative_url "../../../" "../foo/bar" "../bar/a/b/c" "../../../../foo/bar/a/b/c"

In these two cases, it is unclear whether the "bar" in the 4th
argument is copied from the 2nd or the 3rd argument. I suggest to
use a different token:

test_submodule_relative_url "(null)" "../foo/bar" "../sub/a/b/c" "../foo/sub/a/b/c"
test_submodule_relative_url "../../../" "../foo/bar" "../sub/a/b/c" "../../../../foo/sub/a/b/c"

> +test_submodule_relative_url "(null)" "../foo/bar" "../submodule" "../foo/submodule"
> +test_submodule_relative_url "../" "../foo/bar" "../submodule" "../../foo/submodule"
> +test_submodule_relative_url "(null)" "../foo/submodule" "../submodule" "../foo/submodule"
> +test_submodule_relative_url "../" "../foo/submodule" "../submodule" "../../foo/submodule"
> +test_submodule_relative_url "(null)" "../foo" "../submodule" "../submodule"
> +test_submodule_relative_url "../" "../foo" "../submodule" "../../submodule"
> +test_submodule_relative_url "(null)" "./foo/bar" "../submodule" "foo/submodule"
> +test_submodule_relative_url "../" "./foo/bar" "../submodule" "../foo/submodule"
> +test_submodule_relative_url "(null)" "./foo" "../submodule" "submodule"
> +test_submodule_relative_url "../" "./foo" "../submodule" "../submodule"
> +test_submodule_relative_url "(null)" "//somewhere else/repo" "../subrepo" "//somewhere else/subrepo"
> +test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/subsuper_update_r" "../subsubsuper_update_r" "/u//trash directory.t7406-submodule-update/subsubsuper_update_r"
> +test_submodule_relative_url "(null)" "/u//trash directory.t7406-submodule-update/super_update_r2" "../subsuper_update_r" "/u//trash directory.t7406-submodule-update/subsuper_update_r"
> +test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm/." "../." "/u/trash directory.t3600-rm/."
> +test_submodule_relative_url "(null)" "/u/trash directory.t3600-rm" "./." "/u/trash directory.t3600-rm/."
> +test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
> +test_submodule_relative_url "../" "/u/trash directory.t7400-submodule-basic/addtest" "../repo" "/u/trash directory.t7400-submodule-basic/repo"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7400-submodule-basic" "./å äö" "/u/trash directory.t7400-submodule-basic/å äö"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7403-submodule-sync/." "../submodule" "/u/trash directory.t7403-submodule-sync/submodule"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7407-submodule-foreach/submodule" "../submodule" "/u/trash directory.t7407-submodule-foreach/submodule"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7409-submodule-detached-worktree/home2/../remote" "../bundle1" "/u/trash directory.t7409-submodule-detached-worktree/home2/../bundle1"
> +test_submodule_relative_url "(null)" "/u/trash directory.t7613-merge-submodule/submodule_update_repo" "./." "/u/trash directory.t7613-merge-submodule/submodule_update_repo/."
> +test_submodule_relative_url "(null)" "file:///tmp/repo" "../subrepo" "file:///tmp/subrepo"
> +test_submodule_relative_url "(null)" "foo/bar" "../submodule" "foo/submodule"
> +test_submodule_relative_url "../" "foo/bar" "../submodule" "../foo/submodule"
> +test_submodule_relative_url "(null)" "foo" "../submodule" "submodule"
> +test_submodule_relative_url "../" "foo" "../submodule" "../submodule"
> +test_submodule_relative_url "(null)" "helper:://hostname/repo" "../subrepo" "helper:://hostname/subrepo"
> +test_submodule_relative_url "(null)" "ssh://hostname/repo" "../subrepo" "ssh://hostname/subrepo"
> +test_submodule_relative_url "(null)" "ssh://hostname:22/repo" "../subrepo" "ssh://hostname:22/subrepo"
> +test_submodule_relative_url "(null)" "user@host:path/to/repo" "../subrepo" "user@host:path/to/subrepo"
> +test_submodule_relative_url "(null)" "user@host:repo" "../subrepo" "user@host:subrepo"
> +
>   test_done
> 

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: [PATCHv3] submodule: Port resolve_relative_url from shell to C
  2016-01-13  7:42   ` Johannes Sixt
@ 2016-01-13 16:58     ` Junio C Hamano
  0 siblings, 0 replies; 4+ messages in thread
From: Junio C Hamano @ 2016-01-13 16:58 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Stefan Beller, git, peff, jens.lehmann

Johannes Sixt <j6t@kdbg.org> writes:

> Am 13.01.2016 um 01:10 schrieb Stefan Beller:
>> Later on we want to deprecate the `git submodule init` command and make
>> it implicit in other submodule commands. As these other commands are
>> written in C already, we'd need the init functionality in C, too.
>> The `resolve_relative_url` function is a major part of that init
>> functionality, so start by porting this function to C.
>
> Maybe tone down the word "major" to "a large and non-trivial function"?
> Otherwise, the lack of proof for the claim is irritating. (As we saw,
> the savings with the port to C are not breath-taking. But we have to
> start somewhere.)
> ...

All good suggestions.  Thanks for a low latency review.  Very much
appreciated.

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2016-01-13 16:58 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-12 23:35 [PATCHv2] submodule: Port resolve_relative_url from shell to C Stefan Beller
2016-01-13  0:10 ` [PATCHv3] " Stefan Beller
2016-01-13  7:42   ` Johannes Sixt
2016-01-13 16:58     ` Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.