All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/2] rev-parse options for absolute or relative paths
@ 2020-12-06 22:53 brian m. carlson
  2020-12-06 22:53 ` [PATCH v4 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
  2020-12-06 22:53 ` [PATCH v4 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
  0 siblings, 2 replies; 9+ messages in thread
From: brian m. carlson @ 2020-12-06 22:53 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, René Scharfe

There are a bunch of different situations in which one would like to
have an absolute and canonical or a relative path from Git.  In many of
these cases, these values are already available from git rev-parse, but
some values only come in one form or another.

Changes from v3:

* Switch to a binary option instead of a number of missing path
  components to skip.

Changes from v2:

* Incorporate multiple missing segment support into the strbuf_realpath
  code.
* Switch some invocations to use DEFAULT_UNMODIFIED, which should not
  result in a change in behavior.
* Rebase, resolving some conflicts.

Changes from v1:

* Add a function to handle missing trailing components when
  canonicalizing paths and use it.
* Improve commit messages.
* Fix broken && chain.
* Fix situation where relative paths are not relative.

brian m. carlson (2):
  abspath: add a function to resolve paths with missing components
  rev-parse: add option for absolute or relative path formatting

 Documentation/git-rev-parse.txt |  71 +++++++++++++---------
 abspath.c                       |  13 +++-
 builtin/rev-parse.c             | 104 ++++++++++++++++++++++++++++----
 cache.h                         |   2 +
 t/t1500-rev-parse.sh            |  57 ++++++++++++++++-
 5 files changed, 203 insertions(+), 44 deletions(-)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v4 1/2] abspath: add a function to resolve paths with missing components
  2020-12-06 22:53 [PATCH v4 0/2] rev-parse options for absolute or relative paths brian m. carlson
@ 2020-12-06 22:53 ` brian m. carlson
  2020-12-07  0:02   ` Eric Sunshine
  2020-12-07 17:19   ` René Scharfe
  2020-12-06 22:53 ` [PATCH v4 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
  1 sibling, 2 replies; 9+ messages in thread
From: brian m. carlson @ 2020-12-06 22:53 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, René Scharfe

We'd like to canonicalize paths such that we can preserve any number of
trailing components that may be missing.  Let's add a function to do
that, allowing either one or an unlimited number of components to
canonicalize, and make strbuf_realpath a wrapper around it that allows
just one component.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 abspath.c | 13 ++++++++++++-
 cache.h   |  2 ++
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/abspath.c b/abspath.c
index 6f15a418bb..e6630b3618 100644
--- a/abspath.c
+++ b/abspath.c
@@ -80,6 +80,17 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining)
  */
 char *strbuf_realpath(struct strbuf *resolved, const char *path,
 		      int die_on_error)
+{
+	return strbuf_realpath_missing(resolved, path, 0, die_on_error);
+}
+
+/*
+ * Just like strbuf_realpath, but allows specifying how many missing components
+ * are permitted.  If many_missing is true, an arbitrary number of path
+ * components may be missing; otherwise, only the last component may be missing.
+ */
+char *strbuf_realpath_missing(struct strbuf *resolved, const char *path,
+			      int many_missing, int die_on_error)
 {
 	struct strbuf remaining = STRBUF_INIT;
 	struct strbuf next = STRBUF_INIT;
@@ -129,7 +140,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
 
 		if (lstat(resolved->buf, &st)) {
 			/* error out unless this was the last component */
-			if (errno != ENOENT || remaining.len) {
+			if (errno != ENOENT || (!many_missing && remaining.len)) {
 				if (die_on_error)
 					die_errno("Invalid path '%s'",
 						  resolved->buf);
diff --git a/cache.h b/cache.h
index e986cf4ea9..a1386235fc 100644
--- a/cache.h
+++ b/cache.h
@@ -1320,6 +1320,8 @@ static inline int is_absolute_path(const char *path)
 int is_directory(const char *);
 char *strbuf_realpath(struct strbuf *resolved, const char *path,
 		      int die_on_error);
+char *strbuf_realpath_missing(struct strbuf *resolved, const char *path,
+			      int missing_components, int die_on_error);
 char *real_pathdup(const char *path, int die_on_error);
 const char *absolute_path(const char *path);
 char *absolute_pathdup(const char *path);

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 2/2] rev-parse: add option for absolute or relative path formatting
  2020-12-06 22:53 [PATCH v4 0/2] rev-parse options for absolute or relative paths brian m. carlson
  2020-12-06 22:53 ` [PATCH v4 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
@ 2020-12-06 22:53 ` brian m. carlson
  2020-12-07  0:30   ` Eric Sunshine
  1 sibling, 1 reply; 9+ messages in thread
From: brian m. carlson @ 2020-12-06 22:53 UTC (permalink / raw)
  To: git; +Cc: Johannes Schindelin, René Scharfe

git rev-parse has several options which print various paths.  Some of
these paths are printed relative to the current working directory, and
some are absolute.

Normally, this is not a problem, but there are times when one wants
paths entirely in one format or another.  This can be done trivially if
the paths are canonical, but canonicalizing paths is not possible on
some shell scripting environments which lack realpath(1) and also in Go,
which lacks functions that properly canonicalize paths on Windows.

To help out the scripter, let's provide an option which turns most of
the paths printed by git rev-parse to be either relative to the current
working directory or absolute and canonical.  Document which options are
affected and which are not so that users are not confused.

This approach is cleaner and tidier than providing duplicates of
existing options which are either relative or absolute.

Note that if the user needs both forms, it is possible to pass an
additional option in the middle of the command line which changes the
behavior of subsequent operations.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 Documentation/git-rev-parse.txt |  71 +++++++++++++---------
 builtin/rev-parse.c             | 104 ++++++++++++++++++++++++++++----
 t/t1500-rev-parse.sh            |  57 ++++++++++++++++-
 3 files changed, 189 insertions(+), 43 deletions(-)

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 5013daa6ef..1708af8a66 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -212,6 +212,15 @@ Options for Files
 	Only the names of the variables are listed, not their value,
 	even if they are set.
 
+--path-format=(absolute|relative)::
+	Controls the behavior of certain other options following it on the command
+	line. If specified as absolute, the paths printed by those options will be
+	absolute and canonical. If specified as relative, the paths will be relative
+	to the current working directory if that is possible.  The default is option
+	specific.
+
+The following options are modified by `--path-format`:
+
 --git-dir::
 	Show `$GIT_DIR` if defined. Otherwise show the path to
 	the .git directory. The path shown, when relative, is
@@ -221,13 +230,42 @@ If `$GIT_DIR` is not defined and the current directory
 is not detected to lie in a Git repository or work tree
 print a message to stderr and exit with nonzero status.
 
+--git-common-dir::
+	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
+
+--resolve-git-dir <path>::
+	Check if <path> is a valid repository or a gitfile that
+	points at a valid repository, and print the location of the
+	repository.  If <path> is a gitfile then the resolved path
+	to the real repository is printed.
+
+--git-path <path>::
+	Resolve "$GIT_DIR/<path>" and takes other path relocation
+	variables such as $GIT_OBJECT_DIRECTORY,
+	$GIT_INDEX_FILE... into account. For example, if
+	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
+	--git-path objects/abc" returns /foo/bar/abc.
+
+--show-toplevel::
+	Show the (by default, absolute) path of the top-level directory
+	of the working tree. If there is no working tree, report an error.
+
+--show-superproject-working-tree::
+	Show the absolute path of the root of the superproject's
+	working tree (if exists) that uses the current repository as
+	its submodule.  Outputs nothing if the current repository is
+	not used as a submodule by any project.
+
+--shared-index-path::
+	Show the path to the shared index file in split index mode, or
+	empty if not in split-index mode.
+
+The following options are unaffected by `--path-format`:
+
 --absolute-git-dir::
 	Like `--git-dir`, but its output is always the canonicalized
 	absolute path.
 
---git-common-dir::
-	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
-
 --is-inside-git-dir::
 	When the current working directory is below the repository
 	directory print "true", otherwise "false".
@@ -242,19 +280,6 @@ print a message to stderr and exit with nonzero status.
 --is-shallow-repository::
 	When the repository is shallow print "true", otherwise "false".
 
---resolve-git-dir <path>::
-	Check if <path> is a valid repository or a gitfile that
-	points at a valid repository, and print the location of the
-	repository.  If <path> is a gitfile then the resolved path
-	to the real repository is printed.
-
---git-path <path>::
-	Resolve "$GIT_DIR/<path>" and takes other path relocation
-	variables such as $GIT_OBJECT_DIRECTORY,
-	$GIT_INDEX_FILE... into account. For example, if
-	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
-	--git-path objects/abc" returns /foo/bar/abc.
-
 --show-cdup::
 	When the command is invoked from a subdirectory, show the
 	path of the top-level directory relative to the current
@@ -265,20 +290,6 @@ print a message to stderr and exit with nonzero status.
 	path of the current directory relative to the top-level
 	directory.
 
---show-toplevel::
-	Show the absolute path of the top-level directory of the working
-	tree. If there is no working tree, report an error.
-
---show-superproject-working-tree::
-	Show the absolute path of the root of the superproject's
-	working tree (if exists) that uses the current repository as
-	its submodule.  Outputs nothing if the current repository is
-	not used as a submodule by any project.
-
---shared-index-path::
-	Show the path to the shared index file in split index mode, or
-	empty if not in split-index mode.
-
 --show-object-format[=(storage|input|output)]::
 	Show the object format (hash algorithm) used for the repository
 	for storage inside the `.git` directory, input, or output. For
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 69ba7326cf..8b487cd7a9 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -583,6 +583,73 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
 	clear_ref_exclusion(&ref_excludes);
 }
 
+enum format_type {
+	/* We would like a relative path. */
+	FORMAT_RELATIVE,
+	/* We would like a canonical absolute path. */
+	FORMAT_CANONICAL,
+	/* We would like the default behavior. */
+	FORMAT_DEFAULT,
+};
+
+enum default_type {
+	/* Our default is a relative path. */
+	DEFAULT_RELATIVE,
+	/* Our default is a relative path if there's a shared root. */
+	DEFAULT_RELATIVE_IF_SHARED,
+	/* Our default is a canonical absolute path. */
+	DEFAULT_CANONICAL,
+	/* Our default is not to modify the item. */
+	DEFAULT_UNMODIFIED,
+};
+
+static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
+{
+	char *cwd = NULL;
+	/*
+	 * We don't ever produce a relative path if prefix is NULL, so set the
+	 * prefix to the current directory so that we can produce a relative
+	 * path whenever possible.  If we're using RELATIVE_IF_SHARED mode, then
+	 * we want an absolute path unless the two share a common prefix, so don't
+	 * set it in that case, since doing so causes a relative path to always
+	 * be produced if possible.
+	 */
+	if (!prefix && (format != FORMAT_DEFAULT || def != DEFAULT_RELATIVE_IF_SHARED))
+		prefix = cwd = xgetcwd();
+	if (format == FORMAT_DEFAULT && def == DEFAULT_UNMODIFIED) {
+		puts(path);
+	} else if (format == FORMAT_RELATIVE ||
+		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE)) {
+		/*
+		 * In order for relative_path to work as expected, we need to
+		 * make sure that both paths are absolute paths.  If we don't,
+		 * we can end up with an unexpected absolute path that the user
+		 * didn't want.
+		 */
+		struct strbuf buf = STRBUF_INIT, realbuf = STRBUF_INIT, prefixbuf = STRBUF_INIT;
+		if (!is_absolute_path(path)) {
+			strbuf_realpath_missing(&realbuf, path, 1, 1);
+			path = realbuf.buf;
+		}
+		if (!is_absolute_path(prefix)) {
+			strbuf_realpath_missing(&prefixbuf, prefix, 1, 1);
+			prefix = prefixbuf.buf;
+		}
+		puts(relative_path(path, prefix, &buf));
+		strbuf_release(&buf);
+	} else if (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE_IF_SHARED) {
+		struct strbuf buf = STRBUF_INIT;
+		puts(relative_path(path, prefix, &buf));
+		strbuf_release(&buf);
+	} else {
+		struct strbuf buf = STRBUF_INIT;
+		strbuf_realpath_missing(&buf, path, 1, 1);
+		puts(buf.buf);
+		strbuf_release(&buf);
+	}
+	free(cwd);
+}
+
 int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 {
 	int i, as_is = 0, verify = 0, quiet = 0, revs_count = 0, type = 0;
@@ -596,6 +663,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 	struct strbuf buf = STRBUF_INIT;
 	const int hexsz = the_hash_algo->hexsz;
 	int seen_end_of_options = 0;
+	enum format_type format = FORMAT_DEFAULT;
 
 	if (argc > 1 && !strcmp("--parseopt", argv[1]))
 		return cmd_parseopt(argc - 1, argv + 1, prefix);
@@ -668,8 +736,9 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				if (!argv[i + 1])
 					die("--git-path requires an argument");
 				strbuf_reset(&buf);
-				puts(relative_path(git_path("%s", argv[i + 1]),
-						   prefix, &buf));
+				print_path(git_path("%s", argv[i + 1]), prefix,
+						format,
+						DEFAULT_RELATIVE_IF_SHARED);
 				i++;
 				continue;
 			}
@@ -687,6 +756,16 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 					show(arg);
 				continue;
 			}
+			if (opt_with_value(arg, "--path-format", &arg)) {
+				if (!strcmp(arg, "absolute")) {
+					format = FORMAT_CANONICAL;
+				} else if (!strcmp(arg, "relative")) {
+					format = FORMAT_RELATIVE;
+				} else {
+					die("unknown argument to --path-format: %s", arg);
+				}
+				continue;
+			}
 			if (!strcmp(arg, "--default")) {
 				def = argv[++i];
 				if (!def)
@@ -807,7 +886,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!strcmp(arg, "--show-toplevel")) {
 				const char *work_tree = get_git_work_tree();
 				if (work_tree)
-					puts(work_tree);
+					print_path(work_tree, prefix, format, DEFAULT_UNMODIFIED);
 				else
 					die("this operation must be run in a work tree");
 				continue;
@@ -815,7 +894,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!strcmp(arg, "--show-superproject-working-tree")) {
 				struct strbuf superproject = STRBUF_INIT;
 				if (get_superproject_working_tree(&superproject))
-					puts(superproject.buf);
+					print_path(superproject.buf, prefix, format, DEFAULT_UNMODIFIED);
 				strbuf_release(&superproject);
 				continue;
 			}
@@ -850,16 +929,18 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
 				char *cwd;
 				int len;
+				enum format_type wanted = format;
 				if (arg[2] == 'g') {	/* --git-dir */
 					if (gitdir) {
-						puts(gitdir);
+						print_path(gitdir, prefix, format, DEFAULT_UNMODIFIED);
 						continue;
 					}
 					if (!prefix) {
-						puts(".git");
+						print_path(".git", prefix, format, DEFAULT_UNMODIFIED);
 						continue;
 					}
 				} else {		/* --absolute-git-dir */
+					wanted = FORMAT_CANONICAL;
 					if (!gitdir && !prefix)
 						gitdir = ".git";
 					if (gitdir) {
@@ -872,14 +953,14 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				}
 				cwd = xgetcwd();
 				len = strlen(cwd);
-				printf("%s%s.git\n", cwd, len && cwd[len-1] != '/' ? "/" : "");
+				strbuf_reset(&buf);
+				strbuf_addf(&buf, "%s%s.git", cwd, len && cwd[len-1] != '/' ? "/" : "");
 				free(cwd);
+				print_path(buf.buf, prefix, wanted, DEFAULT_CANONICAL);
 				continue;
 			}
 			if (!strcmp(arg, "--git-common-dir")) {
-				strbuf_reset(&buf);
-				puts(relative_path(get_git_common_dir(),
-						   prefix, &buf));
+				print_path(get_git_common_dir(), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
 				continue;
 			}
 			if (!strcmp(arg, "--is-inside-git-dir")) {
@@ -909,8 +990,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				if (the_index.split_index) {
 					const struct object_id *oid = &the_index.split_index->base_oid;
 					const char *path = git_path("sharedindex.%s", oid_to_hex(oid));
-					strbuf_reset(&buf);
-					puts(relative_path(path, prefix, &buf));
+					print_path(path, prefix, format, DEFAULT_RELATIVE);
 				}
 				continue;
 			}
diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
index 408b97d5af..51d7d40ec1 100755
--- a/t/t1500-rev-parse.sh
+++ b/t/t1500-rev-parse.sh
@@ -3,6 +3,16 @@
 test_description='test git rev-parse'
 . ./test-lib.sh
 
+test_one () {
+	dir="$1" &&
+	expect="$2" &&
+	shift &&
+	shift &&
+	echo "$expect" >expect &&
+	git -C "$dir" rev-parse "$@" >actual &&
+	test_cmp expect actual
+}
+
 # usage: [options] label is-bare is-inside-git is-inside-work prefix git-dir absolute-git-dir
 test_rev_parse () {
 	d=
@@ -60,7 +70,13 @@ ROOT=$(pwd)
 
 test_expect_success 'setup' '
 	mkdir -p sub/dir work &&
-	cp -R .git repo.git
+	cp -R .git repo.git &&
+	git checkout -B main &&
+	test_commit abc &&
+	git checkout -b side &&
+	test_commit def &&
+	git checkout main &&
+	git worktree add worktree side
 '
 
 test_rev_parse toplevel false false true '' .git "$ROOT/.git"
@@ -88,6 +104,45 @@ test_rev_parse -C work -g ../repo.git -b t 'GIT_DIR=../repo.git, core.bare = tru
 
 test_rev_parse -C work -g ../repo.git -b u 'GIT_DIR=../repo.git, core.bare undefined' false false true ''
 
+test_expect_success 'rev-parse --path-format=absolute' '
+	test_one "." "$ROOT/.git" --path-format=absolute --git-dir &&
+	test_one "." "$ROOT/.git" --path-format=absolute --git-common-dir &&
+	test_one "sub/dir" "$ROOT/.git" --path-format=absolute --git-dir &&
+	test_one "sub/dir" "$ROOT/.git" --path-format=absolute --git-common-dir &&
+	test_one "worktree" "$ROOT/.git/worktrees/worktree" --path-format=absolute --git-dir &&
+	test_one "worktree" "$ROOT/.git" --path-format=absolute --git-common-dir &&
+	test_one "." "$ROOT" --path-format=absolute --show-toplevel &&
+	test_one "." "$ROOT/.git/objects" --path-format=absolute --git-path objects &&
+	test_one "." "$ROOT/.git/objects/foo/bar/baz" --path-format=absolute --git-path objects/foo/bar/baz
+'
+
+test_expect_success 'rev-parse --path-format=relative' '
+	test_one "." ".git" --path-format=relative --git-dir &&
+	test_one "." ".git" --path-format=relative --git-common-dir &&
+	test_one "sub/dir" "../../.git" --path-format=relative --git-dir &&
+	test_one "sub/dir" "../../.git" --path-format=relative --git-common-dir &&
+	test_one "worktree" "../.git/worktrees/worktree" --path-format=relative --git-dir &&
+	test_one "worktree" "../.git" --path-format=relative --git-common-dir &&
+	test_one "." "./" --path-format=relative --show-toplevel &&
+	test_one "." ".git/objects" --path-format=relative --git-path objects &&
+	test_one "." ".git/objects/foo/bar/baz" --path-format=relative --git-path objects/foo/bar/baz
+'
+
+test_expect_success '--path-format=relative does not affect --absolute-git-dir' '
+	git rev-parse --path-format=relative --absolute-git-dir >actual &&
+	echo "$ROOT/.git" >expect &&
+	test_cmp expect actual
+'
+
+test_expect_success '--path-format can change in the middle of the command line' '
+	git rev-parse --path-format=absolute --git-dir --path-format=relative --git-path objects/foo/bar >actual &&
+	cat >expect <<-EOF &&
+	$ROOT/.git
+	.git/objects/foo/bar
+	EOF
+	test_cmp expect actual
+'
+
 test_expect_success 'git-common-dir from worktree root' '
 	echo .git >expect &&
 	git rev-parse --git-common-dir >actual &&

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 1/2] abspath: add a function to resolve paths with missing components
  2020-12-06 22:53 ` [PATCH v4 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
@ 2020-12-07  0:02   ` Eric Sunshine
  2020-12-07  2:11     ` brian m. carlson
  2020-12-07 17:19   ` René Scharfe
  1 sibling, 1 reply; 9+ messages in thread
From: Eric Sunshine @ 2020-12-07  0:02 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Git List, Johannes Schindelin, René Scharfe

On Sun, Dec 6, 2020 at 5:56 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> We'd like to canonicalize paths such that we can preserve any number of
> trailing components that may be missing.  Let's add a function to do
> that, allowing either one or an unlimited number of components to
> canonicalize, and make strbuf_realpath a wrapper around it that allows
> just one component.

This commit message is too barebones. As a justification, "We'd
like..." is insufficient; it doesn't help the reader understand why
this change is desirable.

Further, the lack of explanation about the seemingly arbitrary "one or
infinite" condition confuses the issue. The first question which
popped into this reader's head was "why those two specific choices?".
What makes one missing path component special as opposed to any number
of missing components? (These questions are mostly rhetorical; I can
figure out reasonable answers, but the commit message ought to do a
better job of explaining.)

> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
> +/*
> + * Just like strbuf_realpath, but allows specifying how many missing components
> + * are permitted.  If many_missing is true, an arbitrary number of path
> + * components may be missing; otherwise, only the last component may be missing.
> + */
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path,
> +                             int many_missing, int die_on_error)

This interface feels odd. Why would a caller ever want to call this
function with many_missing=0 when it would be easier, shorter, more
direct to simply call strbuf_realpath()? Is the plan to retire
strbuf_realpath() down the road?

A more orthogonal-feeling interface would be to make this function
_always_ allow any number of missing trailing path components (that
is, drop `many_missing` altogether). Doing so would simplify the
documentation and the signature. If the caller needs the original
behavior of only allowing the final path component to be missing, then
strbuf_realpath() can be used, as usual.

The name of the function is somewhat confusing, especially if you take
the suggestion of dropping the `many_missing` argument. Perhaps a name
such as strbuf_realpath_forgiving() would be more understandable.

Note that the above comments are only about the API. It's perfectly
fine if the two functions share an underlying private implementation
(making them each one-liners calling the actual worker function).

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/2] rev-parse: add option for absolute or relative path formatting
  2020-12-06 22:53 ` [PATCH v4 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
@ 2020-12-07  0:30   ` Eric Sunshine
  2020-12-07  2:38     ` brian m. carlson
  0 siblings, 1 reply; 9+ messages in thread
From: Eric Sunshine @ 2020-12-07  0:30 UTC (permalink / raw)
  To: brian m. carlson; +Cc: Git List, Johannes Schindelin, René Scharfe

On Sun, Dec 6, 2020 at 5:58 PM brian m. carlson
<sandals@crustytoothpaste.net> wrote:
> [...]
> To help out the scripter, let's provide an option which turns most of
> the paths printed by git rev-parse to be either relative to the current
> working directory or absolute and canonical.  Document which options are
> affected and which are not so that users are not confused.
> [...]
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
> diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
> @@ -212,6 +212,15 @@ Options for Files
> +--path-format=(absolute|relative)::
> +       Controls the behavior of certain other options following it on the command
> +       line. If specified as absolute, the paths printed by those options will be
> +       absolute and canonical. If specified as relative, the paths will be relative
> +       to the current working directory if that is possible.  The default is option
> +       specific.

I read the commit message and looked at the implementation so I know
that this option can be specified multiple times, but this
documentation only vaguely hints at it by saying "options following
it". We could do a better job of imparting that knowledge to the
reader by saying so explicitly, perhaps like this:

    Controls the behavior of some path-printing options. If
    'absolute', [...]. If 'relative', [...]. May be specified multiple
    times, each time affecting the path-printing options which
    follow it. The default path format is option-specific.

Since this option can be specified multiple times, should it also
recognize `default` to request the default behavior in addition to
`absolute` and `relative`? (Genuine question. Maybe real-world
use-cases wouldn't need it, but it would be easy to support and make
it functionally complete.)

> diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> @@ -583,6 +583,73 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
> +static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
> +{
> +               struct strbuf buf = STRBUF_INIT, realbuf = STRBUF_INIT, prefixbuf = STRBUF_INIT;
> +               if (!is_absolute_path(path)) {
> +                       strbuf_realpath_missing(&realbuf, path, 1, 1);
> +                       path = realbuf.buf;
> +               }
> +               if (!is_absolute_path(prefix)) {
> +                       strbuf_realpath_missing(&prefixbuf, prefix, 1, 1);
> +                       prefix = prefixbuf.buf;
> +               }
> +               puts(relative_path(path, prefix, &buf));
> +               strbuf_release(&buf);

Leaking `realbuf` and `prefixbuf` here.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 1/2] abspath: add a function to resolve paths with missing components
  2020-12-07  0:02   ` Eric Sunshine
@ 2020-12-07  2:11     ` brian m. carlson
  0 siblings, 0 replies; 9+ messages in thread
From: brian m. carlson @ 2020-12-07  2:11 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Johannes Schindelin, René Scharfe

[-- Attachment #1: Type: text/plain, Size: 1211 bytes --]

On 2020-12-07 at 00:02:16, Eric Sunshine wrote:
> This commit message is too barebones. As a justification, "We'd
> like..." is insufficient; it doesn't help the reader understand why
> this change is desirable.
> 
> Further, the lack of explanation about the seemingly arbitrary "one or
> infinite" condition confuses the issue. The first question which
> popped into this reader's head was "why those two specific choices?".
> What makes one missing path component special as opposed to any number
> of missing components? (These questions are mostly rhetorical; I can
> figure out reasonable answers, but the commit message ought to do a
> better job of explaining.)

Sure, I can expand the commit message to be a little more descriptive.

> The name of the function is somewhat confusing, especially if you take
> the suggestion of dropping the `many_missing` argument. Perhaps a name
> such as strbuf_realpath_forgiving() would be more understandable.

Sure, I agree that's a better name.  It shouldn't be surprising to
anyone on the list that I am absolutely terrible at naming things, so I
appreciate the suggestion.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/2] rev-parse: add option for absolute or relative path formatting
  2020-12-07  0:30   ` Eric Sunshine
@ 2020-12-07  2:38     ` brian m. carlson
  0 siblings, 0 replies; 9+ messages in thread
From: brian m. carlson @ 2020-12-07  2:38 UTC (permalink / raw)
  To: Eric Sunshine; +Cc: Git List, Johannes Schindelin, René Scharfe

[-- Attachment #1: Type: text/plain, Size: 1620 bytes --]

On 2020-12-07 at 00:30:13, Eric Sunshine wrote:
> I read the commit message and looked at the implementation so I know
> that this option can be specified multiple times, but this
> documentation only vaguely hints at it by saying "options following
> it". We could do a better job of imparting that knowledge to the
> reader by saying so explicitly, perhaps like this:
> 
>     Controls the behavior of some path-printing options. If
>     'absolute', [...]. If 'relative', [...]. May be specified multiple
>     times, each time affecting the path-printing options which
>     follow it. The default path format is option-specific.

I'll improve the documentation in v5.

> Since this option can be specified multiple times, should it also
> recognize `default` to request the default behavior in addition to
> `absolute` and `relative`? (Genuine question. Maybe real-world
> use-cases wouldn't need it, but it would be easy to support and make
> it functionally complete.)

I don't think adding the default is helpful.  I can think of situations
where people want absolute path names or relative path names, but in
what case would someone want, "meh, whatever Git decides to give me"?

The problem we're solving here is that Git isn't consistent and can't be
used to provide valuable information that the user wants.  If a user
doesn't care what format the path is in, then they can always specify
absolute and it will almost certainly be correct and work for them.

> Leaking `realbuf` and `prefixbuf` here.

Will fix.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 1/2] abspath: add a function to resolve paths with missing components
  2020-12-06 22:53 ` [PATCH v4 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
  2020-12-07  0:02   ` Eric Sunshine
@ 2020-12-07 17:19   ` René Scharfe
  2020-12-08  2:51     ` brian m. carlson
  1 sibling, 1 reply; 9+ messages in thread
From: René Scharfe @ 2020-12-07 17:19 UTC (permalink / raw)
  To: brian m. carlson, git; +Cc: Johannes Schindelin

Am 06.12.20 um 23:53 schrieb brian m. carlson:
> We'd like to canonicalize paths such that we can preserve any number of
> trailing components that may be missing.  Let's add a function to do
> that, allowing either one or an unlimited number of components to
> canonicalize, and make strbuf_realpath a wrapper around it that allows
> just one component.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  abspath.c | 13 ++++++++++++-
>  cache.h   |  2 ++
>  2 files changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/abspath.c b/abspath.c
> index 6f15a418bb..e6630b3618 100644
> --- a/abspath.c
> +++ b/abspath.c
> @@ -80,6 +80,17 @@ static void get_root_part(struct strbuf *resolved, struct strbuf *remaining)
>   */
>  char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  		      int die_on_error)
> +{
> +	return strbuf_realpath_missing(resolved, path, 0, die_on_error);
> +}
> +
> +/*
> + * Just like strbuf_realpath, but allows specifying how many missing components
> + * are permitted.  If many_missing is true, an arbitrary number of path
> + * components may be missing; otherwise, only the last component may be missing.
> + */
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path,
> +			      int many_missing, int die_on_error)

So this uses two 32-bit values to pass two bits.  Hmm.

I find the concept of a "real" path with imaginary components strangely
amusing.  But perhaps a name like strbuf_resolve_path() would fit better?

>  {
>  	struct strbuf remaining = STRBUF_INIT;
>  	struct strbuf next = STRBUF_INIT;
> @@ -129,7 +140,7 @@ char *strbuf_realpath(struct strbuf *resolved, const char *path,
>
>  		if (lstat(resolved->buf, &st)) {
>  			/* error out unless this was the last component */
> -			if (errno != ENOENT || remaining.len) {
> +			if (errno != ENOENT || (!many_missing && remaining.len)) {
>  				if (die_on_error)
>  					die_errno("Invalid path '%s'",
>  						  resolved->buf);

So the original code errors out if there is a real error
(errno != ENOENT).  It also errors out if any component except the last
one is missing (errno == ENOENT && remaining.len); that's what the
comment is about.  This patch adds the ability to ignore ENOENT for all
components.

Perhaps convert many_missing and die_on_error into a single flags
parameter and implement the flags DIE_ON_ERR and REQUIRE_BASENAME or
similar?  Callers would be easier to read because such an interface is
self-documenting -- provided we find good flag names.

> diff --git a/cache.h b/cache.h
> index e986cf4ea9..a1386235fc 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1320,6 +1320,8 @@ static inline int is_absolute_path(const char *path)
>  int is_directory(const char *);
>  char *strbuf_realpath(struct strbuf *resolved, const char *path,
>  		      int die_on_error);
> +char *strbuf_realpath_missing(struct strbuf *resolved, const char *path,
> +			      int missing_components, int die_on_error);
>  char *real_pathdup(const char *path, int die_on_error);
>  const char *absolute_path(const char *path);
>  char *absolute_pathdup(const char *path);
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 1/2] abspath: add a function to resolve paths with missing components
  2020-12-07 17:19   ` René Scharfe
@ 2020-12-08  2:51     ` brian m. carlson
  0 siblings, 0 replies; 9+ messages in thread
From: brian m. carlson @ 2020-12-08  2:51 UTC (permalink / raw)
  To: René Scharfe; +Cc: git, Johannes Schindelin

[-- Attachment #1: Type: text/plain, Size: 1152 bytes --]

On 2020-12-07 at 17:19:32, René Scharfe wrote:
> I find the concept of a "real" path with imaginary components strangely
> amusing.  But perhaps a name like strbuf_resolve_path() would fit better?

I think I'm going to take the strbuf_realpath_forgiving solution from
Eric Sunshine because I think having similar names for similar functions
helps discoverability.

> So the original code errors out if there is a real error
> (errno != ENOENT).  It also errors out if any component except the last
> one is missing (errno == ENOENT && remaining.len); that's what the
> comment is about.  This patch adds the ability to ignore ENOENT for all
> components.
> 
> Perhaps convert many_missing and die_on_error into a single flags
> parameter and implement the flags DIE_ON_ERR and REQUIRE_BASENAME or
> similar?  Callers would be easier to read because such an interface is
> self-documenting -- provided we find good flag names.

As discussed elsewhere in the thread, this will be moving to an internal
function, but I can make that function take two flag parameters.
-- 
brian m. carlson (he/him or they/them)
Houston, Texas, US

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 263 bytes --]

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-12-08  2:52 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-12-06 22:53 [PATCH v4 0/2] rev-parse options for absolute or relative paths brian m. carlson
2020-12-06 22:53 ` [PATCH v4 1/2] abspath: add a function to resolve paths with missing components brian m. carlson
2020-12-07  0:02   ` Eric Sunshine
2020-12-07  2:11     ` brian m. carlson
2020-12-07 17:19   ` René Scharfe
2020-12-08  2:51     ` brian m. carlson
2020-12-06 22:53 ` [PATCH v4 2/2] rev-parse: add option for absolute or relative path formatting brian m. carlson
2020-12-07  0:30   ` Eric Sunshine
2020-12-07  2:38     ` brian m. carlson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.