* Incorrect git-blame result if I use full path to file
@ 2007-12-03  0:52 Anatol Pomozov
  2007-12-03  2:19 ` Junio C Hamano
  2007-12-03  2:27 ` Jeff King
  0 siblings, 2 replies; 21+ messages in thread
From: Anatol Pomozov @ 2007-12-03  0:52 UTC (permalink / raw)
  To: git

Hi, all.

I just started learning git and I found a bug (but sorry if the
behaviour I am reporting as a bug is not actually a bug and was
made intentionally).

The problem is that git-blame returns an incorrect result if you use
the full path to a file.

Here is an example script that generates the repo.

#go to empty dir
git init
echo "On master" >> master.txt
git add master.txt
git commit -m "First commit"
echo "On master" >> master.txt
git commit -a -m "Second commit"
echo "On master" >> master.txt


Now let's run blame on master.txt
anatol:repo $ git blame master.txt
^69bce74 (Anatol Pomozov    2007-12-02 16:44:07 -0800 1) On master
4e2bbde4 (Anatol Pomozov    2007-12-02 16:44:15 -0800 2) On master
00000000 (Not Committed Yet 2007-12-02 16:44:27 -0800 3) On master

It is exactly what we expect. But let's try the full path for master.txt
$pwd
/personal/sources/learn/gitea/repo
$git blame /personal/sources/learn/gitea/repo/master.txt
^69bce74 (Anatol Pomozov 2007-12-02 16:44:07 -0800 1) On master
^69bce74 (Anatol Pomozov 2007-12-02 16:44:07 -0800 2) On master
^69bce74 (Anatol Pomozov 2007-12-02 16:44:07 -0800 3) On master


Now git shows that all lines in the file were changed by the first
commit, and that is not true.

-- 
anatol


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03  0:52 Incorrect git-blame result if I use full path to file Anatol Pomozov
@ 2007-12-03  2:19 ` Junio C Hamano
  2007-12-03  2:28   ` Jeff King
  2007-12-03 17:26   ` Linus Torvalds
  2007-12-03  2:27 ` Jeff King
  1 sibling, 2 replies; 21+ messages in thread
From: Junio C Hamano @ 2007-12-03  2:19 UTC (permalink / raw)
  To: Anatol Pomozov; +Cc: git

"Anatol Pomozov" <anatol.pomozov@gmail.com> writes:

> I just started learning git and I found a bug (but sorry if the
> behaviour I am reporting as a bug is not actually a bug and was
> made intentionally).

I think it is rather a sloppy error checking than a bug.  It should be
throwing a stone back at you when you feed it a full path, or converting
it back to work tree relative path before using.


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03  0:52 Incorrect git-blame result if I use full path to file Anatol Pomozov
  2007-12-03  2:19 ` Junio C Hamano
@ 2007-12-03  2:27 ` Jeff King
  2007-12-03  2:40   ` Junio C Hamano
  1 sibling, 1 reply; 21+ messages in thread
From: Jeff King @ 2007-12-03  2:27 UTC (permalink / raw)
  To: Anatol Pomozov; +Cc: git

On Sun, Dec 02, 2007 at 04:52:36PM -0800, Anatol Pomozov wrote:

> I just started learning git and I found a bug (but sorry if the
> behaviour I am reporting as a bug is not actually a bug and was
> made intentionally).

Some of both, I think. :)

> It is exactly what we expect. But let's try the full path for master.txt
> $pwd
> /personal/sources/learn/gitea/repo
> $git blame /personal/sources/learn/gitea/repo/master.txt
> ^69bce74 (Anatol Pomozov 2007-12-02 16:44:07 -0800 1) On master
> ^69bce74 (Anatol Pomozov 2007-12-02 16:44:07 -0800 2) On master
> ^69bce74 (Anatol Pomozov 2007-12-02 16:44:07 -0800 3) On master

We talk about many git commands taking "files" or "paths" but really
they are git "pathspecs", meaning a path specifier that is relative to
the repository root, and which is generally used for limiting the parts
of the history we are looking at.

So I think what is happening is that git-blame is looking for content
from /personal/sources/..., which of course as a git pathspec doesn't
match any of the files. So everything ends up being blamed on
'^69bce74' (which really means "beyond where we started looking"). But
of course it still finds the content to try blaming in the first place,
because in that instance it treats /personal/sources/... as a file to be
opened.
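
A minimal sketch of that double interpretation, for illustration only:
simple_pathspec_match() is a hypothetical, much simplified stand-in for
Git's pathspec matching (an assumption, not code from the Git sources).
Tracked paths are stored relative to the top of the work tree, so an
absolute string can be opened as a file but can never match one of them.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical, simplified matcher (not Git's real one): a pathspec
 * matches a tracked path when it is empty, equal to the path, or names
 * a leading directory of it. Tracked paths are always relative to the
 * top of the work tree.
 */
int simple_pathspec_match(const char *pathspec, const char *tracked)
{
	size_t len = strlen(pathspec);

	assert(tracked[0] != '/');	/* tracked paths are never absolute */
	if (len == 0)
		return 1;		/* empty pathspec: no limiting */
	if (strncmp(pathspec, tracked, len))
		return 0;		/* not even a string prefix */
	/* Accept an exact match or a match that ends at a directory boundary. */
	return tracked[len] == '\0' || tracked[len] == '/' ||
	       pathspec[len - 1] == '/';
}
```

With these rules, "master.txt" matches the tracked file "master.txt",
while the absolute spelling of the same file matches nothing, which
would be consistent with every line being blamed on the boundary commit.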

IOW, it's not intended for users to use absolute paths in this way.
However, the results for git-blame are obviously quite confusing. It
might be worth fixing, but I suspect there are many more such traps
waiting in other commands. I wonder if it would make sense to reject
pathspecs starting with '/' entirely, which would at least give us a
saner error message (and I can't think of a time when such a pathspec
would be useful)? Even more useful would be to convert
/path/to/repo/file to 'file' internally.

-Peff


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03  2:19 ` Junio C Hamano
@ 2007-12-03  2:28   ` Jeff King
  2007-12-03 17:26   ` Linus Torvalds
  1 sibling, 0 replies; 21+ messages in thread
From: Jeff King @ 2007-12-03  2:28 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Anatol Pomozov, git

On Sun, Dec 02, 2007 at 06:19:23PM -0800, Junio C Hamano wrote:

> I think it is rather a sloppy error checking than a bug.  It should be
> throwing a stone back at you when you feed it a full path, or converting
> it back to work tree relative path before using.

I think it's not the only place. Doing "git diff /path/to/repo/file"
silently produces an empty diff, even if there are changes in the file.

-Peff


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03  2:27 ` Jeff King
@ 2007-12-03  2:40   ` Junio C Hamano
  2007-12-03  2:49     ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: Junio C Hamano @ 2007-12-03  2:40 UTC (permalink / raw)
  To: Jeff King; +Cc: Anatol Pomozov, git

Jeff King <peff@peff.net> writes:

> IOW, it's not intended for users to use absolute paths in this way.
> However, the results for git-blame are obviously quite confusing. It
> might be worth fixing, but I suspect there are many more such traps
> waiting in other commands. I wonder if it would make sense to reject
> pathspecs starting with '/' entirely, which would at least give us a
> saner error message (and I can't think of a time when such a pathspec
> would be useful)?

All correct, except...

> Even more useful would be to convert
> /path/to/repo/file to 'file' internally.

... that might help "cut & paste from file manager" people, and I think
we had a comment session for such a patch recently on the list.

Sorry, but I lost track of the current status of that patch.  Did
it die?


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03  2:40   ` Junio C Hamano
@ 2007-12-03  2:49     ` Jeff King
  2007-12-03  6:55       ` Robin Rosenberg
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2007-12-03  2:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Anatol Pomozov, git, Robin Rosenberg

On Sun, Dec 02, 2007 at 06:40:36PM -0800, Junio C Hamano wrote:

> > Even more useful would be to convert
> > /path/to/repo/file to 'file' internally.
> 
> ... that might help "cut & paste from file manager" people, and I think
> we had a comment session for such a patch recently on the list.
> 
> Sorry, but I lost track of the current status of that patch.  Did
> it die?

I didn't pay attention to it originally, but I assume you mean the
recent patch from Robin Rosenberg (cc'd). Looking it over, I see one
obvious omission: there is no canonicalization of the paths. IOW, I
think it will break in the presence of symlinks (if I specify
/path/to/repo/file, /path/to is a symlink to /other/path, I think the
worktree will end up as /other/path/repo, and fail a string comparison
with /path/to/repo).
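
One way to canonicalize, sketched under assumptions:
relative_to_work_tree() is a hypothetical helper (not part of the patch)
that resolves both the work tree and the argument with POSIX realpath()
before comparing them, so /path/to/repo and /other/path/repo compare
equal when /path/to is a symlink to /other/path. Note that realpath()
requires every path component to exist, which a real implementation may
not be able to assume.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#ifndef PATH_MAX
#define PATH_MAX 4096
#endif

/*
 * Hypothetical helper: return the part of `path` below `work_tree`
 * after resolving symlinks in both, or NULL if `path` is not inside
 * the work tree (or cannot be resolved). The result points into `out`,
 * which must hold at least PATH_MAX bytes.
 */
const char *relative_to_work_tree(const char *work_tree, const char *path,
				  char *out)
{
	char root[PATH_MAX];
	size_t n;

	assert(work_tree && path && out);
	if (!realpath(work_tree, root) || !realpath(path, out))
		return NULL;		/* some component does not exist */
	n = strlen(root);
	if (n > 0 && root[n - 1] == '/')
		n--;			/* only "/" ends in a slash */
	if (strncmp(out, root, n))
		return NULL;		/* different leading directories */
	if (out[n] == '\0')
		return out + n;		/* the work tree itself: "" */
	if (out[n] != '/')
		return NULL;		/* a neighbour, e.g. /a/bc for /a/b */
	return out + n + 1;
}
```

The cost is a handful of extra system calls per absolute argument, which
is the trade-off to weigh against the simpler string comparison.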

-Peff


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03  2:49     ` Jeff King
@ 2007-12-03  6:55       ` Robin Rosenberg
  2007-12-03 20:53         ` [PATCH] Make Git accept absolute path names for files within the work tree Robin Rosenberg
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Rosenberg @ 2007-12-03  6:55 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, Anatol Pomozov, git

On Monday, 3 December 2007, Jeff King wrote:
> On Sun, Dec 02, 2007 at 06:40:36PM -0800, Junio C Hamano wrote:
> 
> > > Even more useful would be to convert
> > > /path/to/repo/file to 'file' internally.
> > 
> > ... that might help "cut & paste from file manager" people, and I think
> > we had a comment session for such a patch recently on the list.
> > 
> > Sorry, but I lost track of the current status of that patch.  Did
> > it die?
> 
> I didn't pay attention to it originally, but I assume you mean the
> recent patch from Robin Rosenberg (cc'd). Looking it over, I see one
> obvious omission: there is no canonicalization of the paths. IOW, I
> think it will break in the presence of symlinks (if I specify
> /path/to/repo/file, /path/to is a symlink to /other/path, I think the
> worktree will end up as /other/path/repo, and fail a string comparison
> with /path/to/repo).

No, it didn't die; it's just not worked on too often. I noted, among other
things, that its test cases were not correct, besides needing more tests.

Symlinks were not covered.

-- robin


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03  2:19 ` Junio C Hamano
  2007-12-03  2:28   ` Jeff King
@ 2007-12-03 17:26   ` Linus Torvalds
  2007-12-03 18:09     ` Johannes Schindelin
  1 sibling, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2007-12-03 17:26 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Anatol Pomozov, git



On Sun, 2 Dec 2007, Junio C Hamano wrote:
> "Anatol Pomozov" <anatol.pomozov@gmail.com> writes:
> >
> > I just started learning git and I found a bug (but sorry if the
> > behaviour I am reporting as a bug is not actually a bug and was
> > made intentionally).
> 
> I think it is rather a sloppy error checking than a bug.  It should be
> throwing a stone back at you when you feed it a full path, or converting
> it back to work tree relative path before using.

How about this patch?

It makes "get_pathspec()" make all the paths it returns relative, if it 
can. HOWEVER! I think it should actually die() if it sees an absolute path 
that it cannot convert (because it really cannot do anything sane about 
it), but I commented that out for now because that requires some test case 
change: right now we actually have a few test cases for insane filename 
arguments, and they expect the old behaviour.

Comments? This changes behaviour subtly (and if we enable the "die(..)" 
logic, not-so-subtly), but I think that in any case where it changes 
behaviour, the new behaviour would be an improvement, and the old one 
would be nonsensical (ie you get *some* results with an absolute pathname, 
just not the ones you'd expect!)

Note the die() comment in the bad case in "make_relative()".

		Linus
---
 setup.c |   34 +++++++++++++++++++++++++++++++++-
 1 files changed, 33 insertions(+), 1 deletions(-)

diff --git a/setup.c b/setup.c
index 2c7b5cb..fadf4ee 100644
--- a/setup.c
+++ b/setup.c
@@ -111,11 +111,26 @@ void verify_non_filename(const char *prefix, const char *arg)
 		die("'%s': %s", arg, strerror(errno));
 }
 
+static const char *make_relative(const char *file, const char *pwd, int pwdlen)
+{
+	if (strncmp(file, pwd, pwdlen))
+		goto bad;
+	if (file[pwdlen] != '/')
+		goto bad;
+	return file + pwdlen + 1;
+
+bad:
+	/* Should we die() here or just do a "return file"? */
+	/* die("pathname '%s' is not in the repository", file); */
+	return file;
+}
+
 const char **get_pathspec(const char *prefix, const char **pathspec)
 {
+	const char *pwd;
 	const char *entry = *pathspec;
 	const char **p;
-	int prefixlen;
+	int prefixlen, pwdlen;
 
 	if (!prefix && !entry)
 		return NULL;
@@ -127,9 +142,26 @@ const char **get_pathspec(const char *prefix, const char **pathspec)
 		return spec;
 	}
 
+	pwd = NULL;
+	pwdlen = 0;
+	p = pathspec;
+	do {
+		if (*entry == '/') {
+			if (!pwd) {
+				char buffer[PATH_MAX + 1];
+				if (!getcwd(buffer, sizeof(buffer)))
+					break;
+				pwd = buffer;
+				pwdlen = strlen(buffer);
+			}
+			*p = make_relative(entry, pwd, pwdlen);
+		}
+	} while ((entry = *++p) != NULL);
+
 	/* Otherwise we have to re-write the entries.. */
 	p = pathspec;
 	prefixlen = prefix ? strlen(prefix) : 0;
+	entry = *p;
 	do {
 		*p = prefix_path(prefix, prefixlen, entry);
 	} while ((entry = *++p) != NULL);


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03 17:26   ` Linus Torvalds
@ 2007-12-03 18:09     ` Johannes Schindelin
  2007-12-03 18:13       ` Linus Torvalds
  0 siblings, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2007-12-03 18:09 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, Anatol Pomozov, git

Hi,

On Mon, 3 Dec 2007, Linus Torvalds wrote:

> It [the patch] makes "get_pathspec()" make all the paths it returns 
> relative, if it can. HOWEVER! I think it should actually die() if it 
> sees an absolute path that it cannot convert (because it really cannot 
> do anything sane about it), but I commented that out for now because 
> that requires some test case change: right now we actually have a few 
> test cases for insane filename arguments, and they expect the old 
> behaviour.

I have the slight suspicion that this could break diff --no-index.  And it 
does not contain any symlink resolution, right?

Ciao,
Dscho


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03 18:09     ` Johannes Schindelin
@ 2007-12-03 18:13       ` Linus Torvalds
  2007-12-03 18:19         ` Linus Torvalds
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2007-12-03 18:13 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Anatol Pomozov, git



On Mon, 3 Dec 2007, Johannes Schindelin wrote:
> 
> I have the slight suspicion that this could break diff --no-index.

Quite possible.

> And it does not contain any symlink resolution, right?

That's correct, and by design. If you give a path where the absolute part 
of the path contains some symlink that eventually gets you to the right 
point, you get screwed. That's part of why I'd _prefer_ to do the "die()" 
part, so that you get screwed with a nice error message, rather than being 
screwed by getting unexpected results!
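
The stricter behaviour could look roughly like this.
make_relative_strict() is a hypothetical variant of make_relative() from
the patch: it returns NULL where the patch returns the path unchanged,
so that the caller can die() with a message. Like the patch, it only
compares strings, so symlinks are not resolved.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Sketch of the stricter variant: return the part of `file` below
 * `pwd`, or NULL so that the caller can report an error instead of
 * quietly using a path that cannot match anything.
 */
const char *make_relative_strict(const char *file, const char *pwd)
{
	size_t pwdlen = strlen(pwd);

	assert(file[0] == '/' && pwd[0] == '/');	/* both absolute */
	if (strncmp(file, pwd, pwdlen))
		return NULL;		/* not below pwd at all */
	if (file[pwdlen] != '/')
		return NULL;		/* pwd itself, or a neighbour */
	return file + pwdlen + 1;
}
```

A caller would then do something like
"if (!rel) die("pathname '%s' is not in the repository", file);",
reusing the message that is commented out in the patch.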

			Linus


* Re: Incorrect git-blame result if I use full path to file
  2007-12-03 18:13       ` Linus Torvalds
@ 2007-12-03 18:19         ` Linus Torvalds
  0 siblings, 0 replies; 21+ messages in thread
From: Linus Torvalds @ 2007-12-03 18:19 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Junio C Hamano, Anatol Pomozov, git



On Mon, 3 Dec 2007, Linus Torvalds wrote:
> 
> On Mon, 3 Dec 2007, Johannes Schindelin wrote:
> > 
> > I have the slight suspicion that this could break diff --no-index.
> 
> Quite possible.

Side note: another issue (for the particular case that Anatol hit) is that 
this patch obviously only helps for commands that actually use 
"get_pathspec()" (usually through doing all the common argument setup 
stuff). So "git log" and friends work fine.

HOWEVER. "git blame" has its own argument parsing that doesn't use any of 
the common routines, and thus the behaviour that Anatol complained about 
isn't fixed at all by the patch.

I think that should be fixed by just making git blame use the standard 
arguments (which in turn may involve having to teach the *other* commands 
about the "-S <revs-file>" and "-L n,m" forms! I think those are why it 
does its own specialized parsing), but obviously git-blame could also be 
taught to just do "get_pathspec()" too.

		Linus


* [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-03  6:55       ` Robin Rosenberg
@ 2007-12-03 20:53         ` Robin Rosenberg
  2007-12-03 23:03           ` Junio C Hamano
  2007-12-04  1:43           ` Jeff King
  0 siblings, 2 replies; 21+ messages in thread
From: Robin Rosenberg @ 2007-12-03 20:53 UTC (permalink / raw)
  To: Jeff King
  Cc: Junio C Hamano, Anatol Pomozov, git, Linus Torvalds, Anatol Pomozov

This patch makes it possible to drag files and directories from
a graphical browser and drop them onto a shell and feed them
to common git operations without editing away the path to the
root of the work tree.

Signed-off-by: Robin Rosenberg <robin.rosenberg@dewire.com>
---

I will not surrender to the fierce competition on this subject. Here is an update
with hopefully correct test cases this time. (Linus, your code did not pass.) Like Linus's
patch, this code does not resolve symlinks, but I forgot to state that it is by design. It
solves my problem and happens to solve Anatol's problem (actually the same one, since
passing absolute file names to blame is my most important use case).

-- robin

 builtin-blame.c       |    4 +-
 setup.c               |   53 +++++++++++++++++++++++
 t/t3904-abspatharg.sh |  112 +++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 166 insertions(+), 3 deletions(-)
 create mode 100755 t/t3904-abspatharg.sh

diff --git a/builtin-blame.c b/builtin-blame.c
index c158d31..b905dcf 100644
--- a/builtin-blame.c
+++ b/builtin-blame.c
@@ -1880,9 +1880,7 @@ static unsigned parse_score(const char *arg)
 
 static const char *add_prefix(const char *prefix, const char *path)
 {
-	if (!prefix || !prefix[0])
-		return path;
-	return prefix_path(prefix, strlen(prefix), path);
+	return prefix_path(prefix, prefix ? strlen(prefix) : 0, path);
 }
 
 /*
diff --git a/setup.c b/setup.c
index 2c7b5cb..1f0ec79 100644
--- a/setup.c
+++ b/setup.c
@@ -4,9 +4,62 @@
 static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 
+static
+const char *strip_work_tree_path(const char *prefix, int len, const char *path)
+{
+	const char *work_tree = get_git_work_tree();
+	int n = strlen(work_tree);
+
+	if (strncmp(path, work_tree, n))
+		return path;
+
+	if (!prefix && !path[n])
+		return path + n;
+
+	if (!prefix) {
+		if (path[n] == '/')
+			return path + n + 1;
+		else
+			if (path[n])
+				return path;
+			else
+				return path + n;
+	}
+
+	if (prefix && !path[n])
+		return path;
+
+	if (strncmp(path + n + 1, prefix, len - 1)) {
+		fprintf(stderr,"prefix mismatch\n");
+		char *np;
+		int i;
+		int d=0;
+		for (i = 0; i < len; ++i)
+			if (prefix[i] == '/')
+				d++;
+		np = xmalloc(strlen(path + n) + d * 3 + 1);
+		for (i=0; i < d * 3; i += 3)
+			strcpy(np + i, "../");
+		strcpy(np + i, path + n + 1);
+		path = np;
+		return np;
+	}
+
+	if (path[len + n] == '/')
+		return path + len + n + 1;
+	else
+		if (path[len + n])
+			return path;
+		else
+			return path + len + n;
+}
+
 const char *prefix_path(const char *prefix, int len, const char *path)
 {
 	const char *orig = path;
+	if (is_absolute_path(path))
+		path = strip_work_tree_path(prefix, len, path);
+
 	for (;;) {
 		char c;
 		if (*path != '.')
diff --git a/t/t3904-abspatharg.sh b/t/t3904-abspatharg.sh
new file mode 100755
index 0000000..47f1222
--- /dev/null
+++ b/t/t3904-abspatharg.sh
@@ -0,0 +1,112 @@
+#!/bin/sh
+#
+# Copyright (C) 2007 Robin Rosenberg
+#
+
+test_description='Test absolute filename arguments to various git
+commands.  Absolute arguments pointing to a location within the git
+work tree should behave the same as relative arguments.  '
+
+. ./test-lib.sh
+
+test_expect_success 'add files using absolute path names' '
+	echo a >afile &&
+	echo b >bfile &&
+	git-add afile &&
+	git-add "$(pwd)/bfile" &&
+	test "afile bfile" = "$(echo $(git ls-files))" &&
+	mkdir x &&
+	(
+		cd x &&
+		echo c >cfile &&
+		echo d >dfile &&
+		git-add cfile &&
+		git-add "$(pwd)"
+	) &&
+	test "afile bfile x/cfile x/dfile" = "$(echo $(git ls-files))" &&
+	git ls-files x >f1 &&
+	git ls-files "$(pwd)/x" >f2 &&
+	diff -u f1 f2
+'
+
+test_expect_success 'commit using absolute path names' '
+	git commit -m "foo" &&
+	echo aa >>bfile &&
+	git commit -m "aa" "$(pwd)/bfile"
+'
+
+test_expect_success 'log using absolute path names' '
+	echo bb >>bfile &&
+	git commit -m "bb" $(pwd)/bfile &&
+
+	git log bfile >f1.txt &&
+	git log "$(pwd)/bfile" >f2.txt &&
+	diff -u f1.txt f2.txt
+'
+
+test_expect_success 'blame using absolute path names' '
+	git blame bfile >f1.txt &&
+	git blame "$(pwd)/bfile" >f2.txt &&
+	diff -u f1.txt f2.txt
+'
+
+test_expect_success 'diff using absolute path names' '
+	git diff HEAD HEAD^ -- "$(pwd)/bfile" >f1.txt &&
+	git diff HEAD HEAD^ -- bfile >f2.txt &&
+	diff -u f1.txt f2.txt
+'
+
+test_expect_success 'rm using absolute path names' '
+	git rm "$(pwd)/afile" "$(pwd)/x/cfile" &&
+	test "bfile x/dfile" = "$(echo $(git ls-files))"
+'
+
+test_expect_success 'mv using absolute path names' '
+	git reset --hard &&
+	git mv "$(pwd)/afile" "$(pwd)/dfile" &&
+	test "bfile dfile x/cfile x/dfile" = "$(echo $(git ls-files))" &&
+	git mv "$(pwd)/dfile" afile &&
+	test "afile bfile x/cfile x/dfile" = "$(echo $(git ls-files))"
+'
+
+test_expect_success 'show using absolute path names' '
+	git reset --hard &&
+	git show "$(pwd)/bfile" >f1.txt &&
+	git show bfile >f2.txt &&
+	diff -u f1.txt f2.txt
+'
+
+test_expect_success 'add path in parent directory' '
+	(
+		d1="$(pwd)/x"
+		d2="$(pwd)/x/y"
+		mkdir -p x/y &&
+		echo hello1 >x/fa &&
+		echo hello2 >x/y/fb &&
+		cd x/y &&
+		git add "$d1/fa" "$d2/fb"
+	) &&
+	test "afile bfile x/cfile x/dfile x/fa x/y/fb" = "$(echo $(git ls-files))"
+'
+
+test_expect_success 'add a parent directory' '
+	(
+		d1="$(pwd)/a"
+		d2="$(pwd)/a/b"
+		d3="$(pwd)/a/b/c"
+		mkdir -p a/b/c &&
+		echo helloa >a/a1 &&
+		echo hellob >a/b/b1 &&
+		echo helloc >a/b/c/c1 &&
+		cd a/b/c &&
+		git add "$d2"
+	) &&
+	test "a/b/b1 a/b/c/c1 afile bfile x/cfile x/dfile x/fa x/y/fb" = "$(echo $(git ls-files))"
+'
+
+test_expect_failure 'add a directory outside the work tree' '
+	d1="$(cd .. ; pwd)" &&
+	git add "$d1"
+'
+
+test_done
-- 
1.5.3.5.1.gb2df9


* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-03 20:53         ` [PATCH] Make Git accept absolute path names for files within the work tree Robin Rosenberg
@ 2007-12-03 23:03           ` Junio C Hamano
  2007-12-04  1:43           ` Jeff King
  1 sibling, 0 replies; 21+ messages in thread
From: Junio C Hamano @ 2007-12-03 23:03 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: Jeff King, Anatol Pomozov, git, Linus Torvalds

Robin Rosenberg <robin.rosenberg.lists@dewire.com> writes:

> I will not surrender to the fierce competition on this subject. Here is
> an update with hopefully correct test cases this time.

Yay, that's the spirit!

> ... Like Linus, this code does not resolve symlinks,
> but I forgot to state that it is by design.

Perhaps state it in the commit log message?

>  static const char *add_prefix(const char *prefix, const char *path)
>  {
> +	return prefix_path(prefix, prefix ? strlen(prefix) : 0, path);
>  }

Ok; prefix_path can get NULL prefix (not complaining; just a reminder in
the following discussion).

> diff --git a/setup.c b/setup.c
> index 2c7b5cb..1f0ec79 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -4,9 +4,62 @@
>  static int inside_git_dir = -1;
>  static int inside_work_tree = -1;
>  
> +static
> +const char *strip_work_tree_path(const char *prefix, int len, const char *path)

Style.  "static" not on its own line.

> +{
> +	const char *work_tree = get_git_work_tree();
> +	int n = strlen(work_tree);

Preconditions.

 * prefix could be NULL or path to the subdirectory the user's
   non-absolute path should be relative to, expressed as a relative path
   to the top of the work tree, including a trailing slash.  len is the
   length of the prefix string.

 * path was determined by the caller to be absolute.

 * It is assumed that get_git_work_tree() always gives absolute path,
   and without trailing slash.

 * Is prefix always NULL if we are at the top, and never ""?  I wonder.
   But let's assume that, too.

> +	if (strncmp(path, work_tree, n))
> +		return path;

If the given path is outside the work tree, return absolute as-is.
After this point we know path begins with the work tree path.

> +	if (!prefix && !path[n])
> +		return path + n;

If we are at the top of the work tree and path names the top of the work
tree, then we return "".

> +	if (!prefix) {
> +		if (path[n] == '/')
> +			return path + n + 1;

If we are at the top of the work tree and the path names the top of the
work tree followed by a slash and then something, that is a path inside
the work tree.  Return relative to the top of the work tree.

> +		else
> +			if (path[n])
> +				return path;
> +			else
> +				return path + n;
> +	}

Style.  "else if" would give you shallower indentation.  We know path[n]
was not slash, and if it is not NUL then path is not inside the work
tree but is a neighbour (e.g. worktree is /a/b and path is /a/bc).
Return absolute.  Otherwise the path names the top of the work tree
itself so we return "".

Now at this point, we know we are in a subdirectory, because the above
if (!prefix) part always returns.  So the test for prefix here is
unnecessary.

> +	if (prefix && !path[n])
> +		return path;

If we are in a subdirectory, and path names the top of the work tree, we
return it as-is (i.e. absolute).  This feels a bit inconsistent with the
part that follows, which tries to make things relative by using "../",
doesn't it?

> +	if (strncmp(path + n + 1, prefix, len - 1)) {

For !prefix case we have determined path is not merely a neighbour, but
we haven't checked that in this codepath.  If the parameters were like
this:

	path      = /axbc/e
	work_tree = /a
	n         = 2
	prefix    =    bc/
	len       = 3

this check says "fine, path is under prefix and we won't add ../
uplevels".  You need to have

	if (path[n] != '/')
		return path;

before this strncmp() for it to work, don't you?

In addition, by comparing (len - 1) excluding the trailing slash of
prefix, I think you would let

	path      = /a/bcye

slip through as well.  That is inside the work_tree but outside of your
prefix.
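
Both slips can be reproduced in isolation. In the sketch below,
patch_thinks_under_prefix() copies only the strncmp() test from the
patch, and really_under_prefix() is a hypothetical stricter check that
adds the separator test suggested above and also compares the trailing
slash of prefix; neither function exists in Git.

```c
#include <assert.h>
#include <string.h>

/*
 * The comparison from the patch, in isolation: returns 1 when the
 * patch would treat `path` as lying under work_tree/prefix.
 */
int patch_thinks_under_prefix(const char *path, const char *work_tree,
			      const char *prefix)
{
	size_t n = strlen(work_tree);
	size_t len = strlen(prefix);

	assert(len > 0 && prefix[len - 1] == '/');	/* prefix ends in '/' */
	assert(strlen(path) > n);			/* path + n + 1 is valid */
	return strncmp(path + n + 1, prefix, len - 1) == 0;
}

/*
 * Hypothetical stricter check: the work tree must be followed by a
 * slash, and the whole prefix (including its trailing slash) must
 * match. A path naming the prefix directory itself needs separate
 * handling and is reported as "not under" here.
 */
int really_under_prefix(const char *path, const char *work_tree,
			const char *prefix)
{
	size_t n = strlen(work_tree);
	size_t len = strlen(prefix);

	assert(len > 0 && prefix[len - 1] == '/');
	if (strncmp(path, work_tree, n) || path[n] != '/')
		return 0;
	return strncmp(path + n + 1, prefix, len) == 0;
}
```

With work_tree "/a" and prefix "bc/", the first function accepts both
"/axbc/e" and "/a/bcye", while the second rejects them and still
accepts "/a/bc/e".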

> +		fprintf(stderr,"prefix mismatch\n");

Stray debugging fprintf.

> +		char *np;
> +		int i;
> +		int d=0;

Style "d = 0" (and "decl after statement").

> +		for (i = 0; i < len; ++i)

Style.  Distracts the reader by forcing him to wonder needlessly if
there is a particular reason for pre-increment of i instead of the usual
post-increment.

> +			if (prefix[i] == '/')
> +				d++;
> +		np = xmalloc(strlen(path + n) + d * 3 + 1);

At this point (assuming that the above if (strncmp()) rejected the path
outside the prefix correctly), we know that we would need to go d levels
up to reach the top of the work tree.

> +		for (i=0; i < d * 3; i += 3)

Style. "i = 0".

> +			strcpy(np + i, "../");
> +		strcpy(np + i, path + n + 1);

As path+n+1 is relative to the work tree, this will make it relative,
which is good.

> +		path = np;
> +		return np;
> +	}

Assuming the if (strncmp()) above correctly handled the path outside
prefix, we are dealing with the path that is inside prefix at this
point.  (len+n) is the length of the prefix directory expressed as an
absolute path.

> +	if (path[len + n] == '/')
> +		return path + len + n + 1;

So stripping the absolute prefix makes the result relative to the
prefix directory.  Nice.

> +	else
> +		if (path[len + n])
> +			return path;

The same comment on "else if" applies.  path[len+n] was not slash so
path was not inside the prefix after all.  Oops?  The "if outside
prefix we uplevel with ../" logic above should have handled this case
and we should not be here.

> +		else
> +			return path + len + n;
> +}

path[len+n] was NUL, which means that the user named the prefix
directory, and we return "".

Isn't this _overly_ complicated?  I think what this function wants to do
is:

 * See if path is outside the work tree, and return absolute if so.

 * Come up with the absolute path for the prefix (if NULL then that is
   the same as work tree) directory, without a trailing slash, and call
   it X.

 * Is path the same as the X?  If so, "" is what you want.

 * Is "X/" a prefix of path?  If so strip "X/" and return.

 * Find the longest common leading directory of path and "X/" and call
   it "C/".  Note that this is guaranteed to be inside work tree because
   we rejected paths outside work tree upfront.

 * Count slashes between "C/" and "X/" and come up with necessary
   uplevel "../".  Strip "C/" from path and prepend the uplevel.
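
The steps above can be sketched as follows. This is an illustration
under the preconditions listed earlier (work_tree absolute and without a
trailing slash, prefix NULL or ending in a slash), not the code that was
eventually merged; relative_to_prefix() and dup_or_die() are
hypothetical names, and the result is always freshly allocated so that
the caller can free() it. One detail is an assumption of the sketch:
when path names a parent directory of X, that directory itself is
treated as "C/".

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Duplicate a string, aborting on allocation failure. */
static char *dup_or_die(const char *s)
{
	char *copy = malloc(strlen(s) + 1);

	if (!copy)
		abort();
	return strcpy(copy, s);
}

char *relative_to_prefix(const char *work_tree, const char *prefix,
			 const char *path)
{
	size_t n = strlen(work_tree);
	size_t plen = prefix ? strlen(prefix) : 0;
	size_t pathlen = strlen(path);
	size_t xlen = n + 1 + plen;	/* length of "X/" */
	size_t i, common = 0, ups = 0;
	const char *rest;
	char *x, *result;

	assert(work_tree[0] == '/' && path[0] == '/');

	/* Outside the work tree, or a neighbour such as /a/bc for /a/b. */
	if (strncmp(path, work_tree, n) || (path[n] != '/' && path[n] != '\0'))
		return dup_or_die(path);

	/* Build "X/": the work tree, a slash, then the prefix. */
	x = malloc(xlen + 1);
	if (!x)
		abort();
	memcpy(x, work_tree, n);
	x[n] = '/';
	if (plen)
		memcpy(x + n + 1, prefix, plen);
	x[xlen] = '\0';

	if (pathlen == xlen - 1 && !strncmp(path, x, pathlen)) {
		result = dup_or_die("");		/* path is X itself */
	} else if (!strncmp(path, x, xlen)) {
		result = dup_or_die(path + xlen);	/* path is below "X/" */
	} else {
		/* Longest common leading directory "C/". */
		for (i = 0; path[i] && path[i] == x[i]; i++)
			if (path[i] == '/')
				common = i + 1;
		if (!path[i] && x[i] == '/')
			common = i + 1;		/* path is a parent of X */
		/* One "../" per directory level between "C/" and "X/". */
		for (i = common; i < xlen; i++)
			if (x[i] == '/')
				ups++;
		rest = common > pathlen ? "" : path + common;
		result = malloc(ups * 3 + strlen(rest) + 1);
		if (!result)
			abort();
		for (i = 0; i < ups; i++)
			memcpy(result + i * 3, "../", 3);
		strcpy(result + ups * 3, rest);
	}
	free(x);
	return result;
}
```

For example, with work_tree "/a" and prefix "bc/", "/a/bc/e" becomes
"e", "/a/bcye" becomes "../bcye", and "/axbc/e" is returned unchanged.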


* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-03 20:53         ` [PATCH] Make Git accept absolute path names for files within the work tree Robin Rosenberg
  2007-12-03 23:03           ` Junio C Hamano
@ 2007-12-04  1:43           ` Jeff King
  2007-12-04  2:17             ` Johannes Schindelin
  1 sibling, 1 reply; 21+ messages in thread
From: Jeff King @ 2007-12-04  1:43 UTC (permalink / raw)
  To: Robin Rosenberg; +Cc: Junio C Hamano, Anatol Pomozov, git, Linus Torvalds

On Mon, Dec 03, 2007 at 09:53:30PM +0100, Robin Rosenberg wrote:

> code did not pass). Like Linus, this code does not resolve symlinks,
> but I forgot to state that it is by design. It solves my problem and

By design meaning "I didn't feel like implementing it because I do not
personally care" or "I have some reason not to resolve symlinks"?

-Peff


* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-04  1:43           ` Jeff King
@ 2007-12-04  2:17             ` Johannes Schindelin
  2007-12-04  6:42               ` Robin Rosenberg
  0 siblings, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2007-12-04  2:17 UTC (permalink / raw)
  To: Jeff King
  Cc: Robin Rosenberg, Junio C Hamano, Anatol Pomozov, git, Linus Torvalds

Hi,

On Mon, 3 Dec 2007, Jeff King wrote:

> On Mon, Dec 03, 2007 at 09:53:30PM +0100, Robin Rosenberg wrote:
> 
> > code did not pass). Like Linus, this code does not resolve symlinks,
> > but I forgot to state that it is by design. It solves my problem and
> 
> By design meaning "I didn't feel like implementing it because I do not
> personally care" or "I have some reason not to resolve symlinks"?

IMHO those symlinks would be a nice thing in some corner cases, but 
penalise the common case.  So I tend to believe the latter.  (See also 
Linus' message why he talks about his preference for the die() code path.)

Ciao,
Dscho


* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-04  2:17             ` Johannes Schindelin
@ 2007-12-04  6:42               ` Robin Rosenberg
  2007-12-04 11:50                 ` Johannes Schindelin
  0 siblings, 1 reply; 21+ messages in thread
From: Robin Rosenberg @ 2007-12-04  6:42 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Jeff King, Junio C Hamano, Anatol Pomozov, git, Linus Torvalds

On Tuesday, 4 December 2007, Johannes Schindelin wrote:
> Hi,
> 
> On Mon, 3 Dec 2007, Jeff King wrote:
> 
> > On Mon, Dec 03, 2007 at 09:53:30PM +0100, Robin Rosenberg wrote:
> > 
> > > code did not pass). Like Linus, this code does not resolve symlinks,
> > > but I forgot to state that it is by design. It solves my problem and
> > 
> > > By design meaning "I didn't feel like implementing it because I do not
> > personally care" or "I have some reason not to resolve symlinks"?
> 
> IMHO those symlinks would be a nice thing in some corner cases, but 
> penalise the common case.  So I tend to believe the latter.  (See also 
> Linus' message why he talks about his preference for the die() code path.)

Actually, the former... I don't mind it being fixed if it doesn't cost too much.

-- robin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-04  6:42               ` Robin Rosenberg
@ 2007-12-04 11:50                 ` Johannes Schindelin
  2007-12-04 15:59                   ` Linus Torvalds
  0 siblings, 1 reply; 21+ messages in thread
From: Johannes Schindelin @ 2007-12-04 11:50 UTC (permalink / raw)
  To: Robin Rosenberg
  Cc: Jeff King, Junio C Hamano, Anatol Pomozov, git, Linus Torvalds

Hi,

On Tue, 4 Dec 2007, Robin Rosenberg wrote:

> On Tuesday 04 December 2007, Johannes Schindelin wrote:
> 
> > On Mon, 3 Dec 2007, Jeff King wrote:
> > 
> > > On Mon, Dec 03, 2007 at 09:53:30PM +0100, Robin Rosenberg wrote:
> > > 
> > > > code did not pass). Like Linus, this code does not resolve 
> > > > symlinks, but I forgot to state that it is by design. It solves my 
> > > > problem and
> > > 
> > > By design meaning "I didn't feel like implementing it because I do 
> > > not personally care" or "I have some reason not to resolve 
> > > symlinks"?
> > 
> > IMHO resolving those symlinks would be a nice thing in some corner 
> > cases, but would penalise the common case.  So I tend to believe the 
> > latter.  (See also Linus' message, where he explains his preference 
> > for the die() code path.)
> 
> Actually, the former... I don't mind it being fixed if it doesn't cost 
> too much.

I do remember the hassles I went through with get_relative_cwd() until I 
broke down and used chdir() two times (ugly).  So the latter reason is 
good enough that you do not even have to admit to the former reason ;-)
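
For readers who have not seen the "chdir() two times" trick, here is a
minimal sketch of the general idea. It is not the actual
get_relative_cwd() code; the name physical_dir() and its return
convention are made up for illustration. It enters the directory, asks
the system where it ended up, and goes back:

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <unistd.h>

/*
 * Hypothetical helper: store the physical (symlink-free) path of `dir`
 * in `out`.  Returns 0 on success, -1 if any step fails.
 */
static int physical_dir(const char *dir, char *out, size_t outlen)
{
	char saved[PATH_MAX];
	int ret = -1;

	assert(dir && out && outlen > 0);

	/* Remember where we are so that we can come back. */
	if (!getcwd(saved, sizeof(saved)))
		return -1;
	/* First chdir(): let the kernel follow any symlinks in `dir`. */
	if (chdir(dir))
		return -1;
	if (getcwd(out, outlen))
		ret = 0;
	/* Second chdir(): restore the original working directory. */
	if (chdir(saved))
		return -1;
	return ret;
}
```

The ugliness is visible even in this small version: it changes
process-wide state, and if the second chdir() fails the caller is left
in the wrong directory.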

Ciao,
Dscho

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-04 11:50                 ` Johannes Schindelin
@ 2007-12-04 15:59                   ` Linus Torvalds
  2007-12-04 22:08                     ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2007-12-04 15:59 UTC (permalink / raw)
  To: Johannes Schindelin
  Cc: Robin Rosenberg, Jeff King, Junio C Hamano, Anatol Pomozov, git



On Tue, 4 Dec 2007, Johannes Schindelin wrote:
> 
> I do remember the hassles I went through with get_relative_cwd() until I 
> broke down and used chdir() two times (ugly).

It really is a pretty heavy and complex operation in UNIX in general (and 
open to various races too), which is why I'd generally suggest avoiding it 
if you at all can.

The sad(?) part is, it's fairly trivial to do inside the Linux kernel (but 
probably not in other operating systems - it's only because of our 
superior dcache that we could do it). So a special system call would be no 
problem at all. But obviously very unportable indeed.

			Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-04 15:59                   ` Linus Torvalds
@ 2007-12-04 22:08                     ` Jeff King
  2007-12-04 22:52                       ` Linus Torvalds
  0 siblings, 1 reply; 21+ messages in thread
From: Jeff King @ 2007-12-04 22:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Robin Rosenberg, Junio C Hamano,
	Anatol Pomozov, git

On Tue, Dec 04, 2007 at 07:59:43AM -0800, Linus Torvalds wrote:

> > I do remember the hassles I went through with get_relative_cwd() until I 
> > broke down and used chdir() two times (ugly).
> 
> It really is a pretty heavy and complex operation in UNIX in general (and 
> open to various races too), which is why I'd generally suggest avoiding it 
> if you at all can.

It is more expensive, though we will be doing it once per user-supplied
pathspec, so I don't know that it will actually have an impact.

I am concerned that not supporting symlinks will make this feature
unusably annoying for some users. I used to have a home directory that
had a symlink in it, and I frequently ran into these sorts of path
comparison issues ($HOME was /home/peff, so typing ~/repo/file pointed
there, but /home was a symlink to /mnt/data/home, so any routines that
normalize the cwd used /mnt/data/home/repo, and the two never matched
up).
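
To make the mismatch concrete, here is a small sketch (the helper name
same_resolved_path() is invented for this mail, it is not in git). Two
spellings of one location differ as strings, but agree once both have
been passed through realpath():

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 if `a` and `b` resolve to the same
 * physical path, 0 if they resolve to different paths, and -1 if
 * either one cannot be resolved (for example, it does not exist).
 */
static int same_resolved_path(const char *a, const char *b)
{
	char real_a[PATH_MAX], real_b[PATH_MAX];

	assert(a && b);
	if (!realpath(a, real_a) || !realpath(b, real_b))
		return -1;
	return !strcmp(real_a, real_b);
}
```

With my old setup, comparing "/home/peff/repo" and
"/mnt/data/home/repo" this way would have matched, while a plain
strcmp() of the two never could.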

Hrm. Looks like somebody has already helpfully implemented
make_absolute_path, so it would just require calling that on each
argument. Something like this on top of Robin's patch:

diff --git a/setup.c b/setup.c
index 4ee8024..e76c83c 100644
--- a/setup.c
+++ b/setup.c
@@ -58,7 +58,8 @@ const char *prefix_path(const char *prefix, int len, const char *path)
 {
 	const char *orig = path;
 	if (is_absolute_path(path))
-		path = strip_work_tree_path(prefix, len, path);
+		path = strip_work_tree_path(prefix, len,
+				xstrdup(make_absolute_path(path)));
 
 	for (;;) {
 		char c;

-Peff

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-04 22:08                     ` Jeff King
@ 2007-12-04 22:52                       ` Linus Torvalds
  2007-12-06  6:12                         ` Jeff King
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2007-12-04 22:52 UTC (permalink / raw)
  To: Jeff King
  Cc: Johannes Schindelin, Robin Rosenberg, Junio C Hamano,
	Anatol Pomozov, git



On Tue, 4 Dec 2007, Jeff King wrote:
> 
> It is more expensive, though we will be doing it once per user-supplied
> pathspec, so I don't know that it will actually have an impact.

Well, I'm more worried about just bugs, actually.

Doing this right is actually rather hard. For example, our current 
"make_absolute_path()" is simply not very good, and it's almost impossible 
to *make* it very good.

Why? It relies on being able to get the current cwd, which isn't always 
even possible on all systems. What about unreadable directories? What 
about directories so *deep* that the cwd doesn't fit in the 1kB 
allocated for it? Both do happen (people sometimes use executable but 
non-readable directories for security). I'm also almost certain that you 
can confuse it by renaming directories while that thing is running, etc etc.
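
As an illustration of the buffer-size half of that (a sketch only, not
what make_absolute_path() does; get_cwd_dynamic() is a made-up name):
getcwd() fails with ERANGE when the buffer is too small, so a caller
that wants deep directories to work has to retry with a bigger buffer,
and still has to cope with the failures that no buffer size can fix.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: return the cwd in a malloc()ed buffer, starting
 * with `initial` bytes and doubling on ERANGE.  Returns NULL with errno
 * set for every other failure (e.g. EACCES for an unreadable parent).
 * The caller frees the result.
 */
static char *get_cwd_dynamic(size_t initial)
{
	size_t size = initial;

	assert(initial > 0);
	for (;;) {
		char *buf = malloc(size);
		int saved;

		if (!buf)
			return NULL;
		if (getcwd(buf, size))
			return buf;
		/* Save errno before free(), which is allowed to change it. */
		saved = errno;
		free(buf);
		if (saved != ERANGE) {
			errno = saved;
			return NULL;
		}
		size *= 2;	/* too small: try again with more room */
	}
}
```

Note that this only addresses the length problem. The unreadable
directory and rename-race cases still end in a NULL return or a stale
answer.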

IOW, that whole thing is simply a bug waiting to happen. The fact that it 
apparently *always* runs whether needed or not just seems to make it worse 
(ie if we already know our cwd, and the absolute path we have already has 
that as a prefix, just strip it off, don't try to do anything complex, and 
leave the complex and fragile cases for the odd-ball when the simple 
approach doesn't work)
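
Roughly this ordering, as a sketch (hypothetical names, not a patch
against setup.c; it assumes the root is an absolute path without a
trailing slash and is not "/" itself):

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Literal check: return the part of `path` below `root`, or NULL. */
static const char *skip_root(const char *root, const char *path)
{
	size_t n = strlen(root);

	if (strncmp(path, root, n))
		return NULL;
	if (path[n] == '\0')
		return path + n;	/* path is the root itself */
	if (path[n] == '/')
		return path + n + 1;
	return NULL;			/* e.g. "/repo" vs "/repository" */
}

/*
 * Copy the root-relative form of `path` into `out`.
 * Returns 0 if the cheap literal strip worked, 1 if the paths had to be
 * resolved first, and -1 if `path` is outside `root` or cannot be
 * resolved.
 */
static int work_tree_relative(const char *root, const char *path,
			      char *out, size_t outlen)
{
	char real_root[PATH_MAX], real_path[PATH_MAX];
	const char *rel;
	int slow = 0;

	assert(root && path && out);

	/* Simple case first: no system calls at all. */
	rel = skip_root(root, path);
	if (!rel) {
		/* Odd-ball case: only now pay for the fragile operation. */
		if (!realpath(root, real_root) || !realpath(path, real_path))
			return -1;
		rel = skip_root(real_root, real_path);
		if (!rel)
			return -1;
		slow = 1;
	}
	if (strlen(rel) >= outlen)
		return -1;
	strcpy(out, rel);
	return slow;
}
```

The common case never touches the filesystem, so it keeps working even
where the cwd cannot be determined. One limitation of the fallback as
written: realpath() needs the path to exist, so a file that has not
been created yet cannot be resolved this way.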

			Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] Make Git accept absolute path names for files within the work tree
  2007-12-04 22:52                       ` Linus Torvalds
@ 2007-12-06  6:12                         ` Jeff King
  0 siblings, 0 replies; 21+ messages in thread
From: Jeff King @ 2007-12-06  6:12 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Johannes Schindelin, Robin Rosenberg, Junio C Hamano,
	Anatol Pomozov, git

On Tue, Dec 04, 2007 at 02:52:15PM -0800, Linus Torvalds wrote:

> IOW, that whole thing is simply a bug waiting to happen. The fact that it 
> apparently *always* runs whether needed or not just seems to make it worse 
> (ie if we already know our cwd, and the absolute path we have already has 
> that as a prefix, just strip it off, don't try to do anything complex, and 
> leave the complex and fragile cases for the odd-ball when the simple 
> approach doesn't work)

Fair enough. Something like this then? It gets called only as a
last-ditch (though I think the 'return path' should simply be a die
-- what is the point of getting a pathspec that isn't in the repo?).

---
diff --git a/setup.c b/setup.c
index 4ee8024..fbb956e 100644
--- a/setup.c
+++ b/setup.c
@@ -5,13 +5,17 @@ static int inside_git_dir = -1;
 static int inside_work_tree = -1;
 
 static
-const char *strip_work_tree_path(const char *prefix, int len, const char *path)
+const char *strip_work_tree_path(const char *prefix, int len, const char *path,
+		int canonicalized)
 {
 	const char *work_tree = get_git_work_tree();
 	int n = strlen(work_tree);
 
 	if (strncmp(path, work_tree, n))
-		return path;
+		return canonicalized ?
+			path :
+			strip_work_tree_path(prefix, len,
+					xstrdup(make_absolute_path(path)), 1);
 
 	if (!prefix && !path[n])
 		return path + n;
@@ -58,7 +62,7 @@ const char *prefix_path(const char *prefix, int len, const char *path)
 {
 	const char *orig = path;
 	if (is_absolute_path(path))
-		path = strip_work_tree_path(prefix, len, path);
+		path = strip_work_tree_path(prefix, len, path, 0);
 
 	for (;;) {
 		char c;

^ permalink raw reply related	[flat|nested] 21+ messages in thread

Thread overview: 21+ messages
2007-12-03  0:52 Incorrect git-blame result if I use full path to file Anatol Pomozov
2007-12-03  2:19 ` Junio C Hamano
2007-12-03  2:28   ` Jeff King
2007-12-03 17:26   ` Linus Torvalds
2007-12-03 18:09     ` Johannes Schindelin
2007-12-03 18:13       ` Linus Torvalds
2007-12-03 18:19         ` Linus Torvalds
2007-12-03  2:27 ` Jeff King
2007-12-03  2:40   ` Junio C Hamano
2007-12-03  2:49     ` Jeff King
2007-12-03  6:55       ` Robin Rosenberg
2007-12-03 20:53         ` [PATCH] Make Git accept absolute path names for files within the work tree Robin Rosenberg
2007-12-03 23:03           ` Junio C Hamano
2007-12-04  1:43           ` Jeff King
2007-12-04  2:17             ` Johannes Schindelin
2007-12-04  6:42               ` Robin Rosenberg
2007-12-04 11:50                 ` Johannes Schindelin
2007-12-04 15:59                   ` Linus Torvalds
2007-12-04 22:08                     ` Jeff King
2007-12-04 22:52                       ` Linus Torvalds
2007-12-06  6:12                         ` Jeff King
