All of lore.kernel.org
 help / color / mirror / Atom feed
* issue: strange `git diff --numstat` behavior.
@ 2018-10-12 18:23 Sergey Andreenko
  2018-10-13  8:13 ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Sergey Andreenko @ 2018-10-12 18:23 UTC (permalink / raw)
  To: git

git diff –numstat FOLDER1 FOLDER2 works strange when run from a git
controlled folder.

The output shrinks some symbols in the diff file paths.



For example:

Create a folder and call git init, for example: `C:\test`.

              mkdir C:\test

              cd C:\test

              git init

Create two folders with to diff. For example: ` C:\diff`, `C:\base`
and put a file in them with a diff (for example `1.txt` with `1` in
base and `1.txt` with 2 in diff).

              mkdir C:\base

              mkdir C:\diff

              echo. 12>C:\base\1.txt

              echo 13>C:\diff\1.txt

Run git diff:

pushd C:\

git.exe diff --numstat "C:\diff" "C:\base"

Output will be:

1       1       "C:\\diff/1.txt" => "C:\\base/1.txt"

Now move into C:\test and run it again:

pushd C:\test

git.exe diff --numstat "C:\diff" "C:\base"

1       1       "C:\\diff/1.txt" => "C:\\base/1.txt"

Now create a folder in `C:\test`, for example `one`:

mkdir one

cd one

git.exe diff --numstat "C:\diff" "C:\base"



output will be:

              0       1       {iff => ase}/1.txt

So (folder_name_length) symbols were cut from the path (“C:\\d” and “C:\\b”).



Is any way to avoid that? I have checked several git versions and they
all do the same.



Commands to repro:

mkdir C:\test

cd C:\test

git init

mkdir C:\base

mkdir C:\diff

echo. 12>C:\base\1.txt

echo 13>C:\diff\1.txt

pushd C:\

git.exe diff --numstat "C:\diff" "C:\base"

pushd C:\test

git.exe diff --numstat "C:\diff" "C:\base"

mkdir one

cd one

git.exe diff --numstat "C:\diff" "C:\base"



Best Regards,

Sergey Andrenko

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: issue: strange `git diff --numstat` behavior.
  2018-10-12 18:23 issue: strange `git diff --numstat` behavior Sergey Andreenko
@ 2018-10-13  8:13 ` Junio C Hamano
  2018-10-18 18:38   ` [PATCH] diff: don't attempt to strip prefix from absolute Windows paths Johannes Sixt
  0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2018-10-13  8:13 UTC (permalink / raw)
  To: Sergey Andreenko; +Cc: git

Sergey Andreenko <andreenkosa@gmail.com> writes:

> git.exe diff --numstat "C:\diff" "C:\base"
> ...
> output will be:
>
>               0       1       {iff => ase}/1.txt
>
> So (folder_name_length) symbols were cut from the path (“C:\\d” and “C:\\b”).
>
> Is any way to avoid that? I have checked several git versions and they
> all do the same.

Not an attempt to offer a solution (I don't do windows), but just
trying to see what random things we can try, I wonder what you'd get
if you did something like

	$ git diff --numstat //c/diff //c/base


^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH] diff: don't attempt to strip prefix from absolute Windows paths
  2018-10-13  8:13 ` Junio C Hamano
@ 2018-10-18 18:38   ` Johannes Sixt
  2018-10-18 18:49     ` Stefan Beller
  2018-10-19  1:34     ` [PATCH] " Junio C Hamano
  0 siblings, 2 replies; 8+ messages in thread
From: Johannes Sixt @ 2018-10-18 18:38 UTC (permalink / raw)
  To: Sergey Andreenko; +Cc: Junio C Hamano, git

git diff can be invoked with absolute paths. Typically, this triggers
the --no-index case. Then the absolute paths remain in the file names
that are printed in the output.

There is one peculiarity, though: When the command is invoked from a
a sub-directory in a repository, then it is attempted to strip the
sub-directory from the beginning of relative paths. Yet, to detect a
relative path the code just checks for an initial forward slash.
This mistakes a Windows style path like D:/base as a relative path
and the output looks like this, for example:

  D:\test\one>git -P diff --numstat D:\base D:\diff
  1       1       {ase => iff}/1.txt

where the correct output should be

  D:\test\one>git -P diff --numstat D:\base D:\diff
  1       1       D:/{base => diff}/1.txt

If the sub-directory where 'git diff' is invoked is sufficiently deep
that the prefix becomes longer than the path to be printed, then the
subsequent code even accesses the paths out of bounds!

Use is_absolute_path() to detect Windows style absolute paths.

One might wonder whether the check for a directory separator that
is visible in the patch context should be changed from == '/' to
is_dir_sep() or not. It turns out not to be necessary. That code
only ever investigates paths that have undergone pathspec
normalization, after which there are only forward slashes even on
Windows.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 diff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index f0c7557b40..d18eb198f2 100644
--- a/diff.c
+++ b/diff.c
@@ -4267,12 +4267,12 @@ static void diff_fill_oid_info(struct diff_filespec *one)
 static void strip_prefix(int prefix_length, const char **namep, const char **otherp)
 {
 	/* Strip the prefix but do not molest /dev/null and absolute paths */
-	if (*namep && **namep != '/') {
+	if (*namep && !is_absolute_path(*namep)) {
 		*namep += prefix_length;
 		if (**namep == '/')
 			++*namep;
 	}
-	if (*otherp && **otherp != '/') {
+	if (*otherp && !is_absolute_path(*otherp)) {
 		*otherp += prefix_length;
 		if (**otherp == '/')
 			++*otherp;
-- 
2.19.1.406.g1aa3f475f3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths
  2018-10-18 18:38   ` [PATCH] diff: don't attempt to strip prefix from absolute Windows paths Johannes Sixt
@ 2018-10-18 18:49     ` Stefan Beller
  2018-10-18 19:11       ` Johannes Sixt
  2018-10-19  1:34     ` [PATCH] " Junio C Hamano
  1 sibling, 1 reply; 8+ messages in thread
From: Stefan Beller @ 2018-10-18 18:49 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: andreenkosa, Junio C Hamano, git

On Thu, Oct 18, 2018 at 11:38 AM Johannes Sixt <j6t@kdbg.org> wrote:

> There is one peculiarity, though: [...]

The explanation makes sense, and the code looks good.
Do we want to have a test for this niche case?

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths
  2018-10-18 18:49     ` Stefan Beller
@ 2018-10-18 19:11       ` Johannes Sixt
  2018-10-19 16:58         ` [PATCH v2] " Johannes Sixt
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Sixt @ 2018-10-18 19:11 UTC (permalink / raw)
  To: Stefan Beller; +Cc: andreenkosa, Junio C Hamano, git

Am 18.10.18 um 20:49 schrieb Stefan Beller:
> On Thu, Oct 18, 2018 at 11:38 AM Johannes Sixt <j6t@kdbg.org> wrote:
> 
>> There is one peculiarity, though: [...]
> 
> The explanation makes sense, and the code looks good.
> Do we want to have a test for this niche case?
> 

Good point. That would be the following. But give me a day or two to
cross-check on Windows and whether it really catches the breakage.

diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 453e6c35eb..6e0dd6f9e5 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -127,4 +127,14 @@ test_expect_success 'diff --no-index from repo subdir respects config (implicit)
 	test_cmp expect actual.head
 '
 
+test_expect_success 'diff --no-index from repo subdir with absolute paths' '
+	cat <<-EOF >expect &&
+	1	1	$(pwd)/non/git/{a => b}
+	EOF
+	test_expect_code 1 \
+		git -C repo/sub diff --numstat \
+		"$(pwd)/non/git/a" "$(pwd)/non/git/b" >actual &&
+	test_cmp expect actual
+'
+
 test_done

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths
  2018-10-18 18:38   ` [PATCH] diff: don't attempt to strip prefix from absolute Windows paths Johannes Sixt
  2018-10-18 18:49     ` Stefan Beller
@ 2018-10-19  1:34     ` Junio C Hamano
  1 sibling, 0 replies; 8+ messages in thread
From: Junio C Hamano @ 2018-10-19  1:34 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Sergey Andreenko, git

Johannes Sixt <j6t@kdbg.org> writes:

> Use is_absolute_path() to detect Windows style absolute paths.

When cd676a51 ("diff --relative: output paths as relative to the
current subdirectory", 2008-02-12) was done, neither "is_dir_sep()"
nor "has_dos_drive_prefix()" existed---the latter had to wait until
25fe217b ("Windows: Treat Windows style path names.", 2008-03-05),
but we didn't notice that the above code needs to use the Windows
aware is_absolute_path() when we applied the change.

> One might wonder whether the check for a directory separator that
> is visible in the patch context should be changed from == '/' to
> is_dir_sep() or not. It turns out not to be necessary. That code
> only ever investigates paths that have undergone pathspec
> normalization, after which there are only forward slashes even on
> Windows.

Thanks for carefully explaining.

>  static void strip_prefix(int prefix_length, const char **namep, const char **otherp)
>  {
>  	/* Strip the prefix but do not molest /dev/null and absolute paths */
> -	if (*namep && **namep != '/') {
> +	if (*namep && !is_absolute_path(*namep)) {
>  		*namep += prefix_length;
>  		if (**namep == '/')
>  			++*namep;
>  	}
> -	if (*otherp && **otherp != '/') {
> +	if (*otherp && !is_absolute_path(*otherp)) {
>  		*otherp += prefix_length;
>  		if (**otherp == '/')
>  			++*otherp;

When I read the initial report and guessed the root cause without
looking at the code, I didn't expect the problematic area to be this
isolated and the solution to be this simple.

Nicely done.

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [PATCH v2] diff: don't attempt to strip prefix from absolute Windows paths
  2018-10-18 19:11       ` Johannes Sixt
@ 2018-10-19 16:58         ` Johannes Sixt
  2018-10-19 17:04           ` Stefan Beller
  0 siblings, 1 reply; 8+ messages in thread
From: Johannes Sixt @ 2018-10-19 16:58 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Stefan Beller, andreenkosa, git

git diff can be invoked with absolute paths. Typically, this triggers
the --no-index case. Then the absolute paths remain in the file names
that are printed in the output.

There is one peculiarity, though: When the command is invoked from a
a sub-directory in a repository, then it is attempted to strip the
sub-directory from the beginning of relative paths. Yet, to detect a
relative path the code just checks for an initial forward slash.
This mistakes a Windows style path like "D:/base" as a relative path
and the output looks like this, for example:

  D:\dir\test\one>git -P diff --numstat D:\dir\base D:\dir\diff
  1       1       ir/{base => diff}/1.txt

where the correct output should be

  D:\dir\test\one>git -P diff --numstat D:\dir\base D:\dir\diff
  1       1       D:/dir/{base => diff}/1.txt

If the sub-directory where 'git diff' is invoked is sufficiently deep
that the prefix becomes longer than the path to be printed, then the
subsequent code accesses the path out of bounds.

Use is_absolute_path() to detect Windows style absolute paths.

One might wonder whether the check for a directory separator that
is visible in the patch context should be changed from == '/' to
is_dir_sep() or not. It turns out not to be necessary. That code
only ever investigates paths that have undergone pathspec
normalization, after which there are only forward slashes even on
Windows.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
v2:
- added a test that demonstrates the problem on Windows
- changed the example in the commit message to clarify that this is
  about truncated paths, not about failure to detect common prefix

 diff.c                   |  4 ++--
 t/t4053-diff-no-index.sh | 10 ++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index f0c7557b40..d18eb198f2 100644
--- a/diff.c
+++ b/diff.c
@@ -4267,12 +4267,12 @@ static void diff_fill_oid_info(struct diff_filespec *one)
 static void strip_prefix(int prefix_length, const char **namep, const char **otherp)
 {
 	/* Strip the prefix but do not molest /dev/null and absolute paths */
-	if (*namep && **namep != '/') {
+	if (*namep && !is_absolute_path(*namep)) {
 		*namep += prefix_length;
 		if (**namep == '/')
 			++*namep;
 	}
-	if (*otherp && **otherp != '/') {
+	if (*otherp && !is_absolute_path(*otherp)) {
 		*otherp += prefix_length;
 		if (**otherp == '/')
 			++*otherp;
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 453e6c35eb..6e0dd6f9e5 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -127,4 +127,14 @@ test_expect_success 'diff --no-index from repo subdir respects config (implicit)
 	test_cmp expect actual.head
 '
 
+test_expect_success 'diff --no-index from repo subdir with absolute paths' '
+	cat <<-EOF >expect &&
+	1	1	$(pwd)/non/git/{a => b}
+	EOF
+	test_expect_code 1 \
+		git -C repo/sub diff --numstat \
+		"$(pwd)/non/git/a" "$(pwd)/non/git/b" >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.19.1.406.g1aa3f475f3

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH v2] diff: don't attempt to strip prefix from absolute Windows paths
  2018-10-19 16:58         ` [PATCH v2] " Johannes Sixt
@ 2018-10-19 17:04           ` Stefan Beller
  0 siblings, 0 replies; 8+ messages in thread
From: Stefan Beller @ 2018-10-19 17:04 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Junio C Hamano, andreenkosa, git

> +test_expect_success 'diff --no-index from repo subdir with absolute paths' '

I was late looking at the test, and was about to propose to guard it to run only
on Windows (as this test seems of little use in other OS), but after thinking
about it I think we should keep it as-is, as there may be other OS that have
interesting absolute path which I may be unaware of.

Reviewed-by: Stefan Beller <sbeller@google.com>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2018-10-19 17:04 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-12 18:23 issue: strange `git diff --numstat` behavior Sergey Andreenko
2018-10-13  8:13 ` Junio C Hamano
2018-10-18 18:38   ` [PATCH] diff: don't attempt to strip prefix from absolute Windows paths Johannes Sixt
2018-10-18 18:49     ` Stefan Beller
2018-10-18 19:11       ` Johannes Sixt
2018-10-19 16:58         ` [PATCH v2] " Johannes Sixt
2018-10-19 17:04           ` Stefan Beller
2018-10-19  1:34     ` [PATCH] " Junio C Hamano

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.