git diff --numstat FOLDER1 FOLDER2 behaves strangely when run from a git-controlled folder: the output drops some characters from the diff file paths.

For example:

Create a folder and call git init in it, for example `C:\test`:

    mkdir C:\test
    cd C:\test
    git init

Create two folders to diff, for example `C:\diff` and `C:\base`, and put a file with a difference in each of them (for example `1.txt` with `1` in base and `1.txt` with `2` in diff):

    mkdir C:\base
    mkdir C:\diff
    echo. 12>C:\base\1.txt
    echo 13>C:\diff\1.txt

Run git diff:

    pushd C:\
    git.exe diff --numstat "C:\diff" "C:\base"

Output will be:

    1 1 "C:\\diff/1.txt" => "C:\\base/1.txt"

Now move into C:\test and run it again:

    pushd C:\test
    git.exe diff --numstat "C:\diff" "C:\base"

    1 1 "C:\\diff/1.txt" => "C:\\base/1.txt"

Now create a folder in `C:\test`, for example `one`, and run the command from there:

    mkdir one
    cd one
    git.exe diff --numstat "C:\diff" "C:\base"

Output will be:

    0 1 {iff => ase}/1.txt

So (folder_name_length) symbols were cut from the path (“C:\\d” and “C:\\b”).

Is any way to avoid that? I have checked several git versions and they all do the same.

Commands to repro:

    mkdir C:\test
    cd C:\test
    git init
    mkdir C:\base
    mkdir C:\diff
    echo. 12>C:\base\1.txt
    echo 13>C:\diff\1.txt
    pushd C:\
    git.exe diff --numstat "C:\diff" "C:\base"
    pushd C:\test
    git.exe diff --numstat "C:\diff" "C:\base"
    mkdir one
    cd one
    git.exe diff --numstat "C:\diff" "C:\base"

Best Regards,
Sergey Andrenko
Sergey Andreenko <andreenkosa@gmail.com> writes:
> git.exe diff --numstat "C:\diff" "C:\base"
> ...
> output will be:
>
> 0 1 {iff => ase}/1.txt
>
> So (folder_name_length) symbols were cut from the path (“C:\\d” and “C:\\b”).
>
> Is any way to avoid that? I have checked several git versions and they
> all do the same.
Not an attempt to offer a solution (I don't do windows), but just
trying to see what random things we can try, I wonder what you'd get
if you did something like
$ git diff --numstat //c/diff //c/base
git diff can be invoked with absolute paths. Typically, this triggers
the --no-index case. Then the absolute paths remain in the file names
that are printed in the output.

There is one peculiarity, though: When the command is invoked from a
sub-directory in a repository, then it is attempted to strip the
sub-directory from the beginning of relative paths. Yet, to detect a
relative path the code just checks for an initial forward slash. This
mistakes a Windows style path like D:/base as a relative path and the
output looks like this, for example:

  D:\test\one>git -P diff --numstat D:\base D:\diff
  1       1       {ase => iff}/1.txt

where the correct output should be

  D:\test\one>git -P diff --numstat D:\base D:\diff
  1       1       D:/{base => diff}/1.txt

If the sub-directory where 'git diff' is invoked is sufficiently deep
that the prefix becomes longer than the path to be printed, then
the subsequent code even accesses the paths out of bounds!

Use is_absolute_path() to detect Windows style absolute paths.

One might wonder whether the check for a directory separator that
is visible in the patch context should be changed from == '/' to
is_dir_sep() or not. It turns out not to be necessary. That code
only ever investigates paths that have undergone pathspec
normalization, after which there are only forward slashes even on
Windows.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 diff.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index f0c7557b40..d18eb198f2 100644
--- a/diff.c
+++ b/diff.c
@@ -4267,12 +4267,12 @@ static void diff_fill_oid_info(struct diff_filespec *one)
 static void strip_prefix(int prefix_length, const char **namep, const char **otherp)
 {
 	/* Strip the prefix but do not molest /dev/null and absolute paths */
-	if (*namep && **namep != '/') {
+	if (*namep && !is_absolute_path(*namep)) {
 		*namep += prefix_length;
 		if (**namep == '/')
 			++*namep;
 	}
-	if (*otherp && **otherp != '/') {
+	if (*otherp && !is_absolute_path(*otherp)) {
 		*otherp += prefix_length;
 		if (**otherp == '/')
 			++*otherp;
-- 
2.19.1.406.g1aa3f475f3
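
For readers who want to see the truncation in isolation, here is a minimal standalone sketch, not Git's actual code. The helpers sketch_has_drive_prefix() and sketch_is_absolute() are hypothetical stand-ins for Git's has_dos_drive_prefix() and a Windows-aware is_absolute_path(); strip_old() and strip_new() are illustrative single-path versions of the check before and after the patch, and a prefix of "one/" (length 4) is assumed.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for has_dos_drive_prefix(): matches "C:" style prefixes. */
static int sketch_has_drive_prefix(const char *path)
{
	return isalpha((unsigned char)path[0]) && path[1] == ':';
}

/* Hypothetical stand-in for a Windows-aware is_absolute_path(). */
static int sketch_is_absolute(const char *path)
{
	assert(path != NULL);
	return path[0] == '/' || path[0] == '\\' || sketch_has_drive_prefix(path);
}

/* Old behavior: only a leading '/' protects a path from being stripped. */
static const char *strip_old(size_t prefix_length, const char *name)
{
	if (name && *name != '/') {
		name += prefix_length;
		if (*name == '/')
			++name;
	}
	return name;
}

/* Patched behavior: drive-letter paths are also left alone. */
static const char *strip_new(size_t prefix_length, const char *name)
{
	if (name && !sketch_is_absolute(name)) {
		name += prefix_length;
		if (*name == '/')
			++name;
	}
	return name;
}
```

With a prefix length of 4, strip_old() turns "D:/base/1.txt" into "ase/1.txt", which matches the truncated output in the report, while strip_new() returns the path unchanged and still strips "one/" from a genuinely relative path such as "one/1.txt".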
On Thu, Oct 18, 2018 at 11:38 AM Johannes Sixt <j6t@kdbg.org> wrote:
> There is one peculiarity, though: [...]
The explanation makes sense, and the code looks good.
Do we want to have a test for this niche case?
On 18.10.18 at 20:49, Stefan Beller wrote:
> On Thu, Oct 18, 2018 at 11:38 AM Johannes Sixt <j6t@kdbg.org> wrote:
>
>> There is one peculiarity, though: [...]
>
> The explanation makes sense, and the code looks good.
> Do we want to have a test for this niche case?
>

Good point. That would be the following. But give me a day or two to
cross-check on Windows and whether it really catches the breakage.

diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 453e6c35eb..6e0dd6f9e5 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -127,4 +127,14 @@ test_expect_success 'diff --no-index from repo subdir respects config (implicit)
 	test_cmp expect actual.head
 '
 
+test_expect_success 'diff --no-index from repo subdir with absolute paths' '
+	cat <<-EOF >expect &&
+	1	1	$(pwd)/non/git/{a => b}
+	EOF
+	test_expect_code 1 \
+		git -C repo/sub diff --numstat \
+		"$(pwd)/non/git/a" "$(pwd)/non/git/b" >actual &&
+	test_cmp expect actual
+'
+
 test_done
Johannes Sixt <j6t@kdbg.org> writes:

> Use is_absolute_path() to detect Windows style absolute paths.

When cd676a51 ("diff --relative: output paths as relative to the
current subdirectory", 2008-02-12) was done, neither "is_dir_sep()"
nor "has_dos_drive_prefix()" existed---the latter had to wait until
25fe217b ("Windows: Treat Windows style path names.", 2008-03-05),
but we didn't notice that the above code needs to use the Windows
aware is_absolute_path() when we applied the change.

> One might wonder whether the check for a directory separator that
> is visible in the patch context should be changed from == '/' to
> is_dir_sep() or not. It turns out not to be necessary. That code
> only ever investigates paths that have undergone pathspec
> normalization, after which there are only forward slashes even on
> Windows.

Thanks for carefully explaining.

>  static void strip_prefix(int prefix_length, const char **namep, const char **otherp)
>  {
>  	/* Strip the prefix but do not molest /dev/null and absolute paths */
> -	if (*namep && **namep != '/') {
> +	if (*namep && !is_absolute_path(*namep)) {
>  		*namep += prefix_length;
>  		if (**namep == '/')
>  			++*namep;
>  	}
> -	if (*otherp && **otherp != '/') {
> +	if (*otherp && !is_absolute_path(*otherp)) {
>  		*otherp += prefix_length;
>  		if (**otherp == '/')
>  			++*otherp;

When I read the initial report and guessed the root cause without
looking at the code, I didn't expect the problematic area to be this
isolated and the solution to be this simple.

Nicely done.
git diff can be invoked with absolute paths. Typically, this triggers
the --no-index case. Then the absolute paths remain in the file names
that are printed in the output.

There is one peculiarity, though: When the command is invoked from a
sub-directory in a repository, then it is attempted to strip the
sub-directory from the beginning of relative paths. Yet, to detect a
relative path the code just checks for an initial forward slash. This
mistakes a Windows style path like "D:/base" as a relative path and the
output looks like this, for example:

  D:\dir\test\one>git -P diff --numstat D:\dir\base D:\dir\diff
  1       1       ir/{base => diff}/1.txt

where the correct output should be

  D:\dir\test\one>git -P diff --numstat D:\dir\base D:\dir\diff
  1       1       D:/dir/{base => diff}/1.txt

If the sub-directory where 'git diff' is invoked is sufficiently deep
that the prefix becomes longer than the path to be printed, then
the subsequent code accesses the path out of bounds.

Use is_absolute_path() to detect Windows style absolute paths.

One might wonder whether the check for a directory separator that
is visible in the patch context should be changed from == '/' to
is_dir_sep() or not. It turns out not to be necessary. That code
only ever investigates paths that have undergone pathspec
normalization, after which there are only forward slashes even on
Windows.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
---
 v2:
 - added a test that demonstrates the problem on Windows
 - changed the example in the commit message to clarify that this is
   about truncated paths, not about failure to detect common prefix

 diff.c                   |  4 ++--
 t/t4053-diff-no-index.sh | 10 ++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/diff.c b/diff.c
index f0c7557b40..d18eb198f2 100644
--- a/diff.c
+++ b/diff.c
@@ -4267,12 +4267,12 @@ static void diff_fill_oid_info(struct diff_filespec *one)
 static void strip_prefix(int prefix_length, const char **namep, const char **otherp)
 {
 	/* Strip the prefix but do not molest /dev/null and absolute paths */
-	if (*namep && **namep != '/') {
+	if (*namep && !is_absolute_path(*namep)) {
 		*namep += prefix_length;
 		if (**namep == '/')
 			++*namep;
 	}
-	if (*otherp && **otherp != '/') {
+	if (*otherp && !is_absolute_path(*otherp)) {
 		*otherp += prefix_length;
 		if (**otherp == '/')
 			++*otherp;
diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh
index 453e6c35eb..6e0dd6f9e5 100755
--- a/t/t4053-diff-no-index.sh
+++ b/t/t4053-diff-no-index.sh
@@ -127,4 +127,14 @@ test_expect_success 'diff --no-index from repo subdir respects config (implicit)
 	test_cmp expect actual.head
 '
 
+test_expect_success 'diff --no-index from repo subdir with absolute paths' '
+	cat <<-EOF >expect &&
+	1	1	$(pwd)/non/git/{a => b}
+	EOF
+	test_expect_code 1 \
+		git -C repo/sub diff --numstat \
+		"$(pwd)/non/git/a" "$(pwd)/non/git/b" >actual &&
+	test_cmp expect actual
+'
+
 test_done
-- 
2.19.1.406.g1aa3f475f3
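
The out-of-bounds remark in the commit message can be pictured with a small sketch. This is an illustration under assumptions, not code from diff.c: sketch_prefix_overruns() is a hypothetical helper that reports whether advancing a path by prefix_length bytes, as the unpatched strip_prefix() did for a mis-detected Windows path, would step past the string's NUL terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: returns 1 when `path + prefix_length` would point
 * beyond the terminating NUL of `path`, so that reading through the
 * advanced pointer is out of bounds. Landing exactly on the NUL is still
 * within the string's storage.
 */
static int sketch_prefix_overruns(size_t prefix_length, const char *path)
{
	assert(path != NULL);
	return prefix_length > strlen(path);
}
```

For example, a prefix as long as "a/very/deep/subdirectory/" overruns a short path such as "D:/a", whereas a 4-byte prefix applied to "D:/base/1.txt" stays in bounds and merely truncates the name.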
> +test_expect_success 'diff --no-index from repo subdir with absolute paths' '
I was late looking at the test, and was about to propose guarding it to run only
on Windows (as this test seems of little use on other OSes), but after thinking
about it I think we should keep it as-is, as there may be other OSes that have
interesting absolute paths which I may be unaware of.
Reviewed-by: Stefan Beller <sbeller@google.com>