* issue: strange `git diff --numstat` behavior. @ 2018-10-12 18:23 Sergey Andreenko 2018-10-13 8:13 ` Junio C Hamano 0 siblings, 1 reply; 8+ messages in thread From: Sergey Andreenko @ 2018-10-12 18:23 UTC (permalink / raw) To: git git diff –numstat FOLDER1 FOLDER2 works strange when run from a git controlled folder. The output shrinks some symbols in the diff file paths. For example: Create a folder and call git init, for example: `C:\test`. mkdir C:\test cd C:\test git init Create two folders with to diff. For example: ` C:\diff`, `C:\base` and put a file in them with a diff (for example `1.txt` with `1` in base and `1.txt` with 2 in diff). mkdir C:\base mkdir C:\diff echo. 12>C:\base\1.txt echo 13>C:\diff\1.txt Run git diff: pushd C:\ git.exe diff --numstat "C:\diff" "C:\base" Output will be: 1 1 "C:\\diff/1.txt" => "C:\\base/1.txt" Now move into C:\test and run it again: pushd C:\test git.exe diff --numstat "C:\diff" "C:\base" 1 1 "C:\\diff/1.txt" => "C:\\base/1.txt" Now create a folder in `C:\test`, for example `one`: mkdir one cd one git.exe diff --numstat "C:\diff" "C:\base" output will be: 0 1 {iff => ase}/1.txt So (folder_name_length) symbols were cut from the path (“C:\\d” and “C:\\b”). Is any way to avoid that? I have checked several git versions and they all do the same. Commands to repro: mkdir C:\test cd C:\test git init mkdir C:\base mkdir C:\diff echo. 12>C:\base\1.txt echo 13>C:\diff\1.txt pushd C:\ git.exe diff --numstat "C:\diff" "C:\base" pushd C:\test git.exe diff --numstat "C:\diff" "C:\base" mkdir one cd one git.exe diff --numstat "C:\diff" "C:\base" Best Regards, Sergey Andrenko ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: issue: strange `git diff --numstat` behavior. 2018-10-12 18:23 issue: strange `git diff --numstat` behavior Sergey Andreenko @ 2018-10-13 8:13 ` Junio C Hamano 2018-10-18 18:38 ` [PATCH] diff: don't attempt to strip prefix from absolute Windows paths Johannes Sixt 0 siblings, 1 reply; 8+ messages in thread From: Junio C Hamano @ 2018-10-13 8:13 UTC (permalink / raw) To: Sergey Andreenko; +Cc: git Sergey Andreenko <andreenkosa@gmail.com> writes: > git.exe diff --numstat "C:\diff" "C:\base" > ... > output will be: > > 0 1 {iff => ase}/1.txt > > So (folder_name_length) symbols were cut from the path (“C:\\d” and “C:\\b”). > > Is any way to avoid that? I have checked several git versions and they > all do the same. Not an attempt to offer a solution (I don't do windows), but just trying to see what random things we can try, I wonder what you'd get if you did something like $ git diff --numstat //c/diff //c/base ^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] diff: don't attempt to strip prefix from absolute Windows paths 2018-10-13 8:13 ` Junio C Hamano @ 2018-10-18 18:38 ` Johannes Sixt 2018-10-18 18:49 ` Stefan Beller 2018-10-19 1:34 ` [PATCH] " Junio C Hamano 0 siblings, 2 replies; 8+ messages in thread From: Johannes Sixt @ 2018-10-18 18:38 UTC (permalink / raw) To: Sergey Andreenko; +Cc: Junio C Hamano, git git diff can be invoked with absolute paths. Typically, this triggers the --no-index case. Then the absolute paths remain in the file names that are printed in the output. There is one peculiarity, though: When the command is invoked from a a sub-directory in a repository, then it is attempted to strip the sub-directory from the beginning of relative paths. Yet, to detect a relative path the code just checks for an initial forward slash. This mistakes a Windows style path like D:/base as a relative path and the output looks like this, for example: D:\test\one>git -P diff --numstat D:\base D:\diff 1 1 {ase => iff}/1.txt where the correct output should be D:\test\one>git -P diff --numstat D:\base D:\diff 1 1 D:/{base => diff}/1.txt If the sub-directory where 'git diff' is invoked is sufficiently deep that the prefix becomes longer than the path to be printed, then the subsequent code even accesses the paths out of bounds! Use is_absolute_path() to detect Windows style absolute paths. One might wonder whether the check for a directory separator that is visible in the patch context should be changed from == '/' to is_dir_sep() or not. It turns out not to be necessary. That code only ever investigates paths that have undergone pathspec normalization, after which there are only forward slashes even on Windows. Signed-off-by: Johannes Sixt <j6t@kdbg.org> --- diff.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index f0c7557b40..d18eb198f2 100644 --- a/diff.c +++ b/diff.c @@ -4267,12 +4267,12 @@ static void diff_fill_oid_info(struct diff_filespec *one) static void strip_prefix(int prefix_length, const char **namep, const char **otherp) { /* Strip the prefix but do not molest /dev/null and absolute paths */ - if (*namep && **namep != '/') { + if (*namep && !is_absolute_path(*namep)) { *namep += prefix_length; if (**namep == '/') ++*namep; } - if (*otherp && **otherp != '/') { + if (*otherp && !is_absolute_path(*otherp)) { *otherp += prefix_length; if (**otherp == '/') ++*otherp; -- 2.19.1.406.g1aa3f475f3 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths 2018-10-18 18:38 ` [PATCH] diff: don't attempt to strip prefix from absolute Windows paths Johannes Sixt @ 2018-10-18 18:49 ` Stefan Beller 2018-10-18 19:11 ` Johannes Sixt 2018-10-19 1:34 ` [PATCH] " Junio C Hamano 1 sibling, 1 reply; 8+ messages in thread From: Stefan Beller @ 2018-10-18 18:49 UTC (permalink / raw) To: Johannes Sixt; +Cc: andreenkosa, Junio C Hamano, git On Thu, Oct 18, 2018 at 11:38 AM Johannes Sixt <j6t@kdbg.org> wrote: > There is one peculiarity, though: [...] The explanation makes sense, and the code looks good. Do we want to have a test for this niche case? ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths 2018-10-18 18:49 ` Stefan Beller @ 2018-10-18 19:11 ` Johannes Sixt 2018-10-19 16:58 ` [PATCH v2] " Johannes Sixt 0 siblings, 1 reply; 8+ messages in thread From: Johannes Sixt @ 2018-10-18 19:11 UTC (permalink / raw) To: Stefan Beller; +Cc: andreenkosa, Junio C Hamano, git Am 18.10.18 um 20:49 schrieb Stefan Beller: > On Thu, Oct 18, 2018 at 11:38 AM Johannes Sixt <j6t@kdbg.org> wrote: > >> There is one peculiarity, though: [...] > > The explanation makes sense, and the code looks good. > Do we want to have a test for this niche case? > Good point. That would be the following. But give me a day or two to cross-check on Windows and whether it really catches the breakage. diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh index 453e6c35eb..6e0dd6f9e5 100755 --- a/t/t4053-diff-no-index.sh +++ b/t/t4053-diff-no-index.sh @@ -127,4 +127,14 @@ test_expect_success 'diff --no-index from repo subdir respects config (implicit) test_cmp expect actual.head ' +test_expect_success 'diff --no-index from repo subdir with absolute paths' ' + cat <<-EOF >expect && + 1 1 $(pwd)/non/git/{a => b} + EOF + test_expect_code 1 \ + git -C repo/sub diff --numstat \ + "$(pwd)/non/git/a" "$(pwd)/non/git/b" >actual && + test_cmp expect actual +' + test_done ^ permalink raw reply related [flat|nested] 8+ messages in thread
* [PATCH v2] diff: don't attempt to strip prefix from absolute Windows paths 2018-10-18 19:11 ` Johannes Sixt @ 2018-10-19 16:58 ` Johannes Sixt 2018-10-19 17:04 ` Stefan Beller 0 siblings, 1 reply; 8+ messages in thread From: Johannes Sixt @ 2018-10-19 16:58 UTC (permalink / raw) To: Junio C Hamano; +Cc: Stefan Beller, andreenkosa, git git diff can be invoked with absolute paths. Typically, this triggers the --no-index case. Then the absolute paths remain in the file names that are printed in the output. There is one peculiarity, though: When the command is invoked from a a sub-directory in a repository, then it is attempted to strip the sub-directory from the beginning of relative paths. Yet, to detect a relative path the code just checks for an initial forward slash. This mistakes a Windows style path like "D:/base" as a relative path and the output looks like this, for example: D:\dir\test\one>git -P diff --numstat D:\dir\base D:\dir\diff 1 1 ir/{base => diff}/1.txt where the correct output should be D:\dir\test\one>git -P diff --numstat D:\dir\base D:\dir\diff 1 1 D:/dir/{base => diff}/1.txt If the sub-directory where 'git diff' is invoked is sufficiently deep that the prefix becomes longer than the path to be printed, then the subsequent code accesses the path out of bounds. Use is_absolute_path() to detect Windows style absolute paths. One might wonder whether the check for a directory separator that is visible in the patch context should be changed from == '/' to is_dir_sep() or not. It turns out not to be necessary. That code only ever investigates paths that have undergone pathspec normalization, after which there are only forward slashes even on Windows. Signed-off-by: Johannes Sixt <j6t@kdbg.org> --- v2: - added a test that demonstrates the problem on Windows - changed the example in the commit message to clarify that this is about truncated paths, not about failure to detect common prefix diff.c | 4 ++-- t/t4053-diff-no-index.sh | 10 ++++++++++ 2 files changed, 12 insertions(+), 2 deletions(-) diff --git a/diff.c b/diff.c index f0c7557b40..d18eb198f2 100644 --- a/diff.c +++ b/diff.c @@ -4267,12 +4267,12 @@ static void diff_fill_oid_info(struct diff_filespec *one) static void strip_prefix(int prefix_length, const char **namep, const char **otherp) { /* Strip the prefix but do not molest /dev/null and absolute paths */ - if (*namep && **namep != '/') { + if (*namep && !is_absolute_path(*namep)) { *namep += prefix_length; if (**namep == '/') ++*namep; } - if (*otherp && **otherp != '/') { + if (*otherp && !is_absolute_path(*otherp)) { *otherp += prefix_length; if (**otherp == '/') ++*otherp; diff --git a/t/t4053-diff-no-index.sh b/t/t4053-diff-no-index.sh index 453e6c35eb..6e0dd6f9e5 100755 --- a/t/t4053-diff-no-index.sh +++ b/t/t4053-diff-no-index.sh @@ -127,4 +127,14 @@ test_expect_success 'diff --no-index from repo subdir respects config (implicit) test_cmp expect actual.head ' +test_expect_success 'diff --no-index from repo subdir with absolute paths' ' + cat <<-EOF >expect && + 1 1 $(pwd)/non/git/{a => b} + EOF + test_expect_code 1 \ + git -C repo/sub diff --numstat \ + "$(pwd)/non/git/a" "$(pwd)/non/git/b" >actual && + test_cmp expect actual +' + test_done -- 2.19.1.406.g1aa3f475f3 ^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: [PATCH v2] diff: don't attempt to strip prefix from absolute Windows paths 2018-10-19 16:58 ` [PATCH v2] " Johannes Sixt @ 2018-10-19 17:04 ` Stefan Beller 0 siblings, 0 replies; 8+ messages in thread From: Stefan Beller @ 2018-10-19 17:04 UTC (permalink / raw) To: Johannes Sixt; +Cc: Junio C Hamano, andreenkosa, git > +test_expect_success 'diff --no-index from repo subdir with absolute paths' ' I was late looking at the test, and was about to propose to guard it to run only on Windows (as this test seems of little use in other OS), but after thinking about it I think we should keep it as-is, as there may be other OS that have interesting absolute path which I may be unaware of. Reviewed-by: Stefan Beller <sbeller@google.com> ^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [PATCH] diff: don't attempt to strip prefix from absolute Windows paths 2018-10-18 18:38 ` [PATCH] diff: don't attempt to strip prefix from absolute Windows paths Johannes Sixt 2018-10-18 18:49 ` Stefan Beller @ 2018-10-19 1:34 ` Junio C Hamano 1 sibling, 0 replies; 8+ messages in thread From: Junio C Hamano @ 2018-10-19 1:34 UTC (permalink / raw) To: Johannes Sixt; +Cc: Sergey Andreenko, git Johannes Sixt <j6t@kdbg.org> writes: > Use is_absolute_path() to detect Windows style absolute paths. When cd676a51 ("diff --relative: output paths as relative to the current subdirectory", 2008-02-12) was done, neither "is_dir_sep()" nor "has_dos_drive_prefix()" existed---the latter had to wait until 25fe217b ("Windows: Treat Windows style path names.", 2008-03-05), but we didn't notice that the above code needs to use the Windows aware is_absolute_path() when we applied the change. > One might wonder whether the check for a directory separator that > is visible in the patch context should be changed from == '/' to > is_dir_sep() or not. It turns out not to be necessary. That code > only ever investigates paths that have undergone pathspec > normalization, after which there are only forward slashes even on > Windows. Thanks for carefully explaining. > static void strip_prefix(int prefix_length, const char **namep, const char **otherp) > { > /* Strip the prefix but do not molest /dev/null and absolute paths */ > - if (*namep && **namep != '/') { > + if (*namep && !is_absolute_path(*namep)) { > *namep += prefix_length; > if (**namep == '/') > ++*namep; > } > - if (*otherp && **otherp != '/') { > + if (*otherp && !is_absolute_path(*otherp)) { > *otherp += prefix_length; > if (**otherp == '/') > ++*otherp; When I read the initial report and guessed the root cause without looking at the code, I didn't expect the problematic area to be this isolated and the solution to be this simple. Nicely done. ^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2018-10-19 17:04 UTC | newest] Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2018-10-12 18:23 issue: strange `git diff --numstat` behavior Sergey Andreenko 2018-10-13 8:13 ` Junio C Hamano 2018-10-18 18:38 ` [PATCH] diff: don't attempt to strip prefix from absolute Windows paths Johannes Sixt 2018-10-18 18:49 ` Stefan Beller 2018-10-18 19:11 ` Johannes Sixt 2018-10-19 16:58 ` [PATCH v2] " Johannes Sixt 2018-10-19 17:04 ` Stefan Beller 2018-10-19 1:34 ` [PATCH] " Junio C Hamano
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.