* core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` @ 2017-08-01 18:24 Anthony Sottile 2017-08-01 20:47 ` Torsten Bögershausen 0 siblings, 1 reply; 36+ messages in thread From: Anthony Sottile @ 2017-08-01 18:24 UTC (permalink / raw) To: Git Mailing List Here's my minimal reproduction -- it's slightly far-fetched in that it involves *committing crlf* and then using `autocrlf=true` (commit lf, check out crlf). ``` #!/bin/bash set -ex rm -rf foo git init foo cd foo # Commit crlf into repository git config --local core.autocrlf false python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n")' git add foo git commit -m "Initial commit with crlf" # Change whitespace mode to autocrlf, "commit lf, checkout crlf" git config --local core.autocrlf true python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n\r\n\r\n\r\n")' # Generate a patch, check it out, restore it git diff --ignore-submodules --binary --no-color --no-ext-diff > patch python3 -c 'print(open("patch", "rb").read())' git checkout -- . # I expect this to succeed, it fails git apply patch ``` And here's the output: ``` + rm -rf foo + git init foo Initialized empty Git repository in /tmp/foo/.git/ + cd foo + git config --local core.autocrlf false + python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n")' + git add foo + git commit -m 'Initial commit with crlf' [master (root-commit) 02d3246] Initial commit with crlf 1 file changed, 2 insertions(+) create mode 100644 foo + git config --local core.autocrlf true + python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n\r\n\r\n\r\n")' + git diff --ignore-submodules --binary --no-color --no-ext-diff + python3 -c 'print(open("patch", "rb").read())' b'diff --git a/foo b/foo\nindex bd956ea..fbf7936 100644\n--- a/foo\n+++ b/foo\n@@ -1,2 +1,5 @@\n 1\r\n 2\r\n+\r\n+\r\n+\r\n' + git checkout -- . + git apply patch patch:8: trailing whitespace. patch:9: trailing whitespace. patch:10: trailing whitespace. error: patch failed: foo:1 error: foo: patch does not apply ``` I also tried with `git apply --ignore-whitespace`, but this causes the line endings of the existing contents to be changed to *lf* (there may be two bugs here?) Thanks, Anthony ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` 2017-08-01 18:24 core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Anthony Sottile @ 2017-08-01 20:47 ` Torsten Bögershausen 2017-08-01 20:58 ` Anthony Sottile 0 siblings, 1 reply; 36+ messages in thread From: Torsten Bögershausen @ 2017-08-01 20:47 UTC (permalink / raw) To: Anthony Sottile, Git Mailing List On 08/01/2017 08:24 PM, Anthony Sottile wrote: > Here's my minimal reproduction -- it's slightly far-fetched in that it > involves*committing crlf* and > then using `autocrlf=true` (commit lf, check out crlf). > > ``` > #!/bin/bash > set -ex > > rm -rf foo > git init foo > cd foo > > # Commit crlf into repository > git config --local core.autocrlf false > python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n")' > git add foo > git commit -m "Initial commit with crlf" > > # Change whitespace mode to autocrlf, "commit lf, checkout crlf" > git config --local core.autocrlf true > python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n\r\n\r\n\r\n")' > > # Generate a patch, check it out, restore it > git diff --ignore-submodules --binary --no-color --no-ext-diff > patch > python3 -c 'print(open("patch", "rb").read())' > git checkout -- . > # I expect this to succeed, it fails > git apply patch > ``` > > And here's the output: > > ``` > + rm -rf foo > + git init foo > Initialized empty Git repository in/tmp/foo/.git/ > + cd foo > + git config --local core.autocrlf false > + python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n")' > + git add foo > + git commit -m 'Initial commit with crlf' > [master (root-commit) 02d3246] Initial commit with crlf > 1 file changed, 2 insertions(+) > create mode 100644 foo > + git config --local core.autocrlf true > + python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n\r\n\r\n\r\n")' > + git diff --ignore-submodules --binary --no-color --no-ext-diff > + python3 -c 'print(open("patch", "rb").read())' > b'diff --git a/foo b/foo\nindex bd956ea..fbf7936 100644\n--- > a/foo\n+++ b/foo\n@@ -1,2 +1,5 @@\n 1\r\n 2\r\n+\r\n+\r\n+\r\n' > + git checkout -- . > + git apply patch > patch:8: trailing whitespace. > > patch:9: trailing whitespace. > > patch:10: trailing whitespace. > > error: patch failed: foo:1 > error: foo: patch does not apply > ``` > > I also tried with `git apply --ignore-whitespace`, but this causes the > line endings of the existing contents to be changed to*lf* (there may > be two bugs here?) > > Thanks, > > Anthony I can reproduce you test case here. The line git apply patch would succeed, if you temporally (for the runtime of the apply command) set core.autocrlf to false: git -c core.autocrlf=false apply patch So this seems to be a bug (in a corner case ?): Typically repos which had been commited with CRLF should be normalized, which means that the CRLF in the repo are replaced by LF. So you test script is a corner case, for which Git has not been designed, It seems as if "git apply" gets things wrong here. Especially, as the '\r' is not a whitespace as a white space. but part of the line ending. So in my understanding the "--ignore-whitespace" option shouldn't affect the line endings at all. Fixes are possible, does anyone have a clue, why the '\r' is handled like this in apply ? And out of interest: is this a real life problem ? ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` 2017-08-01 20:47 ` Torsten Bögershausen @ 2017-08-01 20:58 ` Anthony Sottile 2017-08-02 15:44 ` Torsten Bögershausen 0 siblings, 1 reply; 36+ messages in thread From: Anthony Sottile @ 2017-08-01 20:58 UTC (permalink / raw) To: Torsten Bögershausen; +Cc: Git Mailing List Here's where I'm hitting the problem described: https://github.com/pre-commit/pre-commit/issues/570 Note that `git -c core.autocrlf=false` apply patch fixes this situation, but breaks others. Here's a testcase where `git -c core.autocrlf=false apply patch` causes a *different* patch failure: ``` #!/bin/bash set -ex rm -rf foo git init foo cd foo git config --local core.autocrlf true # Commit lf into repository python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n")' git add foo python3 -c 'open("foo", "wb").write(b"3\n4\n")' # Generate a patch, check it out, restore it git diff --ignore-submodules --binary --no-color --no-ext-diff > patch python3 -c 'print(open("patch", "rb").read())' git checkout -- . git -c core.autocrlf=false apply patch ``` output: ``` + rm -rf foo + git init foo Initialized empty Git repository in /tmp/foo/.git/ + cd foo + git config --local core.autocrlf true + python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n")' + git add foo + python3 -c 'open("foo", "wb").write(b"3\n4\n")' + git diff --ignore-submodules --binary --no-color --no-ext-diff warning: LF will be replaced by CRLF in foo. The file will have its original line endings in your working directory. + python3 -c 'print(open("patch", "rb").read())' b'diff --git a/foo b/foo\nindex 1191247..b944734 100644\n--- a/foo\n+++ b/foo\n@@ -1,2 +1,2 @@\n-1\n-2\n+3\n+4\n' + git checkout -- . + git -c core.autocrlf=false apply patch error: patch failed: foo:1 ``` My current workaround is: - try `git apply patch` - try `git -c core.autocrlf=false apply patch` which seems to work pretty well. Anthony On Tue, Aug 1, 2017 at 1:47 PM, Torsten Bögershausen <tboegi@web.de> wrote: > > > On 08/01/2017 08:24 PM, Anthony Sottile wrote: >> >> Here's my minimal reproduction -- it's slightly far-fetched in that it >> involves*committing crlf* and >> >> then using `autocrlf=true` (commit lf, check out crlf). >> >> ``` >> #!/bin/bash >> set -ex >> >> rm -rf foo >> git init foo >> cd foo >> >> # Commit crlf into repository >> git config --local core.autocrlf false >> python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n")' >> git add foo >> git commit -m "Initial commit with crlf" >> >> # Change whitespace mode to autocrlf, "commit lf, checkout crlf" >> git config --local core.autocrlf true >> python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n\r\n\r\n\r\n")' >> >> # Generate a patch, check it out, restore it >> git diff --ignore-submodules --binary --no-color --no-ext-diff > patch >> python3 -c 'print(open("patch", "rb").read())' >> git checkout -- . >> # I expect this to succeed, it fails >> git apply patch >> ``` >> >> And here's the output: >> >> ``` >> + rm -rf foo >> + git init foo >> Initialized empty Git repository in/tmp/foo/.git/ >> + cd foo >> + git config --local core.autocrlf false >> + python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n")' >> + git add foo >> + git commit -m 'Initial commit with crlf' >> [master (root-commit) 02d3246] Initial commit with crlf >> 1 file changed, 2 insertions(+) >> create mode 100644 foo >> + git config --local core.autocrlf true >> + python3 -c 'open("foo", "wb").write(b"1\r\n2\r\n\r\n\r\n\r\n")' >> + git diff --ignore-submodules --binary --no-color --no-ext-diff >> + python3 -c 'print(open("patch", "rb").read())' >> b'diff --git a/foo b/foo\nindex bd956ea..fbf7936 100644\n--- >> a/foo\n+++ b/foo\n@@ -1,2 +1,5 @@\n 1\r\n 2\r\n+\r\n+\r\n+\r\n' >> + git checkout -- . >> + git apply patch >> patch:8: trailing whitespace. >> >> patch:9: trailing whitespace. >> >> patch:10: trailing whitespace. >> >> error: patch failed: foo:1 >> error: foo: patch does not apply >> ``` >> >> I also tried with `git apply --ignore-whitespace`, but this causes the >> line endings of the existing contents to be changed to*lf* (there may >> be two bugs here?) >> >> Thanks, >> >> Anthony > > > > I can reproduce you test case here. > > The line > > git apply patch > > would succeed, if you temporally (for the runtime of the apply command) set > core.autocrlf to false: > > git -c core.autocrlf=false apply patch > > So this seems to be a bug (in a corner case ?): > > Typically repos which had been commited with CRLF should be normalized, > which means that the CRLF in the repo are replaced by LF. > So you test script is a corner case, for which Git has not been designed, > It seems as if "git apply" gets things wrong here. > Especially, as the '\r' is not a whitespace as a white space. but part > of the line ending. > So in my understanding the "--ignore-whitespace" option shouldn't affect > the line endings at all. > > Fixes are possible, does anyone have a clue, why the '\r' is handled > like this in apply ? > > And out of interest: is this a real life problem ? > > > > > ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` 2017-08-01 20:58 ` Anthony Sottile @ 2017-08-02 15:44 ` Torsten Bögershausen 2017-08-02 20:42 ` [PATCH v1 1/1] correct apply for files commited with CRLF tboegi 2017-08-02 20:58 ` core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Junio C Hamano 0 siblings, 2 replies; 36+ messages in thread From: Torsten Bögershausen @ 2017-08-02 15:44 UTC (permalink / raw) To: Anthony Sottile; +Cc: Git Mailing List On 08/01/2017 10:58 PM, Anthony Sottile wrote: > Here's where I'm hitting the problem described: > https://github.com/pre-commit/pre-commit/issues/570 > > Note that `git -c core.autocrlf=false` apply patch fixes this > situation, but breaks others. [] I wasn't thinking of that - and thanks for the detailed report. I seems as there are 3 things to be done: - Make a workaround in your scripts/tools. It seems as if that is already done. - Fix Git. My very first investigation shows that a patch like this could fix the problem: diff --git a/apply.c b/apply.c index f2d599141d..66b8387360 100644 --- a/apply.c +++ b/apply.c @@ -2278,6 +2278,8 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); + if (would_convert_to_git(&the_index, path)) + read_cache(); convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); return 0; default: ---------------- I will probably do some more investigations if this is the core problem, so it will take some days or weeks. - Convince people to normalize their repos and convert all CRLF in the repo into LF. This may take even longer. ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v1 1/1] correct apply for files commited with CRLF 2017-08-02 15:44 ` Torsten Bögershausen @ 2017-08-02 20:42 ` tboegi 2017-08-02 21:13 ` Junio C Hamano 2017-08-02 20:58 ` core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Junio C Hamano 1 sibling, 1 reply; 36+ messages in thread From: tboegi @ 2017-08-02 20:42 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> git apply does not find the source lines when files have CRLF in the index and core.autocrlf is true: These files should not get the CRLF converted to LF. Because cmd_apply() does not load the index, this does not work, CRLF are converted into LF and apply fails. Fix this in the spirit of commit a08feb8ef0b6, "correct blame for files commited with CRLF" by loading the index. As an optimization, skip read_cache() when no conversion is specified for this path. Reported-by: Anthony Sottile <asottile@umich.edu> Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- apply.c | 2 ++ t/t0020-crlf.sh | 12 ++++++++++++ 2 files changed, 14 insertions(+) diff --git a/apply.c b/apply.c index f2d599141d..66b8387360 100644 --- a/apply.c +++ b/apply.c @@ -2278,6 +2278,8 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); + if (would_convert_to_git(&the_index, path)) + read_cache(); convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); return 0; default: diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh index 71350e0657..6611f8a6f6 100755 --- a/t/t0020-crlf.sh +++ b/t/t0020-crlf.sh @@ -386,4 +386,16 @@ test_expect_success 'New CRLF file gets LF in repo' ' test_cmp alllf alllf2 ' +test_expect_success 'CRLF in repo, apply with autocrlf=true' ' + git config core.autocrlf false && + printf "1\r\n2\r\n" >crlf && + git add crlf && + git commit -m "commit crlf with crlf" && + git config core.autocrlf true && + printf "1\r\n2\r\n\r\n\r\n\r\n" >crlf && + git diff >patch && + git checkout -- . && + git apply patch +' + test_done -- 2.13.2.533.ge0aaa1b ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v1 1/1] correct apply for files commited with CRLF 2017-08-02 20:42 ` [PATCH v1 1/1] correct apply for files commited with CRLF tboegi @ 2017-08-02 21:13 ` Junio C Hamano 2017-08-04 17:31 ` Torsten Bögershausen 0 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2017-08-02 21:13 UTC (permalink / raw) To: tboegi; +Cc: git, asottile tboegi@web.de writes: > From: Torsten Bögershausen <tboegi@web.de> > > git apply does not find the source lines when files have CRLF in the index > and core.autocrlf is true: > These files should not get the CRLF converted to LF. Because cmd_apply() > does not load the index, this does not work, CRLF are converted into LF > and apply fails. > > Fix this in the spirit of commit a08feb8ef0b6, > "correct blame for files commited with CRLF" by loading the index. > > As an optimization, skip read_cache() when no conversion is specified > for this path. What happens when the input to apply is a patch to create a new file and the patch uses CRLF line endings? In such a case, shouldn't core.autocrlf=true cause the patch to be converted to LF and then succeed in applying? An extra read_cache() would not help as there must be no existing index entry for the path in such a case. If you did "rm .git/index" after you did the "git checkout -- ." in your test patch, "git apply" ought to succeed, as it is working as a substitute for "patch -p1" and should not be consulting the index at all. I cannot shake this feeling that it is convert_to_git() that needs to be fixed so that it does not need to refer to the current contents in the index. Why does it even need to look at the index? If it is for the "safe CRLF" thing (which I would think is misguided), shouldn't it be checking the original file in the working tree, not the index? After all, that is what we are patching, not the contents stored in the index. > > Reported-by: Anthony Sottile <asottile@umich.edu> > Signed-off-by: Torsten Bögershausen <tboegi@web.de> > --- > apply.c | 2 ++ > t/t0020-crlf.sh | 12 ++++++++++++ > 2 files changed, 14 insertions(+) > > diff --git a/apply.c b/apply.c > index f2d599141d..66b8387360 100644 > --- a/apply.c > +++ b/apply.c > @@ -2278,6 +2278,8 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) > case S_IFREG: > if (strbuf_read_file(buf, path, st->st_size) != st->st_size) > return error(_("unable to open or read %s"), path); > + if (would_convert_to_git(&the_index, path)) > + read_cache(); > convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); > return 0; > default: > diff --git a/t/t0020-crlf.sh b/t/t0020-crlf.sh > index 71350e0657..6611f8a6f6 100755 > --- a/t/t0020-crlf.sh > +++ b/t/t0020-crlf.sh > @@ -386,4 +386,16 @@ test_expect_success 'New CRLF file gets LF in repo' ' > test_cmp alllf alllf2 > ' > > +test_expect_success 'CRLF in repo, apply with autocrlf=true' ' > + git config core.autocrlf false && > + printf "1\r\n2\r\n" >crlf && > + git add crlf && > + git commit -m "commit crlf with crlf" && > + git config core.autocrlf true && > + printf "1\r\n2\r\n\r\n\r\n\r\n" >crlf && > + git diff >patch && > + git checkout -- . && > + git apply patch > +' > + > test_done ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 1/1] correct apply for files commited with CRLF 2017-08-02 21:13 ` Junio C Hamano @ 2017-08-04 17:31 ` Torsten Bögershausen 2017-08-04 17:57 ` Junio C Hamano 0 siblings, 1 reply; 36+ messages in thread From: Torsten Bögershausen @ 2017-08-04 17:31 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, asottile On 08/02/2017 11:13 PM, Junio C Hamano wrote: > tboegi@web.de writes: > >> From: Torsten Bögershausen <tboegi@web.de> >> >> git apply does not find the source lines when files have CRLF in the index >> and core.autocrlf is true: >> These files should not get the CRLF converted to LF. Because cmd_apply() >> does not load the index, this does not work, CRLF are converted into LF >> and apply fails. >> >> Fix this in the spirit of commit a08feb8ef0b6, >> "correct blame for files commited with CRLF" by loading the index. >> >> As an optimization, skip read_cache() when no conversion is specified >> for this path. > What happens when the input to apply is a patch to create a new file > and the patch uses CRLF line endings? In such a case, shouldn't > core.autocrlf=true cause the patch to be converted to LF and then > succeed in applying? An extra read_cache() would not help as there > must be no existing index entry for the path in such a case. > > If you did "rm .git/index" after you did the "git checkout -- ." in > your test patch, "git apply" ought to succeed, as it is working as > a substitute for "patch -p1" and should not be consulting the index > at all. > > I cannot shake this feeling that it is convert_to_git() that needs > to be fixed so that it does not need to refer to the current > contents in the index. Why does it even need to look at the index? > If it is for the "safe CRLF" thing (which I would think is misguided), > shouldn't it be checking the original file in the working tree, not > the index? After all, that is what we are patching, not the contents > stored in the index. Thanks for the review. My understanding is that there are different things possible with `git apply`: working on files in the index ("cached") files and "worktree" files. If we work on files in the index, the line endings must match for the patch to apply, the patch/apply machinery is not forgiving mismatching line endings. At least not by default. There is one exception: Use "blank-at-eol" to declare the CR of the CRLF as a whitespace, and then tell git apply to ignore white space: `git apply --ignore-whitespace` My feeling is that this is not self-explaining, but this is a different story. Back to the fix, the read_old_data() from below works on the working tree, yes, but after convert_to_git(). And that is why we need the index, to fix this very case. appy.c: static int load_patch_target(struct apply_state *state, struct strbuf *buf, const struct cache_entry *ce, struct stat *st, const char *name, unsigned expected_mode) { if (state->cached || state->check_index) { if (read_file_or_gitlink(ce, buf)) return error(_("failed to read %s"), name); } else if (name) { if (S_ISGITLINK(expected_mode)) { if (ce) return read_file_or_gitlink(ce, buf); else return SUBMODULE_PATCH_WITHOUT_INDEX; } else if (has_symlink_leading_path(name, strlen(name))) { return error(_("reading from '%s' beyond a symbolic link"), name); } else { if (read_old_data(st, name, buf)) return error(_("failed to read %s"), name); } } return 0; } ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 1/1] correct apply for files commited with CRLF 2017-08-04 17:31 ` Torsten Bögershausen @ 2017-08-04 17:57 ` Junio C Hamano 2017-08-04 19:26 ` Junio C Hamano 0 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2017-08-04 17:57 UTC (permalink / raw) To: Torsten Bögershausen; +Cc: git, asottile Torsten Bögershausen <tboegi@web.de> writes: > Back to the fix, the read_old_data() from below works on the working tree, > yes, but after convert_to_git(). > And that is why we need the index, to fix this very case. But "git apply" (without "--cached" or "--index") is to work on the working tree file only. The target file of the patch that is going to be modified does not even have to be in the index, i.e. it is perfectly fine to: (1) create a file F, commit, then modify that file; (2) take two patches out of that repository you did (1); (3) go to a Git repository that does not yet know about file F; (4) run "git apply" to process the first patch you took in (2), which will create file F; then (5) run "git apply" to process the second patch on top, which will modify file F. Step (4) will probably not cause too much issue, as the only thing we make sure is "Because the patch creates F, F does not exist in the directory", which would be the case. Now, when trying to apply the second patch at step (5), we may need to adjust for the broken line-ending convention, but you do not have any entry in the index for F to rely on. That is why I am saying convert_to_git() that checks the index content is a wrong thing to use in this codepath. It is OK to use it for "git add" and friends, as the intent of the (I'd say "broken") "safe CRLF" mechanism is to take a hint from the "previous" state to see if CRLF in the "new" content should be munged, and the "previous" in the context of running "git add" _is_ what is in the index. The "previous" in the context of running "git apply" (which does not look at the index) is _not_ what is in the index, on the other hand. Instead of "struct index_state *", convert_to_git() needs to be fixed to take something else that can be queried for the "previous" version to use as a hint, if "safe CRLF" wants to work correctly. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v1 1/1] correct apply for files commited with CRLF 2017-08-04 17:57 ` Junio C Hamano @ 2017-08-04 19:26 ` Junio C Hamano 0 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-04 19:26 UTC (permalink / raw) To: Torsten Bögershausen; +Cc: git, asottile Junio C Hamano <gitster@pobox.com> writes: > The "previous" in the context of running "git apply" (which does not > look at the index) is _not_ what is in the index, on the other hand. > Instead of "struct index_state *", convert_to_git() needs to be > fixed to take something else that can be queried for the "previous" > version to use as a hint, if "safe CRLF" wants to work correctly. I left it unsaid by mistake, but I think the right thing to use as the "previous" to take hint from in the context of "git apply" is what is in the working tree, i.e. the result of applying patch in step (4) to create a file F in the sample scenario. While applying patch in step (5), convert_to_git() should "imagine" adding the file F currently in the working tree (i.e. the result of step (4)) to the index---if the resulting object in the index would have CR, then the safe CRLF logic should refrain from doing CRLF->LF conversion. And it should do so without actually adding neither the preimage or the postimage to the index, of course. When we are doing "git apply --index", then we _require_ that the indexed contents and what is in the working tree matches before applying the patch, so it is perfectly fine to let convert_to_git() to look at the current index---that is the "previous" one we want to take hint from while using the safe CRLF logic. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` 2017-08-02 15:44 ` Torsten Bögershausen 2017-08-02 20:42 ` [PATCH v1 1/1] correct apply for files commited with CRLF tboegi @ 2017-08-02 20:58 ` Junio C Hamano 2017-08-12 5:45 ` Torsten Bögershausen ` (4 more replies) 1 sibling, 5 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-02 20:58 UTC (permalink / raw) To: Torsten Bögershausen; +Cc: Anthony Sottile, Git Mailing List Torsten Bögershausen <tboegi@web.de> writes: > My very first investigation shows that a patch like this could fix > the problem: > > diff --git a/apply.c b/apply.c > index f2d599141d..66b8387360 100644 > --- a/apply.c > +++ b/apply.c > @@ -2278,6 +2278,8 @@ static int read_old_data(struct stat *st, const > char *path, struct strbuf *buf) > case S_IFREG: > if (strbuf_read_file(buf, path, st->st_size) != > st->st_size) > return error(_("unable to open or read %s"), path); > + if (would_convert_to_git(&the_index, path)) > + read_cache(); > convert_to_git(&the_index, path, buf->buf, buf->len, > buf, 0); Sorry, but it is unclear why this is a "fix" to me. We may not even be doing "git apply --index" or "git apply --cached" that care about the state recorded in the index, but just the plain vanilla "git apply", which could even be outside any repository. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` 2017-08-02 20:58 ` core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Junio C Hamano @ 2017-08-12 5:45 ` Torsten Bögershausen 2017-08-12 5:53 ` Torsten Bögershausen 2017-08-12 14:56 ` [PATCH/RFC] convert: Add SAFE_CRLF_KEEP_CRLF tboegi ` (3 subsequent siblings) 4 siblings, 1 reply; 36+ messages in thread From: Torsten Bögershausen @ 2017-08-12 5:45 UTC (permalink / raw) To: tboegi; +Cc: Anthony Sottile, Git Mailing List Test from mutt ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` 2017-08-12 5:45 ` Torsten Bögershausen @ 2017-08-12 5:53 ` Torsten Bögershausen 0 siblings, 0 replies; 36+ messages in thread From: Torsten Bögershausen @ 2017-08-12 5:53 UTC (permalink / raw) To: Anthony Sottile, Git Mailing List >I left it unsaid by mistake, but I think the right thing to use as >the "previous" to take hint from in the context of "git apply" is >what is in the working tree, i.e. the result of applying patch in >step (4) to create a file F in the sample scenario. >While applying patch in step (5), convert_to_git() should "imagine" >adding the file F currently in the working tree (i.e. the result of >step (4)) to the index---if the resulting object in the index would >have CR, then the safe CRLF logic should refrain from doing CRLF->LF >conversion. And it should do so without actually adding neither the >preimage or the postimage to the index, of course. (Sorry for the test mail) What I wanted to say is that this long explanation convinced me to write a patch and send it out the next days, >When we are doing "git apply --index", then we _require_ that the >indexed contents and what is in the working tree matches before >applying the patch, so it is perfectly fine to let convert_to_git() >to look at the current index---that is the "previous" one we want to >take hint from while using the safe CRLF logic. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH/RFC] convert: Add SAFE_CRLF_KEEP_CRLF 2017-08-02 20:58 ` core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Junio C Hamano 2017-08-12 5:45 ` Torsten Bögershausen @ 2017-08-12 14:56 ` tboegi 2017-08-12 14:56 ` [PATCH/RFC] File commited with CRLF should roundtrip diff and apply tboegi ` (2 subsequent siblings) 4 siblings, 0 replies; 36+ messages in thread From: tboegi @ 2017-08-12 14:56 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When convert_to_git() is called, the caller may want to keep CRLF to be kept as CRLF (and not converted into LF). This will be used in the next commit, when apply works with files that have CRLF and patches are applied onto these files. Add the new value "SAFE_CRLF_KEEP_CRLF" to safe_crlf. Prepare convert_to_git() to be able to run the clean filter, skip the CRLF conversion and run the ident filter. Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- convert.c | 10 ++++++---- convert.h | 3 ++- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/convert.c b/convert.c index deaf0ba7b3..040123b4fe 100644 --- a/convert.c +++ b/convert.c @@ -1104,10 +1104,12 @@ int convert_to_git(const struct index_state *istate, src = dst->buf; len = dst->len; } - ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); - if (ret && dst) { - src = dst->buf; - len = dst->len; + if (checksafe != SAFE_CRLF_KEEP_CRLF) { + ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); + if (ret && dst) { + src = dst->buf; + len = dst->len; + } } return ret | ident_to_git(path, src, len, dst, ca.ident); } diff --git a/convert.h b/convert.h index cecf59d1aa..cabd5ed6dd 100644 --- a/convert.h +++ b/convert.h @@ -10,7 +10,8 @@ enum safe_crlf { SAFE_CRLF_FALSE = 0, SAFE_CRLF_FAIL = 1, SAFE_CRLF_WARN = 2, - SAFE_CRLF_RENORMALIZE = 3 + SAFE_CRLF_RENORMALIZE = 3, + SAFE_CRLF_KEEP_CRLF = 4 }; extern enum safe_crlf safe_crlf; -- 2.14.1.145.gb3622a4ee9 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH/RFC] File commited with CRLF should roundtrip diff and apply 2017-08-02 20:58 ` core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Junio C Hamano 2017-08-12 5:45 ` Torsten Bögershausen 2017-08-12 14:56 ` [PATCH/RFC] convert: Add SAFE_CRLF_KEEP_CRLF tboegi @ 2017-08-12 14:56 ` tboegi 2017-08-14 17:37 ` Junio C Hamano 2017-08-13 8:51 ` [PATCH/RFC 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi 2017-08-13 8:51 ` [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply tboegi 4 siblings, 1 reply; 36+ messages in thread From: tboegi @ 2017-08-12 14:56 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When a file had been commited with CRLF and core.autocrlf is true, the following does not roundtrip, `git apply` fails: printf "Added line\r\n" >>file && git diff >patch && git checkout -- . && git apply patch Before applying the patch, the file from working tree is converted into the index format (clean filter, CRLF conversion, ...) Here, when commited with CRLF, the line endings should not be converted. Analyze the patch if there is any context line with CRLF, or if any line with CRLF is to be removed. If yes, the new flag has_crlf is set in "struct patch", and two things will happen: - read_old_data() will not convert CRLF into LF by calling convert_to_git(..., SAFE_CRLF_KEEP_CRLF); - The WS_CR_AT_EOL bit is set in the "white space rule", CRLF are no longer treated as white space. Thanks to Junio C Hamano, his input became the base for t4140. Reported-by: Anthony Sottile <asottile@umich.edu> Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- apply.c | 37 ++++++++++++++++++++++++++++--------- apply.h | 4 ++++ t/t4140-apply-CRLF.sh | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 78 insertions(+), 9 deletions(-) create mode 100755 t/t4140-apply-CRLF.sh diff --git a/apply.c b/apply.c index f2d599141d..63455cd65f 100644 --- a/apply.c +++ b/apply.c @@ -220,6 +220,7 @@ struct patch { unsigned int recount:1; unsigned int conflicted_threeway:1; unsigned int direct_to_threeway:1; + unsigned int has_crlf:1; struct fragment *fragments; char *result; size_t resultsize; @@ -1662,6 +1663,17 @@ static void check_whitespace(struct apply_state *state, record_ws_error(state, result, line + 1, len - 2, state->linenr); } +/* Check if the patch has context lines with CRLF or + the patch wants to remove lines with CRLF */ +static void check_old_for_crlf(struct patch *patch, const char *line, int len) +{ + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { + patch->ws_rule |= WS_CR_AT_EOL; + patch->has_crlf = 1; + } +} + + /* * Parse a unified diff. Note that this really needs to parse each * fragment separately, since the only way to know the difference @@ -1712,11 +1724,13 @@ static int parse_fragment(struct apply_state *state, if (!deleted && !added) leading++; trailing++; + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action == correct_ws_error) check_whitespace(state, line, len, patch->ws_rule); break; case '-': + check_old_for_crlf(patch, line, len); if (state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -2268,8 +2282,10 @@ static void show_stats(struct apply_state *state, struct patch *patch) add, pluses, del, minuses); } -static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) +static int read_old_data(struct stat *st, const char *path, struct strbuf *buf, int flags) { + enum safe_crlf safe_crlf = flags & APPLY_FLAGS_CR_AT_EOL ? + SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_FALSE; switch (st->st_mode & S_IFMT) { case S_IFLNK: if (strbuf_readlink(buf, path, st->st_size) < 0) @@ -2278,7 +2294,7 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); - convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); + convert_to_git(&the_index, path, buf->buf, buf->len, buf, safe_crlf); return 0; default: return -1; @@ -3385,7 +3401,8 @@ static int load_patch_target(struct apply_state *state, const struct cache_entry *ce, struct stat *st, const char *name, - unsigned expected_mode) + unsigned expected_mode, + int flags) { if (state->cached || state->check_index) { if (read_file_or_gitlink(ce, buf)) @@ -3399,7 +3416,7 @@ static int load_patch_target(struct apply_state *state, } else if (has_symlink_leading_path(name, strlen(name))) { return error(_("reading from '%s' beyond a symbolic link"), name); } else { - if (read_old_data(st, name, buf)) + if (read_old_data(st, name, buf, flags)) return error(_("failed to read %s"), name); } } @@ -3423,6 +3440,7 @@ static int load_preimage(struct apply_state *state, char *img; struct patch *previous; int status; + int flags = patch->has_crlf ? APPLY_FLAGS_CR_AT_EOL : 0; previous = previous_patch(state, patch, &status); if (status) @@ -3433,7 +3451,7 @@ static int load_preimage(struct apply_state *state, strbuf_add(&buf, previous->result, previous->resultsize); } else { status = load_patch_target(state, &buf, ce, st, - patch->old_name, patch->old_mode); + patch->old_name, patch->old_mode, flags); if (status < 0) return status; else if (status == SUBMODULE_PATCH_WITHOUT_INDEX) { @@ -3493,7 +3511,8 @@ static int three_way_merge(struct image *image, */ static int load_current(struct apply_state *state, struct image *image, - struct patch *patch) + struct patch *patch, + int flags) { struct strbuf buf = STRBUF_INIT; int status, pos; @@ -3520,7 +3539,7 @@ static int load_current(struct apply_state *state, if (verify_index_match(ce, &st)) return error(_("%s: does not match index"), name); - status = load_patch_target(state, &buf, ce, &st, name, mode); + status = load_patch_target(state, &buf, ce, &st, name, mode, flags); if (status < 0) return status; else if (status) @@ -3571,7 +3590,8 @@ static int try_threeway(struct apply_state *state, /* our_oid is ours */ if (patch->is_new) { - if (load_current(state, &tmp_image, patch)) + int flags = 0; + if (load_current(state, &tmp_image, patch, flags)) return error(_("cannot read the current contents of '%s'"), patch->new_name); } else { @@ -3617,7 +3637,6 @@ static int apply_data(struct apply_state *state, struct patch *patch, struct stat *st, const struct cache_entry *ce) { struct image image; - if (load_preimage(state, &image, patch, st, ce) < 0) return -1; diff --git a/apply.h b/apply.h index b3d6783d55..192140280f 100644 --- a/apply.h +++ b/apply.h @@ -33,9 +33,13 @@ enum apply_verbosity { #define APPLY_SYMLINK_GOES_AWAY 01 #define APPLY_SYMLINK_IN_RESULT 02 + +#define APPLY_FLAGS_CR_AT_EOL (1<<0) + struct apply_state { const char *prefix; int prefix_length; + int flags; /* These are lock_file related */ struct lock_file *lock_file; diff --git a/t/t4140-apply-CRLF.sh b/t/t4140-apply-CRLF.sh new file mode 100755 index 0000000000..fd528daabd --- /dev/null +++ b/t/t4140-apply-CRLF.sh @@ -0,0 +1,46 @@ +#!/bin/sh + +test_description='CRLF diff and apply' + +. ./test-lib.sh + +test_expect_success setup ' + mkdir upstream && + ( + cd upstream && + git init && + git config core.autocrlf false && + >.gitignore && + git add . && + git commit -m gitignore && + printf "F1\r\n" >FileCRLF && + git add . && + git commit -m initial && + git diff HEAD^1 HEAD -- >../patch1 + ) && + git config core.autocrlf true +' + +test_expect_success 'apply patches Replace lines' ' + ( + cd upstream && + printf "F11\r\nF12\r\n" >FileCRLF && + git diff >../patch2Replace + ) && + git apply patch1 && + git apply patch2Replace +' + +test_expect_success 'apply patches Add lines' ' + ( + cd upstream && + git checkout FileCRLF && + printf "F2\r\n" >>FileCRLF && + git diff >../patch2Add + ) && + rm -f FileCRLF && + git apply patch1 && + git apply patch2Add +' + +test_done -- 2.14.1.145.gb3622a4ee9 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH/RFC] File commited with CRLF should roundtrip diff and apply 2017-08-12 14:56 ` [PATCH/RFC] File commited with CRLF should roundtrip diff and apply tboegi @ 2017-08-14 17:37 ` Junio C Hamano 2017-08-17 6:06 ` [PATCH v2 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi ` (5 more replies) 0 siblings, 6 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-14 17:37 UTC (permalink / raw) To: tboegi; +Cc: git, asottile tboegi@web.de writes: > From: Torsten Bögershausen <tboegi@web.de> > > When a file had been commited with CRLF and core.autocrlf is true, > the following does not roundtrip, `git apply` fails: > > printf "Added line\r\n" >>file && > git diff >patch && > git checkout -- . && > git apply patch Should this tweak be in effect when we are paying attention to the contents of the index? Or does it matter only when we are doing "git apply" without either "--index" or "--cache"? The fact that the new flag that is passed from load_preimage() down to read_old_data(), the latter of which is only about the "behave as a better 'patch -p1', ignoring the index" mode, hints that this should not affect anything when we are paying attention to the index, but then calling the load_patch_target() helper with a very generic CR_AT_EOL flag looks a bit misleading, by making it appear as if the caller is telling all three cases to do the funny CR business. I wonder if we instead want to pass the "patch" structure down the callchain and have _only_ read_old_data() pay attention to the has_crlf bit in it---that way, it becomes more obvious what is going on. I also notice that read_old_data() still passes &the_index to convert_to_git(). Can we fix convert_to_git() further to allow NULL as the istate parameter when we tell it not to look at the index ( I am reading your passing SAFE_CRLF_KEEP and SAFE_CRLF_FALSE as a way for the caller to tell that the caller knows what it wants already and does not want covnert_to_git() to look at the index)? With or without CR in the patch, I do not see any reason for the codepath from read_old_data() down to the convert API to look at the index, ever, as read_old_data() is called only when (!state->cached && !state->check_index), i.e. when we are working as a better 'patch -p1' without even having to know that we are in a Git repository, by load_patch_target(). > diff --git a/apply.c b/apply.c > index f2d599141d..63455cd65f 100644 > --- a/apply.c > +++ b/apply.c > @@ -220,6 +220,7 @@ struct patch { > unsigned int recount:1; > unsigned int conflicted_threeway:1; > unsigned int direct_to_threeway:1; > + unsigned int has_crlf:1; > struct fragment *fragments; > char *result; > size_t resultsize; > @@ -1662,6 +1663,17 @@ static void check_whitespace(struct apply_state *state, > record_ws_error(state, result, line + 1, len - 2, state->linenr); > } > > +/* Check if the patch has context lines with CRLF or > + the patch wants to remove lines with CRLF */ Style. > +static void check_old_for_crlf(struct patch *patch, const char *line, int len) > +{ > + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { > + patch->ws_rule |= WS_CR_AT_EOL; > + patch->has_crlf = 1; > + } > +} I am wondering where the discrepancy between the names (old crlf here, as opposed to "struct patch" that says "has_crlf" implying it does not care which side between old and new has one) comes from. Is the intention that you only care about what is expected of the preimage? > @@ -1712,11 +1724,13 @@ static int parse_fragment(struct apply_state *state, > if (!deleted && !added) > leading++; > trailing++; > + check_old_for_crlf(patch, line, len); > if (!state->apply_in_reverse && > state->ws_error_action == correct_ws_error) > check_whitespace(state, line, len, patch->ws_rule); > break; If check_old is about "we care about the lines that are supposed to match what currently exist in the target file to be patched", then the call to it also must pay attention to apply_in_reverse. What appears on '+' lines in a patch will become the "expected old contents the patched target is supposed to have" when we are running "apply -R". > case '-': > + check_old_for_crlf(patch, line, len); > if (state->apply_in_reverse && > state->ws_error_action != nowarn_ws_error) > check_whitespace(state, line, len, patch->ws_rule); Likewise. Thanks for working on this; unlike the previous one I commented, I think this one I can sort of see how this is an approach to fix it. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v2 1/2] convert: Add SAFE_CRLF_KEEP_CRLF 2017-08-14 17:37 ` Junio C Hamano @ 2017-08-17 6:06 ` tboegi 2017-08-17 6:06 ` [PATCH v2 2/2] File commited with CRLF should roundtrip diff and apply tboegi ` (4 subsequent siblings) 5 siblings, 0 replies; 36+ messages in thread From: tboegi @ 2017-08-17 6:06 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When convert_to_git() is called, the caller may want to keep CRLF to be kept as CRLF (and not converted into LF). This will be used in the next commit, when apply works with files that have CRLF and patches are applied onto these files. Add the new value "SAFE_CRLF_KEEP_CRLF" to safe_crlf. Prepare convert_to_git() to be able to run the clean filter, skip the CRLF conversion and run the ident filter. Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- convert.c | 10 ++++++---- convert.h | 3 ++- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/convert.c b/convert.c index deaf0ba7b3..040123b4fe 100644 --- a/convert.c +++ b/convert.c @@ -1104,10 +1104,12 @@ int convert_to_git(const struct index_state *istate, src = dst->buf; len = dst->len; } - ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); - if (ret && dst) { - src = dst->buf; - len = dst->len; + if (checksafe != SAFE_CRLF_KEEP_CRLF) { + ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); + if (ret && dst) { + src = dst->buf; + len = dst->len; + } } return ret | ident_to_git(path, src, len, dst, ca.ident); } diff --git a/convert.h b/convert.h index cecf59d1aa..cabd5ed6dd 100644 --- a/convert.h +++ b/convert.h @@ -10,7 +10,8 @@ enum safe_crlf { SAFE_CRLF_FALSE = 0, SAFE_CRLF_FAIL = 1, SAFE_CRLF_WARN = 2, - SAFE_CRLF_RENORMALIZE = 3 + SAFE_CRLF_RENORMALIZE = 3, + SAFE_CRLF_KEEP_CRLF = 4 }; extern enum safe_crlf safe_crlf; -- 2.14.1.145.gb3622a4ee9 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v2 2/2] File commited with CRLF should roundtrip diff and apply 2017-08-14 17:37 ` Junio C Hamano 2017-08-17 6:06 ` [PATCH v2 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi @ 2017-08-17 6:06 ` tboegi 2017-08-17 6:37 ` Junio C Hamano 2017-08-17 21:43 ` [PATCH v3 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi ` (3 subsequent siblings) 5 siblings, 1 reply; 36+ messages in thread From: tboegi @ 2017-08-17 6:06 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When a file had been commited with CRLF but now .gitattributes say "* text=auto" (or core.autocrlf is true), the following does not roundtrip, `git apply` fails: printf "Added line\r\n" >>file && git diff >patch && git checkout -- . && git apply patch Before applying the patch, the file from working tree is converted into the index format (clean filter, CRLF conversion, ...) Here, when commited with CRLF, the line endings should not be converted. Note that `git apply --index` or `git apply --cache` doesn't call convert_to_git() because the source material is already in index format. Analyze the patch if there is a) any context line with CRLF, or b) if any line with CRLF is to be removed. In this case the patch file `patch` has mixed line endings, for a) it looks like this (ignore the $ at the begin of the line): $ diff --git a/one b/one $ index 533790e..c30dea8 100644 $ --- a/one $ +++ b/one $ @@ -1 +1,2 @@ $ a\r $ +b\r And for b) it looks like this: $ diff --git a/one b/one $ index 533790e..485540d 100644 $ --- a/one $ +++ b/one $ @@ -1 +1 @@ $ -a\r $ +b\r If `git apply` detects that the patch itself has CRLF, (look at the line " a\r" or "-a\r" above), the new flag has_crlf is set in "struct patch" and two things will happen: - read_old_data() will not convert CRLF into LF by calling convert_to_git(..., SAFE_CRLF_KEEP_CRLF); - The WS_CR_AT_EOL bit is set in the "white space rule", CRLF are no longer treated as white space. Thanks to Junio C Hamano, his input became the base for the changes in t4124. One test case is split up into 3: - Detect the " a\r" line in the patch - Detect the "-a\r" line in the patch - Use LF in repo and CLRF in the worktree. (*) * This one proves that convert_to_git(&the_index,...) still needs to pass the &index, otherwise Git will crash. Reported-by: Anthony Sottile <asottile@umich.edu> Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- apply.c | 28 +++++++++++++++++++++++----- t/t4124-apply-ws-rule.sh | 33 +++++++++++++++++++++++++++------ 2 files changed, 50 insertions(+), 11 deletions(-) diff --git a/apply.c b/apply.c index f2d599141d..bebb176099 100644 --- a/apply.c +++ b/apply.c @@ -220,6 +220,7 @@ struct patch { unsigned int recount:1; unsigned int conflicted_threeway:1; unsigned int direct_to_threeway:1; + unsigned int has_crlf:1; struct fragment *fragments; char *result; size_t resultsize; @@ -1662,6 +1663,17 @@ static void check_whitespace(struct apply_state *state, record_ws_error(state, result, line + 1, len - 2, state->linenr); } +/* Check if the patch has context lines with CRLF or + the patch wants to remove lines with CRLF */ +static void check_old_for_crlf(struct patch *patch, const char *line, int len) +{ + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { + patch->ws_rule |= WS_CR_AT_EOL; + patch->has_crlf = 1; + } +} + + /* * Parse a unified diff. Note that this really needs to parse each * fragment separately, since the only way to know the difference @@ -1712,11 +1724,13 @@ static int parse_fragment(struct apply_state *state, if (!deleted && !added) leading++; trailing++; + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action == correct_ws_error) check_whitespace(state, line, len, patch->ws_rule); break; case '-': + check_old_for_crlf(patch, line, len); if (state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -2268,8 +2282,11 @@ static void show_stats(struct apply_state *state, struct patch *patch) add, pluses, del, minuses); } -static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) +static int read_old_data(struct stat *st, struct patch *patch, + const char *path, struct strbuf *buf) { + enum safe_crlf safe_crlf = patch->has_crlf ? + SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_FALSE; switch (st->st_mode & S_IFMT) { case S_IFLNK: if (strbuf_readlink(buf, path, st->st_size) < 0) @@ -2278,7 +2295,7 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); - convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); + convert_to_git(&the_index, path, buf->buf, buf->len, buf, safe_crlf); return 0; default: return -1; @@ -3384,6 +3401,7 @@ static int load_patch_target(struct apply_state *state, struct strbuf *buf, const struct cache_entry *ce, struct stat *st, + struct patch *patch, const char *name, unsigned expected_mode) { @@ -3399,7 +3417,7 @@ static int load_patch_target(struct apply_state *state, } else if (has_symlink_leading_path(name, strlen(name))) { return error(_("reading from '%s' beyond a symbolic link"), name); } else { - if (read_old_data(st, name, buf)) + if (read_old_data(st, patch, name, buf)) return error(_("failed to read %s"), name); } } @@ -3432,7 +3450,7 @@ static int load_preimage(struct apply_state *state, /* We have a patched copy in memory; use that. */ strbuf_add(&buf, previous->result, previous->resultsize); } else { - status = load_patch_target(state, &buf, ce, st, + status = load_patch_target(state, &buf, ce, st, patch, patch->old_name, patch->old_mode); if (status < 0) return status; @@ -3520,7 +3538,7 @@ static int load_current(struct apply_state *state, if (verify_index_match(ce, &st)) return error(_("%s: does not match index"), name); - status = load_patch_target(state, &buf, ce, &st, name, mode); + status = load_patch_target(state, &buf, ce, &st, patch, name, mode); if (status < 0) return status; else if (status) diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh index d350065f25..4fc27c51f7 100755 --- a/t/t4124-apply-ws-rule.sh +++ b/t/t4124-apply-ws-rule.sh @@ -467,21 +467,42 @@ test_expect_success 'same, but with CR-LF line endings && cr-at-eol set' ' test_cmp one expect ' -test_expect_success 'same, but with CR-LF line endings && cr-at-eol unset' ' +test_expect_success 'CR-LF line endings && add line && text=auto' ' git config --unset core.whitespace && printf "a\r\n" >one && + cp one save-one && + git add one && printf "b\r\n" >>one && - printf "c\r\n" >>one && + cp one expect && + git diff -- one >patch && + mv save-one one && + echo "one text=auto" >.gitattributes && + git apply patch && + test_cmp one expect +' + +test_expect_success 'CR-LF line endings && change line && text=auto' ' + printf "a\r\n" >one && cp one save-one && - printf " \r\n" >>one && git add one && + printf "b\r\n" >one && cp one expect && - printf "d\r\n" >>one && git diff -- one >patch && mv save-one one && - echo d >>expect && + echo "one text=auto" >.gitattributes && + git apply patch && + test_cmp one expect +' - git apply --ignore-space-change --whitespace=fix patch && +test_expect_success 'LF in repo, CRLF in worktree && change line && text=auto' ' + printf "a\n" >one && + git add one && + printf "b\r\n" >one && + git diff -- one >patch && + printf "a\r\n" >one && + echo "one text=auto" >.gitattributes && + git -c core.eol=CRLF apply patch && + printf "b\r\n" >expect && test_cmp one expect ' -- 2.14.1.145.gb3622a4ee9 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v2 2/2] File commited with CRLF should roundtrip diff and apply 2017-08-17 6:06 ` [PATCH v2 2/2] File commited with CRLF should roundtrip diff and apply tboegi @ 2017-08-17 6:37 ` Junio C Hamano 0 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-17 6:37 UTC (permalink / raw) To: tboegi; +Cc: git, asottile tboegi@web.de writes: > Analyze the patch if there is a) any context line with CRLF, > or b) if any line with CRLF is to be removed. > Thanks to Junio C Hamano, his input became the base for the changes in t4124. > One test case is split up into 3: > - Detect the " a\r" line in the patch > - Detect the "-a\r" line in the patch > - Use LF in repo and CLRF in the worktree. (*) > > * This one proves that convert_to_git(&the_index,...) still needs to pass > the &index, otherwise Git will crash. I do not understand why you think it proves anything like that. Forget about "in repo" when you think about "git apply" without "--index/--cache". There is *NO* role the index should play in that mode. "Otherwise Git will crash" is true, because convert_to_git() tries to dereference the istate it is given to see if there is CR in the blob that happens to be in the index at the path. But step back and *think*. It only proves that convert_to_git() you have and/or the way read_old_data() you updated calls it after applying these two patches are still broken. The "blob that happens to be in the index at the path" does *NOT* have anything to do with the file in the working tree that you are patching. Such an index entry may not exist (and the code would see that there is 0 CRs and 0 LFs---so what?), or it may have a blob full of CRLF, or it may have a blob full of CR not LF, or any random thing that has NO relation to the file you are patching. Why should that random blob (which may not even exist---we are discussing "git apply" without "--index/--cache" here) affect the outcome? If it does not affect the outcome, why should convert_to_git() even look at it? It shouldn't be calling down to "has_cr_in_index(istate, path)" that is called from crlf_to_git() in the first place. The check for CR was done in the caller of convert_to_git(), i.e. read_old_data(), long before convert_to_git() is called, and the caller knows if it is (or it is not) dealing with data that needs crlf conversion at that point, based on the contents of the file being patched (which, again, does not have any relation to the blob that may or may not happen to be in the index). Your updated caller is already telling convert_to_git() codepath when it needs CRLF and it refrains from peeking in the index with SAFE_CRLF_KEEP_CRLF flag. The bug still in the code after applying these two patches is that the other choice, i.e. SAFE_CRLF_FALSE, that is passed from read_old_data() to convert_to_git() does *not* prevent the latter from peeking into the in-core index. And that is why "Otherwise Git will crash". ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v3 1/2] convert: Add SAFE_CRLF_KEEP_CRLF 2017-08-14 17:37 ` Junio C Hamano 2017-08-17 6:06 ` [PATCH v2 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi 2017-08-17 6:06 ` [PATCH v2 2/2] File commited with CRLF should roundtrip diff and apply tboegi @ 2017-08-17 21:43 ` tboegi 2017-08-17 21:43 ` [PATCH v3 2/2] File commited with CRLF should roundtrip diff and apply tboegi ` (2 subsequent siblings) 5 siblings, 0 replies; 36+ messages in thread From: tboegi @ 2017-08-17 21:43 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When convert_to_git() is called, the caller may want to keep CRLF to be kept as CRLF (and not converted into LF). This will be used in the next commit, when apply works with files that have CRLF and patches are applied onto these files. Add the new value "SAFE_CRLF_KEEP_CRLF" to safe_crlf. Prepare convert_to_git() to be able to run the clean filter, skip the CRLF conversion and run the ident filter. Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- convert.c | 10 ++++++---- convert.h | 3 ++- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/convert.c b/convert.c index deaf0ba7b3..040123b4fe 100644 --- a/convert.c +++ b/convert.c @@ -1104,10 +1104,12 @@ int convert_to_git(const struct index_state *istate, src = dst->buf; len = dst->len; } - ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); - if (ret && dst) { - src = dst->buf; - len = dst->len; + if (checksafe != SAFE_CRLF_KEEP_CRLF) { + ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); + if (ret && dst) { + src = dst->buf; + len = dst->len; + } } return ret | ident_to_git(path, src, len, dst, ca.ident); } diff --git a/convert.h b/convert.h index cecf59d1aa..cabd5ed6dd 100644 --- a/convert.h +++ b/convert.h @@ -10,7 +10,8 @@ enum safe_crlf { SAFE_CRLF_FALSE = 0, SAFE_CRLF_FAIL = 1, SAFE_CRLF_WARN = 2, - SAFE_CRLF_RENORMALIZE = 3 + SAFE_CRLF_RENORMALIZE = 3, + SAFE_CRLF_KEEP_CRLF = 4 }; extern enum safe_crlf safe_crlf; -- 2.14.1.145.gb3622a4ee9 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v3 2/2] File commited with CRLF should roundtrip diff and apply 2017-08-14 17:37 ` Junio C Hamano ` (2 preceding siblings ...) 2017-08-17 21:43 ` [PATCH v3 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi @ 2017-08-17 21:43 ` tboegi 2017-08-17 22:29 ` Junio C Hamano 2017-08-17 22:35 ` Junio C Hamano 2017-08-19 11:27 ` [PATCH v4 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi 2017-08-19 11:28 ` [PATCH v4 2/2] File commited with CRLF should roundtrip diff and apply tboegi 5 siblings, 2 replies; 36+ messages in thread From: tboegi @ 2017-08-17 21:43 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When a file had been commited with CRLF but now .gitattributes say "* text=auto" (or core.autocrlf is true), the following does not roundtrip, `git apply` fails: printf "Added line\r\n" >>file && git diff >patch && git checkout -- . && git apply patch Before applying the patch, the file from working tree is converted into the index format (clean filter, CRLF conversion, ...) Here, when commited with CRLF, the line endings should not be converted. Note that `git apply --index` or `git apply --cache` doesn't call convert_to_git() because the source material is already in index format. Analyze the patch if there is a) any context line with CRLF, or b) if any line with CRLF is to be removed. In this case the patch file `patch` has mixed line endings, for a) it looks like this (ignore the * at the begin of the line): * diff --git a/one b/one * index 533790e..c30dea8 100644 * --- a/one * +++ b/one * @@ -1 +1,2 @@ * a\r * +b\r And for b) it looks like this: * diff --git a/one b/one * index 533790e..485540d 100644 * --- a/one * +++ b/one * @@ -1 +1 @@ * -a\r * +b\r If `git apply` detects that the patch itself has CRLF, (look at the line " a\r" or "-a\r" above), the new flag crlf_in_old is set in "struct patch" and two things will happen: - read_old_data() will not convert CRLF into LF by calling convert_to_git(..., SAFE_CRLF_KEEP_CRLF); - The WS_CR_AT_EOL bit is set in the "white space rule", CRLF are no longer treated as white space. Thanks to Junio C Hamano, his input became the base for the changes in t4124. One test case is split up into 3: - Detect the " a\r" line in the patch - Detect the "-a\r" line in the patch - Use LF in repo and CLRF in the worktree. Reported-by: Anthony Sottile <asottile@umich.edu> Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- Changes since v2: - Manually integrated all code changes from Junio (Thanks, I hope that I didn't miss something) - Having examples of "git diff" in the commit message confuses "git apply", so that all examples for git diff have a '*' at the beginnig of the line (V2 used '$' which is typically an example for a shell script) - The official version to apply the CRLF-rules without having an index is SAFE_CRLF_RENORMALIZE, that is already working today. - Now we have convert_to_git(NULL, ..., safe_crlf) with enum safe_crlf safe_crlf = patch->crlf_in_old ? SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_RENORMALIZE; apply.c | 40 +++++++++++++++++++++++++++++++++++----- t/t4124-apply-ws-rule.sh | 33 +++++++++++++++++++++++++++------ 2 files changed, 62 insertions(+), 11 deletions(-) diff --git a/apply.c b/apply.c index f2d599141d..691f47c783 100644 --- a/apply.c +++ b/apply.c @@ -220,6 +220,7 @@ struct patch { unsigned int recount:1; unsigned int conflicted_threeway:1; unsigned int direct_to_threeway:1; + unsigned int crlf_in_old:1; struct fragment *fragments; char *result; size_t resultsize; @@ -1662,6 +1663,19 @@ static void check_whitespace(struct apply_state *state, record_ws_error(state, result, line + 1, len - 2, state->linenr); } +/* + * Check if the patch has context lines with CRLF or + * the patch wants to remove lines with CRLF. + */ +static void check_old_for_crlf(struct patch *patch, const char *line, int len) +{ + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { + patch->ws_rule |= WS_CR_AT_EOL; + patch->crlf_in_old = 1; + } +} + + /* * Parse a unified diff. Note that this really needs to parse each * fragment separately, since the only way to know the difference @@ -1712,11 +1726,15 @@ static int parse_fragment(struct apply_state *state, if (!deleted && !added) leading++; trailing++; + if (!state->apply_in_reverse) + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action == correct_ws_error) check_whitespace(state, line, len, patch->ws_rule); break; case '-': + if (!state->apply_in_reverse) + check_old_for_crlf(patch, line, len); if (state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -2268,8 +2286,11 @@ static void show_stats(struct apply_state *state, struct patch *patch) add, pluses, del, minuses); } -static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) +static int read_old_data(struct stat *st, struct patch *patch, + const char *path, struct strbuf *buf) { + enum safe_crlf safe_crlf = patch->crlf_in_old ? + SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_RENORMALIZE; switch (st->st_mode & S_IFMT) { case S_IFLNK: if (strbuf_readlink(buf, path, st->st_size) < 0) @@ -2278,7 +2299,15 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); - convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); + /* + * "git apply" without "--index/--cached" should never look + * at the index; the target file may not have been added to + * the index yet, and we may not even be in any Git repository. + * Pass NULL to convert_to_git() to stress this; the function + * should never look at the index when explicit crlf option + * is given. + */ + convert_to_git(NULL, path, buf->buf, buf->len, buf, safe_crlf); return 0; default: return -1; @@ -3384,6 +3413,7 @@ static int load_patch_target(struct apply_state *state, struct strbuf *buf, const struct cache_entry *ce, struct stat *st, + struct patch *patch, const char *name, unsigned expected_mode) { @@ -3399,7 +3429,7 @@ static int load_patch_target(struct apply_state *state, } else if (has_symlink_leading_path(name, strlen(name))) { return error(_("reading from '%s' beyond a symbolic link"), name); } else { - if (read_old_data(st, name, buf)) + if (read_old_data(st, patch, name, buf)) return error(_("failed to read %s"), name); } } @@ -3432,7 +3462,7 @@ static int load_preimage(struct apply_state *state, /* We have a patched copy in memory; use that. */ strbuf_add(&buf, previous->result, previous->resultsize); } else { - status = load_patch_target(state, &buf, ce, st, + status = load_patch_target(state, &buf, ce, st, patch, patch->old_name, patch->old_mode); if (status < 0) return status; @@ -3520,7 +3550,7 @@ static int load_current(struct apply_state *state, if (verify_index_match(ce, &st)) return error(_("%s: does not match index"), name); - status = load_patch_target(state, &buf, ce, &st, name, mode); + status = load_patch_target(state, &buf, ce, &st, patch, name, mode); if (status < 0) return status; else if (status) diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh index d350065f25..4fc27c51f7 100755 --- a/t/t4124-apply-ws-rule.sh +++ b/t/t4124-apply-ws-rule.sh @@ -467,21 +467,42 @@ test_expect_success 'same, but with CR-LF line endings && cr-at-eol set' ' test_cmp one expect ' -test_expect_success 'same, but with CR-LF line endings && cr-at-eol unset' ' +test_expect_success 'CR-LF line endings && add line && text=auto' ' git config --unset core.whitespace && printf "a\r\n" >one && + cp one save-one && + git add one && printf "b\r\n" >>one && - printf "c\r\n" >>one && + cp one expect && + git diff -- one >patch && + mv save-one one && + echo "one text=auto" >.gitattributes && + git apply patch && + test_cmp one expect +' + +test_expect_success 'CR-LF line endings && change line && text=auto' ' + printf "a\r\n" >one && cp one save-one && - printf " \r\n" >>one && git add one && + printf "b\r\n" >one && cp one expect && - printf "d\r\n" >>one && git diff -- one >patch && mv save-one one && - echo d >>expect && + echo "one text=auto" >.gitattributes && + git apply patch && + test_cmp one expect +' - git apply --ignore-space-change --whitespace=fix patch && +test_expect_success 'LF in repo, CRLF in worktree && change line && text=auto' ' + printf "a\n" >one && + git add one && + printf "b\r\n" >one && + git diff -- one >patch && + printf "a\r\n" >one && + echo "one text=auto" >.gitattributes && + git -c core.eol=CRLF apply patch && + printf "b\r\n" >expect && test_cmp one expect ' -- 2.14.1.145.gb3622a4ee9 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH v3 2/2] File commited with CRLF should roundtrip diff and apply 2017-08-17 21:43 ` [PATCH v3 2/2] File commited with CRLF should roundtrip diff and apply tboegi @ 2017-08-17 22:29 ` Junio C Hamano 2017-08-17 22:35 ` Junio C Hamano 1 sibling, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-17 22:29 UTC (permalink / raw) To: tboegi; +Cc: git, asottile tboegi@web.de writes: > Changes since v2: > - Manually integrated all code changes from Junio > (Thanks, I hope that I didn't miss something) I suspect that "apply -R makes '+' preimage" change is not here. > - Having examples of "git diff" in the commit message confuses "git apply", > so that all examples for git diff have a '*' at the beginnig of the line > (V2 used '$' which is typically an example for a shell script) Just FYI we tend to just indent them further, just like any displayed material in the proposed log message. > - The official version to apply the CRLF-rules without having an index is > SAFE_CRLF_RENORMALIZE, that is already working today. Ah, good find. I forgot about that thing you added some time ago. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH v3 2/2] File commited with CRLF should roundtrip diff and apply 2017-08-17 21:43 ` [PATCH v3 2/2] File commited with CRLF should roundtrip diff and apply tboegi 2017-08-17 22:29 ` Junio C Hamano @ 2017-08-17 22:35 ` Junio C Hamano 1 sibling, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-17 22:35 UTC (permalink / raw) To: tboegi; +Cc: git, asottile tboegi@web.de writes: > @@ -1712,11 +1726,15 @@ static int parse_fragment(struct apply_state *state, > if (!deleted && !added) > leading++; > trailing++; > + if (!state->apply_in_reverse) > + check_old_for_crlf(patch, line, len); > if (!state->apply_in_reverse && > state->ws_error_action == correct_ws_error) > check_whitespace(state, line, len, patch->ws_rule); > break; This one is wrong. You are looking at " " (common context) and you should unconditionally call check_old for them. > case '-': > + if (!state->apply_in_reverse) > + check_old_for_crlf(patch, line, len); This is correct. There is "case '+':" below here you did not touch. There should be case '+': + if (state->apply_in_reverse) + check_old_for_crlf(...); there. Note that we call check_old() only when applying in reverse. > @@ -2268,8 +2286,11 @@ static void show_stats(struct apply_state *state, struct patch *patch) > add, pluses, del, minuses); > } > > -static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) > +static int read_old_data(struct stat *st, struct patch *patch, > + const char *path, struct strbuf *buf) The order of argument to have the patch structure earlier is different from my version; what we see here looks much better to me. > { > + enum safe_crlf safe_crlf = patch->crlf_in_old ? > + SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_RENORMALIZE; > switch (st->st_mode & S_IFMT) { > case S_IFLNK: > if (strbuf_readlink(buf, path, st->st_size) < 0) > @@ -2278,7 +2299,15 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) > case S_IFREG: > if (strbuf_read_file(buf, path, st->st_size) != st->st_size) > return error(_("unable to open or read %s"), path); > - convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); > + /* > + * "git apply" without "--index/--cached" should never look > + * at the index; the target file may not have been added to > + * the index yet, and we may not even be in any Git repository. > + * Pass NULL to convert_to_git() to stress this; the function > + * should never look at the index when explicit crlf option > + * is given. > + */ > + convert_to_git(NULL, path, buf->buf, buf->len, buf, safe_crlf); This comment is somewhat strangly indented. I thought opening "/*" alighs with the usual tab stop. Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* [PATCH v4 1/2] convert: Add SAFE_CRLF_KEEP_CRLF 2017-08-14 17:37 ` Junio C Hamano ` (3 preceding siblings ...) 2017-08-17 21:43 ` [PATCH v3 2/2] File commited with CRLF should roundtrip diff and apply tboegi @ 2017-08-19 11:27 ` tboegi 2017-08-19 11:28 ` [PATCH v4 2/2] File commited with CRLF should roundtrip diff and apply tboegi 5 siblings, 0 replies; 36+ messages in thread From: tboegi @ 2017-08-19 11:27 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When convert_to_git() is called, the caller may want to keep CRLF to be kept as CRLF (and not converted into LF). This will be used in the next commit, when apply works with files that have CRLF and patches are applied onto these files. Add the new value "SAFE_CRLF_KEEP_CRLF" to safe_crlf. Prepare convert_to_git() to be able to run the clean filter, skip the CRLF conversion and run the ident filter. Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- convert.c | 10 ++++++---- convert.h | 3 ++- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/convert.c b/convert.c index deaf0ba7b3..040123b4fe 100644 --- a/convert.c +++ b/convert.c @@ -1104,10 +1104,12 @@ int convert_to_git(const struct index_state *istate, src = dst->buf; len = dst->len; } - ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); - if (ret && dst) { - src = dst->buf; - len = dst->len; + if (checksafe != SAFE_CRLF_KEEP_CRLF) { + ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); + if (ret && dst) { + src = dst->buf; + len = dst->len; + } } return ret | ident_to_git(path, src, len, dst, ca.ident); } diff --git a/convert.h b/convert.h index cecf59d1aa..cabd5ed6dd 100644 --- a/convert.h +++ b/convert.h @@ -10,7 +10,8 @@ enum safe_crlf { SAFE_CRLF_FALSE = 0, SAFE_CRLF_FAIL = 1, SAFE_CRLF_WARN = 2, - SAFE_CRLF_RENORMALIZE = 3 + SAFE_CRLF_RENORMALIZE = 3, + SAFE_CRLF_KEEP_CRLF = 4 }; extern enum safe_crlf safe_crlf; -- 2.14.0.rc1.15.gd40c2d4e85.dirty ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH v4 2/2] File commited with CRLF should roundtrip diff and apply 2017-08-14 17:37 ` Junio C Hamano ` (4 preceding siblings ...) 2017-08-19 11:27 ` [PATCH v4 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi @ 2017-08-19 11:28 ` tboegi 5 siblings, 0 replies; 36+ messages in thread From: tboegi @ 2017-08-19 11:28 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When a file had been commited with CRLF but now .gitattributes say "* text=auto" (or core.autocrlf is true), the following does not roundtrip, `git apply` fails: printf "Added line\r\n" >>file && git diff >patch && git checkout -- . && git apply patch Before applying the patch, the file from working tree is converted into the index format (clean filter, CRLF conversion, ...) Here, when commited with CRLF, the line endings should not be converted. Note that `git apply --index` or `git apply --cache` doesn't call convert_to_git() because the source material is already in index format. Analyze the patch if there is a) any context line with CRLF, or b) if any line with CRLF is to be removed. In this case the patch file `patch` has mixed line endings, for a) it looks like this: diff --git a/one b/one index 533790e..c30dea8 100644 --- a/one +++ b/one @@ -1 +1,2 @@ a\r +b\r And for b) it looks like this: diff --git a/one b/one index 533790e..485540d 100644 --- a/one +++ b/one @@ -1 +1 @@ -a\r +b\r If `git apply` detects that the patch itself has CRLF, (look at the line " a\r" or "-a\r" above), the new flag crlf_in_old is set in "struct patch" and two things will happen: - read_old_data() will not convert CRLF into LF by calling convert_to_git(..., SAFE_CRLF_KEEP_CRLF); - The WS_CR_AT_EOL bit is set in the "white space rule", CRLF are no longer treated as white space. While at there, make clear that read_old_data() in apply.c knows what it wants convert_to_git() to do with respect to CRLF. In fact, this codepath is about applying a patch to a file in the filesystem, which may not exist in the index, or may exist but may not match what is recorded in the index, or in the extreme case, we may not even be in a Git repository. If convert_to_git() peeked at the index while doing its work, it *would* be a bug. Pass NULL instead of &the_index to convert_to_git() to make sure we catch future bugs to clarify this. Update the test in t4124: split one test case into 3: - Detect the " a\r" line in the patch - Detect the "-a\r" line in the patch - Use LF in repo and CLRF in the worktree. Reported-by: Anthony Sottile <asottile@umich.edu> Helped-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- Changes since v3: - took apply.c from junio/tb/apply-with-crlf - Remove the leading asterix in the commit message, at the place where the "git diff" is cited. - Mention "Pass NULL instead of &the_index to convert_to_git()" apply.c | 41 ++++++++++++++++++++++++++++++++++++----- t/t4124-apply-ws-rule.sh | 33 +++++++++++++++++++++++++++------ 2 files changed, 63 insertions(+), 11 deletions(-) diff --git a/apply.c b/apply.c index f2d599141d..66c68f193a 100644 --- a/apply.c +++ b/apply.c @@ -220,6 +220,7 @@ struct patch { unsigned int recount:1; unsigned int conflicted_threeway:1; unsigned int direct_to_threeway:1; + unsigned int crlf_in_old:1; struct fragment *fragments; char *result; size_t resultsize; @@ -1662,6 +1663,19 @@ static void check_whitespace(struct apply_state *state, record_ws_error(state, result, line + 1, len - 2, state->linenr); } +/* + * Check if the patch has context lines with CRLF or + * the patch wants to remove lines with CRLF. + */ +static void check_old_for_crlf(struct patch *patch, const char *line, int len) +{ + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { + patch->ws_rule |= WS_CR_AT_EOL; + patch->crlf_in_old = 1; + } +} + + /* * Parse a unified diff. Note that this really needs to parse each * fragment separately, since the only way to know the difference @@ -1712,11 +1726,14 @@ static int parse_fragment(struct apply_state *state, if (!deleted && !added) leading++; trailing++; + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action == correct_ws_error) check_whitespace(state, line, len, patch->ws_rule); break; case '-': + if (!state->apply_in_reverse) + check_old_for_crlf(patch, line, len); if (state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -1725,6 +1742,8 @@ static int parse_fragment(struct apply_state *state, trailing = 0; break; case '+': + if (state->apply_in_reverse) + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -2268,8 +2287,11 @@ static void show_stats(struct apply_state *state, struct patch *patch) add, pluses, del, minuses); } -static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) +static int read_old_data(struct stat *st, struct patch *patch, + const char *path, struct strbuf *buf) { + enum safe_crlf safe_crlf = patch->crlf_in_old ? + SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_RENORMALIZE; switch (st->st_mode & S_IFMT) { case S_IFLNK: if (strbuf_readlink(buf, path, st->st_size) < 0) @@ -2278,7 +2300,15 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); - convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); + /* + * "git apply" without "--index/--cached" should never look + * at the index; the target file may not have been added to + * the index yet, and we may not even be in any Git repository. + * Pass NULL to convert_to_git() to stress this; the function + * should never look at the index when explicit crlf option + * is given. + */ + convert_to_git(NULL, path, buf->buf, buf->len, buf, safe_crlf); return 0; default: return -1; @@ -3384,6 +3414,7 @@ static int load_patch_target(struct apply_state *state, struct strbuf *buf, const struct cache_entry *ce, struct stat *st, + struct patch *patch, const char *name, unsigned expected_mode) { @@ -3399,7 +3430,7 @@ static int load_patch_target(struct apply_state *state, } else if (has_symlink_leading_path(name, strlen(name))) { return error(_("reading from '%s' beyond a symbolic link"), name); } else { - if (read_old_data(st, name, buf)) + if (read_old_data(st, patch, name, buf)) return error(_("failed to read %s"), name); } } @@ -3432,7 +3463,7 @@ static int load_preimage(struct apply_state *state, /* We have a patched copy in memory; use that. */ strbuf_add(&buf, previous->result, previous->resultsize); } else { - status = load_patch_target(state, &buf, ce, st, + status = load_patch_target(state, &buf, ce, st, patch, patch->old_name, patch->old_mode); if (status < 0) return status; @@ -3520,7 +3551,7 @@ static int load_current(struct apply_state *state, if (verify_index_match(ce, &st)) return error(_("%s: does not match index"), name); - status = load_patch_target(state, &buf, ce, &st, name, mode); + status = load_patch_target(state, &buf, ce, &st, patch, name, mode); if (status < 0) return status; else if (status) diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh index d350065f25..4fc27c51f7 100755 --- a/t/t4124-apply-ws-rule.sh +++ b/t/t4124-apply-ws-rule.sh @@ -467,21 +467,42 @@ test_expect_success 'same, but with CR-LF line endings && cr-at-eol set' ' test_cmp one expect ' -test_expect_success 'same, but with CR-LF line endings && cr-at-eol unset' ' +test_expect_success 'CR-LF line endings && add line && text=auto' ' git config --unset core.whitespace && printf "a\r\n" >one && + cp one save-one && + git add one && printf "b\r\n" >>one && - printf "c\r\n" >>one && + cp one expect && + git diff -- one >patch && + mv save-one one && + echo "one text=auto" >.gitattributes && + git apply patch && + test_cmp one expect +' + +test_expect_success 'CR-LF line endings && change line && text=auto' ' + printf "a\r\n" >one && cp one save-one && - printf " \r\n" >>one && git add one && + printf "b\r\n" >one && cp one expect && - printf "d\r\n" >>one && git diff -- one >patch && mv save-one one && - echo d >>expect && + echo "one text=auto" >.gitattributes && + git apply patch && + test_cmp one expect +' - git apply --ignore-space-change --whitespace=fix patch && +test_expect_success 'LF in repo, CRLF in worktree && change line && text=auto' ' + printf "a\n" >one && + git add one && + printf "b\r\n" >one && + git diff -- one >patch && + printf "a\r\n" >one && + echo "one text=auto" >.gitattributes && + git -c core.eol=CRLF apply patch && + printf "b\r\n" >expect && test_cmp one expect ' -- 2.14.0.rc1.15.gd40c2d4e85.dirty ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH/RFC 1/2] convert: Add SAFE_CRLF_KEEP_CRLF 2017-08-02 20:58 ` core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Junio C Hamano ` (2 preceding siblings ...) 2017-08-12 14:56 ` [PATCH/RFC] File commited with CRLF should roundtrip diff and apply tboegi @ 2017-08-13 8:51 ` tboegi 2017-08-13 8:51 ` [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply tboegi 4 siblings, 0 replies; 36+ messages in thread From: tboegi @ 2017-08-13 8:51 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When convert_to_git() is called, the caller may want to keep CRLF to be kept as CRLF (and not converted into LF). This will be used in the next commit, when apply works with files that have CRLF and patches are applied onto these files. Add the new value "SAFE_CRLF_KEEP_CRLF" to safe_crlf. Prepare convert_to_git() to be able to run the clean filter, skip the CRLF conversion and run the ident filter. Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- convert.c | 10 ++++++---- convert.h | 3 ++- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/convert.c b/convert.c index deaf0ba7b3..040123b4fe 100644 --- a/convert.c +++ b/convert.c @@ -1104,10 +1104,12 @@ int convert_to_git(const struct index_state *istate, src = dst->buf; len = dst->len; } - ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); - if (ret && dst) { - src = dst->buf; - len = dst->len; + if (checksafe != SAFE_CRLF_KEEP_CRLF) { + ret |= crlf_to_git(istate, path, src, len, dst, ca.crlf_action, checksafe); + if (ret && dst) { + src = dst->buf; + len = dst->len; + } } return ret | ident_to_git(path, src, len, dst, ca.ident); } diff --git a/convert.h b/convert.h index cecf59d1aa..cabd5ed6dd 100644 --- a/convert.h +++ b/convert.h @@ -10,7 +10,8 @@ enum safe_crlf { SAFE_CRLF_FALSE = 0, SAFE_CRLF_FAIL = 1, SAFE_CRLF_WARN = 2, - SAFE_CRLF_RENORMALIZE = 3 + SAFE_CRLF_RENORMALIZE = 3, + SAFE_CRLF_KEEP_CRLF = 4 }; extern enum safe_crlf safe_crlf; -- 2.14.1.145.gb3622a4ee9 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply 2017-08-02 20:58 ` core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Junio C Hamano ` (3 preceding siblings ...) 2017-08-13 8:51 ` [PATCH/RFC 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi @ 2017-08-13 8:51 ` tboegi 2017-08-16 18:28 ` Junio C Hamano ` (4 more replies) 4 siblings, 5 replies; 36+ messages in thread From: tboegi @ 2017-08-13 8:51 UTC (permalink / raw) To: git, asottile; +Cc: Torsten Bögershausen From: Torsten Bögershausen <tboegi@web.de> When a file had been commited with CRLF and core.autocrlf is true, the following does not roundtrip, `git apply` fails: printf "Added line\r\n" >>file && git diff >patch && git checkout -- . && git apply patch Before applying the patch, the file from working tree is converted into the index format (clean filter, CRLF conversion, ...) Here, when commited with CRLF, the line endings should not be converted. Analyze the patch if there is any context line with CRLF, or if any line with CRLF is to be removed. If yes, the new flag has_crlf is set in "struct patch", and two things will happen: - read_old_data() will not convert CRLF into LF by calling convert_to_git(..., SAFE_CRLF_KEEP_CRLF); - The WS_CR_AT_EOL bit is set in the "white space rule", CRLF are no longer treated as white space. Thanks to Junio C Hamano, his input became the base for t4140. Reported-by: Anthony Sottile <asottile@umich.edu> Signed-off-by: Torsten Bögershausen <tboegi@web.de> --- The last version did not pass t4124, fix this. apply.c | 37 ++++++++++++++++++++++++++++--------- apply.h | 4 ++++ t/t4124-apply-ws-rule.sh | 3 +-- t/t4140-apply-CRLF.sh | 46 ++++++++++++++++++++++++++++++++++++++++++++++ 4 files changed, 79 insertions(+), 11 deletions(-) create mode 100755 t/t4140-apply-CRLF.sh diff --git a/apply.c b/apply.c index f2d599141d..63455cd65f 100644 --- a/apply.c +++ b/apply.c @@ -220,6 +220,7 @@ struct patch { unsigned int recount:1; unsigned int conflicted_threeway:1; unsigned int direct_to_threeway:1; + unsigned int has_crlf:1; struct fragment *fragments; char *result; size_t resultsize; @@ -1662,6 +1663,17 @@ static void check_whitespace(struct apply_state *state, record_ws_error(state, result, line + 1, len - 2, state->linenr); } +/* Check if the patch has context lines with CRLF or + the patch wants to remove lines with CRLF */ +static void check_old_for_crlf(struct patch *patch, const char *line, int len) +{ + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { + patch->ws_rule |= WS_CR_AT_EOL; + patch->has_crlf = 1; + } +} + + /* * Parse a unified diff. Note that this really needs to parse each * fragment separately, since the only way to know the difference @@ -1712,11 +1724,13 @@ static int parse_fragment(struct apply_state *state, if (!deleted && !added) leading++; trailing++; + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action == correct_ws_error) check_whitespace(state, line, len, patch->ws_rule); break; case '-': + check_old_for_crlf(patch, line, len); if (state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -2268,8 +2282,10 @@ static void show_stats(struct apply_state *state, struct patch *patch) add, pluses, del, minuses); } -static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) +static int read_old_data(struct stat *st, const char *path, struct strbuf *buf, int flags) { + enum safe_crlf safe_crlf = flags & APPLY_FLAGS_CR_AT_EOL ? + SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_FALSE; switch (st->st_mode & S_IFMT) { case S_IFLNK: if (strbuf_readlink(buf, path, st->st_size) < 0) @@ -2278,7 +2294,7 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); - convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); + convert_to_git(&the_index, path, buf->buf, buf->len, buf, safe_crlf); return 0; default: return -1; @@ -3385,7 +3401,8 @@ static int load_patch_target(struct apply_state *state, const struct cache_entry *ce, struct stat *st, const char *name, - unsigned expected_mode) + unsigned expected_mode, + int flags) { if (state->cached || state->check_index) { if (read_file_or_gitlink(ce, buf)) @@ -3399,7 +3416,7 @@ static int load_patch_target(struct apply_state *state, } else if (has_symlink_leading_path(name, strlen(name))) { return error(_("reading from '%s' beyond a symbolic link"), name); } else { - if (read_old_data(st, name, buf)) + if (read_old_data(st, name, buf, flags)) return error(_("failed to read %s"), name); } } @@ -3423,6 +3440,7 @@ static int load_preimage(struct apply_state *state, char *img; struct patch *previous; int status; + int flags = patch->has_crlf ? APPLY_FLAGS_CR_AT_EOL : 0; previous = previous_patch(state, patch, &status); if (status) @@ -3433,7 +3451,7 @@ static int load_preimage(struct apply_state *state, strbuf_add(&buf, previous->result, previous->resultsize); } else { status = load_patch_target(state, &buf, ce, st, - patch->old_name, patch->old_mode); + patch->old_name, patch->old_mode, flags); if (status < 0) return status; else if (status == SUBMODULE_PATCH_WITHOUT_INDEX) { @@ -3493,7 +3511,8 @@ static int three_way_merge(struct image *image, */ static int load_current(struct apply_state *state, struct image *image, - struct patch *patch) + struct patch *patch, + int flags) { struct strbuf buf = STRBUF_INIT; int status, pos; @@ -3520,7 +3539,7 @@ static int load_current(struct apply_state *state, if (verify_index_match(ce, &st)) return error(_("%s: does not match index"), name); - status = load_patch_target(state, &buf, ce, &st, name, mode); + status = load_patch_target(state, &buf, ce, &st, name, mode, flags); if (status < 0) return status; else if (status) @@ -3571,7 +3590,8 @@ static int try_threeway(struct apply_state *state, /* our_oid is ours */ if (patch->is_new) { - if (load_current(state, &tmp_image, patch)) + int flags = 0; + if (load_current(state, &tmp_image, patch, flags)) return error(_("cannot read the current contents of '%s'"), patch->new_name); } else { @@ -3617,7 +3637,6 @@ static int apply_data(struct apply_state *state, struct patch *patch, struct stat *st, const struct cache_entry *ce) { struct image image; - if (load_preimage(state, &image, patch, st, ce) < 0) return -1; diff --git a/apply.h b/apply.h index b3d6783d55..192140280f 100644 --- a/apply.h +++ b/apply.h @@ -33,9 +33,13 @@ enum apply_verbosity { #define APPLY_SYMLINK_GOES_AWAY 01 #define APPLY_SYMLINK_IN_RESULT 02 + +#define APPLY_FLAGS_CR_AT_EOL (1<<0) + struct apply_state { const char *prefix; int prefix_length; + int flags; /* These are lock_file related */ struct lock_file *lock_file; diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh index d350065f25..4da6e6e894 100755 --- a/t/t4124-apply-ws-rule.sh +++ b/t/t4124-apply-ws-rule.sh @@ -475,11 +475,10 @@ test_expect_success 'same, but with CR-LF line endings && cr-at-eol unset' ' cp one save-one && printf " \r\n" >>one && git add one && - cp one expect && printf "d\r\n" >>one && + cp one expect && git diff -- one >patch && mv save-one one && - echo d >>expect && git apply --ignore-space-change --whitespace=fix patch && test_cmp one expect diff --git a/t/t4140-apply-CRLF.sh b/t/t4140-apply-CRLF.sh new file mode 100755 index 0000000000..fd528daabd --- /dev/null +++ b/t/t4140-apply-CRLF.sh @@ -0,0 +1,46 @@ +#!/bin/sh + +test_description='CRLF diff and apply' + +. ./test-lib.sh + +test_expect_success setup ' + mkdir upstream && + ( + cd upstream && + git init && + git config core.autocrlf false && + >.gitignore && + git add . && + git commit -m gitignore && + printf "F1\r\n" >FileCRLF && + git add . && + git commit -m initial && + git diff HEAD^1 HEAD -- >../patch1 + ) && + git config core.autocrlf true +' + +test_expect_success 'apply patches Replace lines' ' + ( + cd upstream && + printf "F11\r\nF12\r\n" >FileCRLF && + git diff >../patch2Replace + ) && + git apply patch1 && + git apply patch2Replace +' + +test_expect_success 'apply patches Add lines' ' + ( + cd upstream && + git checkout FileCRLF && + printf "F2\r\n" >>FileCRLF && + git diff >../patch2Add + ) && + rm -f FileCRLF && + git apply patch1 && + git apply patch2Add +' + +test_done -- 2.14.1.145.gb3622a4ee9 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply 2017-08-13 8:51 ` [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply tboegi @ 2017-08-16 18:28 ` Junio C Hamano 2017-08-16 18:28 ` [PATCH/FIXUP 3/2] apply: remove unused member apply_state.flags Junio C Hamano ` (3 subsequent siblings) 4 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-16 18:28 UTC (permalink / raw) To: tboegi; +Cc: git, asottile I'll be sending a few patches that apply on top of applying these two patches to show what I meant in my previous review comments. The net change to apply.c, when you combine your 2/2 with these, would become like the attached, which I think makes more sense. Instead of queuing a squashed result, I thought it may help to send them as incremental fixes with their own justification. diff --git a/apply.c b/apply.c index f2d599141d..c06f7014a2 100644 --- a/apply.c +++ b/apply.c @@ -220,6 +220,7 @@ struct patch { unsigned int recount:1; unsigned int conflicted_threeway:1; unsigned int direct_to_threeway:1; + unsigned int crlf_in_old:1; struct fragment *fragments; char *result; size_t resultsize; @@ -1662,6 +1663,19 @@ static void check_whitespace(struct apply_state *state, record_ws_error(state, result, line + 1, len - 2, state->linenr); } +/* + * Check if the patch has context lines with CRLF or + * the patch wants to remove lines with CRLF. + */ +static void check_old_for_crlf(struct patch *patch, const char *line, int len) +{ + if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { + patch->ws_rule |= WS_CR_AT_EOL; + patch->crlf_in_old = 1; + } +} + + /* * Parse a unified diff. Note that this really needs to parse each * fragment separately, since the only way to know the difference @@ -1712,11 +1726,14 @@ static int parse_fragment(struct apply_state *state, if (!deleted && !added) leading++; trailing++; + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action == correct_ws_error) check_whitespace(state, line, len, patch->ws_rule); break; case '-': + if (!state->apply_in_reverse) + check_old_for_crlf(patch, line, len); if (state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -1725,6 +1742,8 @@ static int parse_fragment(struct apply_state *state, trailing = 0; break; case '+': + if (state->apply_in_reverse) + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -2268,8 +2287,12 @@ static void show_stats(struct apply_state *state, struct patch *patch) add, pluses, del, minuses); } -static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) +static int read_old_data(struct stat *st, const char *path, struct strbuf *buf, + struct patch *patch) { + enum safe_crlf safe_crlf = (patch->crlf_in_old + ? SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_FALSE); + switch (st->st_mode & S_IFMT) { case S_IFLNK: if (strbuf_readlink(buf, path, st->st_size) < 0) @@ -2278,7 +2301,7 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf) case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); - convert_to_git(&the_index, path, buf->buf, buf->len, buf, 0); + convert_to_git(&the_index, path, buf->buf, buf->len, buf, safe_crlf); return 0; default: return -1; @@ -3384,6 +3407,7 @@ static int load_patch_target(struct apply_state *state, struct strbuf *buf, const struct cache_entry *ce, struct stat *st, + struct patch *patch, const char *name, unsigned expected_mode) { @@ -3399,7 +3423,7 @@ static int load_patch_target(struct apply_state *state, } else if (has_symlink_leading_path(name, strlen(name))) { return error(_("reading from '%s' beyond a symbolic link"), name); } else { - if (read_old_data(st, name, buf)) + if (read_old_data(st, name, buf, patch)) return error(_("failed to read %s"), name); } } @@ -3432,7 +3456,7 @@ static int load_preimage(struct apply_state *state, /* We have a patched copy in memory; use that. */ strbuf_add(&buf, previous->result, previous->resultsize); } else { - status = load_patch_target(state, &buf, ce, st, + status = load_patch_target(state, &buf, ce, st, patch, patch->old_name, patch->old_mode); if (status < 0) return status; @@ -3520,7 +3544,7 @@ static int load_current(struct apply_state *state, if (verify_index_match(ce, &st)) return error(_("%s: does not match index"), name); - status = load_patch_target(state, &buf, ce, &st, name, mode); + status = load_patch_target(state, &buf, ce, &st, patch, name, mode); if (status < 0) return status; else if (status) ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH/FIXUP 3/2] apply: remove unused member apply_state.flags 2017-08-13 8:51 ` [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply tboegi 2017-08-16 18:28 ` Junio C Hamano @ 2017-08-16 18:28 ` Junio C Hamano 2017-08-16 18:29 ` [PATCH/FIXUP 4/2] apply: only pay attention to CRLF in the preimage Junio C Hamano ` (2 subsequent siblings) 4 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-16 18:28 UTC (permalink / raw) To: tboegi; +Cc: git, asottile The previous step added the "flags" member to apply_state, but it is never used. Remove it and move the bit assignment macro to apply.c as that is just a private implementation detail. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- apply.c | 2 ++ apply.h | 4 ---- 2 files changed, 2 insertions(+), 4 deletions(-) diff --git a/apply.c b/apply.c index 63455cd65f..7663e63df7 100644 --- a/apply.c +++ b/apply.c @@ -2282,6 +2282,8 @@ static void show_stats(struct apply_state *state, struct patch *patch) add, pluses, del, minuses); } +#define APPLY_FLAGS_CR_AT_EOL (1<<0) + static int read_old_data(struct stat *st, const char *path, struct strbuf *buf, int flags) { enum safe_crlf safe_crlf = flags & APPLY_FLAGS_CR_AT_EOL ? diff --git a/apply.h b/apply.h index 192140280f..b3d6783d55 100644 --- a/apply.h +++ b/apply.h @@ -33,13 +33,9 @@ enum apply_verbosity { #define APPLY_SYMLINK_GOES_AWAY 01 #define APPLY_SYMLINK_IN_RESULT 02 - -#define APPLY_FLAGS_CR_AT_EOL (1<<0) - struct apply_state { const char *prefix; int prefix_length; - int flags; /* These are lock_file related */ struct lock_file *lock_file; -- 2.14.1-331-g7631d96230 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH/FIXUP 4/2] apply: only pay attention to CRLF in the preimage 2017-08-13 8:51 ` [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply tboegi 2017-08-16 18:28 ` Junio C Hamano 2017-08-16 18:28 ` [PATCH/FIXUP 3/2] apply: remove unused member apply_state.flags Junio C Hamano @ 2017-08-16 18:29 ` Junio C Hamano 2017-08-16 18:30 ` [PATCH/FIXUP 5/2] apply: localize the CRLF business to read_old_data() Junio C Hamano 2017-08-16 18:34 ` [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case Junio C Hamano 4 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-16 18:29 UTC (permalink / raw) To: tboegi; +Cc: git, asottile The newly added "patch.has_crlf" member wants to indicate if the incoming patch expects any CRLF line in the patch target, and parse_fragment() implements that logic for "git apply". Rename the member to "patch.crlf_in_old" to clarify what it means, and fix the logic in parse_fragment() so that it also works correctly when running "git apply -R", where '+' lines correspond to the patch target. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- There also is an obvious style fix for comment, but I didn't bother splitting it out to a separate step. apply.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-) diff --git a/apply.c b/apply.c index 7663e63df7..995973da3d 100644 --- a/apply.c +++ b/apply.c @@ -220,7 +220,7 @@ struct patch { unsigned int recount:1; unsigned int conflicted_threeway:1; unsigned int direct_to_threeway:1; - unsigned int has_crlf:1; + unsigned int crlf_in_old:1; struct fragment *fragments; char *result; size_t resultsize; @@ -1663,13 +1663,15 @@ static void check_whitespace(struct apply_state *state, record_ws_error(state, result, line + 1, len - 2, state->linenr); } -/* Check if the patch has context lines with CRLF or - the patch wants to remove lines with CRLF */ +/* + * Check if the patch has context lines with CRLF or + * the patch wants to remove lines with CRLF. + */ static void check_old_for_crlf(struct patch *patch, const char *line, int len) { if (len >= 2 && line[len-1] == '\n' && line[len-2] == '\r') { patch->ws_rule |= WS_CR_AT_EOL; - patch->has_crlf = 1; + patch->crlf_in_old = 1; } } @@ -1730,7 +1732,8 @@ static int parse_fragment(struct apply_state *state, check_whitespace(state, line, len, patch->ws_rule); break; case '-': - check_old_for_crlf(patch, line, len); + if (!state->apply_in_reverse) + check_old_for_crlf(patch, line, len); if (state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -1739,6 +1742,8 @@ static int parse_fragment(struct apply_state *state, trailing = 0; break; case '+': + if (state->apply_in_reverse) + check_old_for_crlf(patch, line, len); if (!state->apply_in_reverse && state->ws_error_action != nowarn_ws_error) check_whitespace(state, line, len, patch->ws_rule); @@ -3442,7 +3447,7 @@ static int load_preimage(struct apply_state *state, char *img; struct patch *previous; int status; - int flags = patch->has_crlf ? APPLY_FLAGS_CR_AT_EOL : 0; + int flags = patch->crlf_in_old ? APPLY_FLAGS_CR_AT_EOL : 0; previous = previous_patch(state, patch, &status); if (status) ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH/FIXUP 5/2] apply: localize the CRLF business to read_old_data() 2017-08-13 8:51 ` [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply tboegi ` (2 preceding siblings ...) 2017-08-16 18:29 ` [PATCH/FIXUP 4/2] apply: only pay attention to CRLF in the preimage Junio C Hamano @ 2017-08-16 18:30 ` Junio C Hamano 2017-08-16 18:34 ` [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case Junio C Hamano 4 siblings, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-16 18:30 UTC (permalink / raw) To: tboegi; +Cc: git, asottile Previous changes passed a new APPLY_FLAGS_CR_AT_EOL option down from load_preimage() to read_old_data(), because the last function in that callchain needs to decide how its call to convert_to_git() function is made on the data read from the working tree. The load_preimage() function and its direct callees, however, are not limited to the case where the patch is applied to the data in the working tree (i.e. "git apply" that is working as a better "patch -p1"), unlike read_old_data(), which deals only with the patch target in the working tree. They are also responsible for driving "git apply --cached" and "git apply --index", both of which take the current index contents into account and do not need the new special-casing of CRLF. Exposing APPLY_FLAGS_CR_AT_EOL bit to them is misleading. Instead, just pass the "struct patch" down the same callchain, and have read_old_data() look at its crlf_in_old member to make the necessary decision. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- This is what I care about the most in these fix-ups. apply.c | 30 ++++++++++++++---------------- 1 file changed, 14 insertions(+), 16 deletions(-) diff --git a/apply.c b/apply.c index 995973da3d..c06f7014a2 100644 --- a/apply.c +++ b/apply.c @@ -2287,12 +2287,12 @@ static void show_stats(struct apply_state *state, struct patch *patch) add, pluses, del, minuses); } -#define APPLY_FLAGS_CR_AT_EOL (1<<0) - -static int read_old_data(struct stat *st, const char *path, struct strbuf *buf, int flags) +static int read_old_data(struct stat *st, const char *path, struct strbuf *buf, + struct patch *patch) { - enum safe_crlf safe_crlf = flags & APPLY_FLAGS_CR_AT_EOL ? - SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_FALSE; + enum safe_crlf safe_crlf = (patch->crlf_in_old + ? SAFE_CRLF_KEEP_CRLF : SAFE_CRLF_FALSE); + switch (st->st_mode & S_IFMT) { case S_IFLNK: if (strbuf_readlink(buf, path, st->st_size) < 0) @@ -3407,9 +3407,9 @@ static int load_patch_target(struct apply_state *state, struct strbuf *buf, const struct cache_entry *ce, struct stat *st, + struct patch *patch, const char *name, - unsigned expected_mode, - int flags) + unsigned expected_mode) { if (state->cached || state->check_index) { if (read_file_or_gitlink(ce, buf)) @@ -3423,7 +3423,7 @@ static int load_patch_target(struct apply_state *state, } else if (has_symlink_leading_path(name, strlen(name))) { return error(_("reading from '%s' beyond a symbolic link"), name); } else { - if (read_old_data(st, name, buf, flags)) + if (read_old_data(st, name, buf, patch)) return error(_("failed to read %s"), name); } } @@ -3447,7 +3447,6 @@ static int load_preimage(struct apply_state *state, char *img; struct patch *previous; int status; - int flags = patch->crlf_in_old ? APPLY_FLAGS_CR_AT_EOL : 0; previous = previous_patch(state, patch, &status); if (status) @@ -3457,8 +3456,8 @@ static int load_preimage(struct apply_state *state, /* We have a patched copy in memory; use that. */ strbuf_add(&buf, previous->result, previous->resultsize); } else { - status = load_patch_target(state, &buf, ce, st, - patch->old_name, patch->old_mode, flags); + status = load_patch_target(state, &buf, ce, st, patch, + patch->old_name, patch->old_mode); if (status < 0) return status; else if (status == SUBMODULE_PATCH_WITHOUT_INDEX) { @@ -3518,8 +3517,7 @@ static int three_way_merge(struct image *image, */ static int load_current(struct apply_state *state, struct image *image, - struct patch *patch, - int flags) + struct patch *patch) { struct strbuf buf = STRBUF_INIT; int status, pos; @@ -3546,7 +3544,7 @@ static int load_current(struct apply_state *state, if (verify_index_match(ce, &st)) return error(_("%s: does not match index"), name); - status = load_patch_target(state, &buf, ce, &st, name, mode, flags); + status = load_patch_target(state, &buf, ce, &st, patch, name, mode); if (status < 0) return status; else if (status) @@ -3597,8 +3595,7 @@ static int try_threeway(struct apply_state *state, /* our_oid is ours */ if (patch->is_new) { - int flags = 0; - if (load_current(state, &tmp_image, patch, flags)) + if (load_current(state, &tmp_image, patch)) return error(_("cannot read the current contents of '%s'"), patch->new_name); } else { @@ -3644,6 +3641,7 @@ static int apply_data(struct apply_state *state, struct patch *patch, struct stat *st, const struct cache_entry *ce) { struct image image; + if (load_preimage(state, &image, patch, st, ce) < 0) return -1; -- 2.14.1-331-g7631d96230 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case 2017-08-13 8:51 ` [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply tboegi ` (3 preceding siblings ...) 2017-08-16 18:30 ` [PATCH/FIXUP 5/2] apply: localize the CRLF business to read_old_data() Junio C Hamano @ 2017-08-16 18:34 ` Junio C Hamano 2017-08-17 6:24 ` Torsten Bögershausen 4 siblings, 1 reply; 36+ messages in thread From: Junio C Hamano @ 2017-08-16 18:34 UTC (permalink / raw) To: tboegi; +Cc: git, asottile With the previous fixes to CRLF handling in place, read_old_data() knows what it wants convert_to_git() to do with respect to CRLF. In fact, this codepath is about applying a patch to a file in the filesystem, which may not exist in the index, or may exist but may not match what is recorded in the index, or in the extreme case, we may not even be in a Git repository. If convert_to_git() peeked at the index while doing its work, it *would* be a bug. Pass NULL instead of &the_index to the function to make sure we catch future bugs to clarify this. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- * NOTE NOTE NOTE: This is not a part of the "squashed diff" I sent earlier, and with this applied, you will see failure in t0020. The breakage is because convert_to_git(), with your [PATCH 1/2], is not yet prepared to be told "there is no need for safe-crlf processing, so do not even look at the index". You either need to invent yet another flag similar to SAFE_CRLF_KEEP_CRLF that is different from SAFE_CRLF_FALSE to tell the machinery never to look at the index, or fix SAFE_CRLF_FALSE itself so that the index is not checked when the caller knows safe-crlf processing is not needed. apply.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-) diff --git a/apply.c b/apply.c index c06f7014a2..ad58cd1c77 100644 --- a/apply.c +++ b/apply.c @@ -2301,7 +2301,15 @@ static int read_old_data(struct stat *st, const char *path, struct strbuf *buf, case S_IFREG: if (strbuf_read_file(buf, path, st->st_size) != st->st_size) return error(_("unable to open or read %s"), path); - convert_to_git(&the_index, path, buf->buf, buf->len, buf, safe_crlf); + /* + * "git apply" without "--index/--cached" should never look + * at the index; the target file may not have been added to + * the index yet, and we may not even be in any Git repository. + * Pass NULL to convert_to_git() to stress this; the function + * should never look at the index when explicit crlf option + * is given. + */ + convert_to_git(NULL, path, buf->buf, buf->len, buf, safe_crlf); return 0; default: return -1; -- 2.14.1-331-g7631d96230 ^ permalink raw reply related [flat|nested] 36+ messages in thread
* Re: [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case 2017-08-16 18:34 ` [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case Junio C Hamano @ 2017-08-17 6:24 ` Torsten Bögershausen 2017-08-17 7:06 ` Junio C Hamano 2017-08-17 7:12 ` Junio C Hamano 0 siblings, 2 replies; 36+ messages in thread From: Torsten Bögershausen @ 2017-08-17 6:24 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, asottile On Wed, Aug 16, 2017 at 11:34:45AM -0700, Junio C Hamano wrote: > With the previous fixes to CRLF handling in place, read_old_data() > knows what it wants convert_to_git() to do with respect to CRLF. In > fact, this codepath is about applying a patch to a file in the > filesystem, which may not exist in the index, or may exist but may > not match what is recorded in the index, or in the extreme case, we > may not even be in a Git repository. If convert_to_git() peeked at > the index while doing its work, it *would* be a bug. > > Pass NULL instead of &the_index to the function to make sure we > catch future bugs to clarify this. Thanks for the work, and now our emails crossed. Calling convert_to_git(NULL,...) makes Git crash today, see t4124, my latest version, "LF in repo, CRLF in working tree) Unless we re-define the meaning of "NULL" into "don't do CRLF conversions, like SAFE_CRLF_KEEP_CRLF does. The combination of convert_to_git(NULL,...,SAFE_CRLF_KEEP_CRLF) is OK, but all others must supply an &index. At a very first glance the end result may look like this: - Take my changes in convert.[ch] - Take your changes/commit in apply.c (except the NULL, see above) - Take my changes in t4124. I don't have time to look at this today or tomorrow, please give a hint if you are working further. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case 2017-08-17 6:24 ` Torsten Bögershausen @ 2017-08-17 7:06 ` Junio C Hamano 2017-08-17 7:12 ` Junio C Hamano 1 sibling, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-17 7:06 UTC (permalink / raw) To: Torsten Bögershausen; +Cc: git, asottile Torsten Bögershausen <tboegi@web.de> writes: > I don't have time to look at this today or tomorrow, > please give a hint if you are working further. It is past my bedtime, and generally I prefer not to touch topics that I know other people are willing to look into, especially when I know those "other people" are well informed and capable. Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case 2017-08-17 6:24 ` Torsten Bögershausen 2017-08-17 7:06 ` Junio C Hamano @ 2017-08-17 7:12 ` Junio C Hamano 2017-08-17 8:24 ` Torsten Bögershausen 2017-08-17 17:07 ` Junio C Hamano 1 sibling, 2 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-17 7:12 UTC (permalink / raw) To: Torsten Bögershausen; +Cc: git, asottile Torsten Bögershausen <tboegi@web.de> writes: > Unless we re-define the meaning of "NULL" into "don't do CRLF conversions, > like SAFE_CRLF_KEEP_CRLF does. My preference is not to use NULL as any hint. Instead, the "flag" parameter we already pass to convert_to_git(), just like the updated read_old_data() uses SAFE_CRLF_KEEP_CRLF to tell it that it should not disturb existing CRLF without looking at the istate, should be used to tell convert_to_git() to do the opposite, but do so without looking at the istate. Perhaps SAFE_CRLF_FALSE should be such a flag. Or perhaps we need to invent another flag. I dunno. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case 2017-08-17 7:12 ` Junio C Hamano @ 2017-08-17 8:24 ` Torsten Bögershausen 2017-08-17 17:07 ` Junio C Hamano 1 sibling, 0 replies; 36+ messages in thread From: Torsten Bögershausen @ 2017-08-17 8:24 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, asottile On Thu, Aug 17, 2017 at 12:12:36AM -0700, Junio C Hamano wrote: > Torsten Bögershausen <tboegi@web.de> writes: > > > Unless we re-define the meaning of "NULL" into "don't do CRLF conversions, > > like SAFE_CRLF_KEEP_CRLF does. > > My preference is not to use NULL as any hint. Instead, the "flag" > parameter we already pass to convert_to_git(), just like the updated > read_old_data() uses SAFE_CRLF_KEEP_CRLF to tell it that it should > not disturb existing CRLF without looking at the istate, should be > used to tell convert_to_git() to do the opposite, but do so without > looking at the istate. > > Perhaps SAFE_CRLF_FALSE should be such a flag. Or perhaps we need > to invent another flag. I dunno. OK, message taken, in short: I will come up with a new series in a couple of days - thanks for the input. ^ permalink raw reply [flat|nested] 36+ messages in thread
* Re: [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case 2017-08-17 7:12 ` Junio C Hamano 2017-08-17 8:24 ` Torsten Bögershausen @ 2017-08-17 17:07 ` Junio C Hamano 1 sibling, 0 replies; 36+ messages in thread From: Junio C Hamano @ 2017-08-17 17:07 UTC (permalink / raw) To: Torsten Bögershausen; +Cc: git, asottile Junio C Hamano <gitster@pobox.com> writes: > Torsten Bögershausen <tboegi@web.de> writes: > >> Unless we re-define the meaning of "NULL" into "don't do CRLF conversions, >> like SAFE_CRLF_KEEP_CRLF does. > > My preference is not to use NULL as any hint. Instead, the "flag" > parameter we already pass to convert_to_git(), just like the updated > read_old_data() uses SAFE_CRLF_KEEP_CRLF to tell it that it should > not disturb existing CRLF without looking at the istate, should be > used to tell convert_to_git() to do the opposite, but do so without > looking at the istate. > > Perhaps SAFE_CRLF_FALSE should be such a flag. Or perhaps we need > to invent another flag. I dunno. I grepped for SAFE_CRLF_FALSE and found only two callers that explicitly pass it down the callchain, both of which essentially says "if we are writing the object out, use core.safecrlf, but if we are merely hashing, do SAFE_CRLF_FALSE thing". I think their use case is quite similar to the codepath we are discussing---they have a data that come from the outside world, and they know the index entry that happens to be at the path has nothing to do with the data they are asking convert_to_git() to massage (i.e. it is _wrong_ if the contents of the blob that happens to be in the index at the path affected the outcome of the conversion). So I think the fix to convert_to_git() can just use SAFE_CRLF_FALSE as such an instruction that tells the function not do the "safe crlf" thing, which looks at the contents in the index and decide if the CRLFs in the contents being converted should be turned into LFs. And because the function is told not to look at the index, it should be made safe to pass istate=NULL. There does not seem to be a need to invent another flag. Thanks. ^ permalink raw reply [flat|nested] 36+ messages in thread
end of thread, other threads:[~2017-08-19 11:28 UTC | newest] Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2017-08-01 18:24 core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Anthony Sottile 2017-08-01 20:47 ` Torsten Bögershausen 2017-08-01 20:58 ` Anthony Sottile 2017-08-02 15:44 ` Torsten Bögershausen 2017-08-02 20:42 ` [PATCH v1 1/1] correct apply for files commited with CRLF tboegi 2017-08-02 21:13 ` Junio C Hamano 2017-08-04 17:31 ` Torsten Bögershausen 2017-08-04 17:57 ` Junio C Hamano 2017-08-04 19:26 ` Junio C Hamano 2017-08-02 20:58 ` core.autocrlf=true causes `git apply` to fail on patch generated with `git diff-index HEAD --patch` Junio C Hamano 2017-08-12 5:45 ` Torsten Bögershausen 2017-08-12 5:53 ` Torsten Bögershausen 2017-08-12 14:56 ` [PATCH/RFC] convert: Add SAFE_CRLF_KEEP_CRLF tboegi 2017-08-12 14:56 ` [PATCH/RFC] File commited with CRLF should roundtrip diff and apply tboegi 2017-08-14 17:37 ` Junio C Hamano 2017-08-17 6:06 ` [PATCH v2 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi 2017-08-17 6:06 ` [PATCH v2 2/2] File commited with CRLF should roundtrip diff and apply tboegi 2017-08-17 6:37 ` Junio C Hamano 2017-08-17 21:43 ` [PATCH v3 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi 2017-08-17 21:43 ` [PATCH v3 2/2] File commited with CRLF should roundtrip diff and apply tboegi 2017-08-17 22:29 ` Junio C Hamano 2017-08-17 22:35 ` Junio C Hamano 2017-08-19 11:27 ` [PATCH v4 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi 2017-08-19 11:28 ` [PATCH v4 2/2] File commited with CRLF should roundtrip diff and apply tboegi 2017-08-13 8:51 ` [PATCH/RFC 1/2] convert: Add SAFE_CRLF_KEEP_CRLF tboegi 2017-08-13 8:51 ` [PATCH/RFC 2/2] File commited with CRLF should roundtrip diff and apply tboegi 2017-08-16 18:28 ` Junio C Hamano 2017-08-16 18:28 ` [PATCH/FIXUP 3/2] apply: remove unused member apply_state.flags Junio C Hamano 2017-08-16 18:29 ` [PATCH/FIXUP 4/2] apply: only pay attention to CRLF in the preimage Junio C Hamano 2017-08-16 18:30 ` [PATCH/FIXUP 5/2] apply: localize the CRLF business to read_old_data() Junio C Hamano 2017-08-16 18:34 ` [PATCH/FIXUP 6/2] apply: clarify read_old_data() is about no-index case Junio C Hamano 2017-08-17 6:24 ` Torsten Bögershausen 2017-08-17 7:06 ` Junio C Hamano 2017-08-17 7:12 ` Junio C Hamano 2017-08-17 8:24 ` Torsten Bögershausen 2017-08-17 17:07 ` Junio C Hamano
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.