* [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Jakub Narębski @ 2017-03-30 19:35 UTC
To: git

Hello,

Recently I had to work on a project which uses legacy 8-bit encoding
(namely cp1250 encoding) instead of utf-8 for text files (LaTeX
documents).  My terminal, that is Git Bash from Git for Windows, is set
up for utf-8.

I wanted "git diff" and friends to return something sane on said
utf-8 terminal, instead of mojibake.  There is the 'encoding'
gitattribute... but it works only for the GUI ('git gui', that is).

Therefore I have (ab)used the textconv facility to convert from the cp1250
file encoding to the utf-8 encoding of the console.

I have set the following in the .gitattributes file:

  ## LaTeX documents in cp1250 encoding
  *.tex text diff=mylatex

The 'mylatex' driver is defined as:

  [diff "mylatex"]
        xfuncname = "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
        wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
        textconv  = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t utf-8
        cachetextconv = true

And everything would be all right... if not for the fact that Git appends
a spurious ^M to added lines in the `git diff` output.  The files use the CRLF
end-of-line convention (the native MS Windows one).

  $ git diff test.tex
  diff --git a/test.tex b/test.tex
  index 029646e..250ab16 100644
  --- a/test.tex
  +++ b/test.tex
  @@ -1,4 +1,4 @@
  -\documentclass{article}
  +\documentclass{mwart}^M

   \usepackage[cp1250]{inputenc}
   \usepackage{polski}

What gives?  Why is there this ^M tacked on the end of added lines,
while it is not present in deleted lines, nor in context lines?

Puzzled.

P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; pity
that it doesn't support the `encoding` attribute in core, together with
having `i18n.outputEncoding`.
--
Jakub Narębski

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Jeff King @ 2017-03-30 20:00 UTC
To: Jakub Narębski; +Cc: git

On Thu, Mar 30, 2017 at 09:35:27PM +0200, Jakub Narębski wrote:

> And everything would be all right... if not for the fact that Git appends
> a spurious ^M to added lines in the `git diff` output.  The files use the CRLF
> end-of-line convention (the native MS Windows one).
>
>   $ git diff test.tex
>   diff --git a/test.tex b/test.tex
>   index 029646e..250ab16 100644
>   --- a/test.tex
>   +++ b/test.tex
>   @@ -1,4 +1,4 @@
>   -\documentclass{article}
>   +\documentclass{mwart}^M
>
>    \usepackage[cp1250]{inputenc}
>    \usepackage{polski}
>
> What gives?  Why is there this ^M tacked on the end of added lines,
> while it is not present in deleted lines, nor in context lines?

Perhaps it's trailing whitespace highlighting for added lines? You can
add "cr-at-eol" to core.whitespace to suppress it.

I suspect in the normal case that git is doing line-ending conversion,
but it's suppressed when textconv is in use.

-Peff
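
The core.whitespace suggestion can be tried roughly as follows. This is
only a minimal sketch: the scratch repository created with mktemp is an
assumption made to keep the example self-contained; in practice the
same `git config` call would be run inside the affected repository.

```shell
# Sketch only: a scratch repository stands in for the real project.
repo=$(mktemp -d)
git init -q "$repo"

# Treat a CR at end of line as part of the line terminator, so it is
# no longer flagged as trailing whitespace on added lines.
git -C "$repo" config core.whitespace cr-at-eol

# Show the resulting setting.
git -C "$repo" config --get core.whitespace
```

Listing cr-at-eol only adds that rule; the whitespace checks that are
enabled by default stay enabled.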

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Jakub Narębski @ 2017-03-31 13:24 UTC
To: Jeff King; +Cc: git

On 30.03.2017 at 22:00, Jeff King wrote:
> On Thu, Mar 30, 2017 at 09:35:27PM +0200, Jakub Narębski wrote:
>
>> And everything would be all right... if not for the fact that Git appends
>> a spurious ^M to added lines in the `git diff` output.  The files use the CRLF
>> end-of-line convention (the native MS Windows one).
>>
>>   $ git diff test.tex
>>   diff --git a/test.tex b/test.tex
>>   index 029646e..250ab16 100644
>>   --- a/test.tex
>>   +++ b/test.tex
>>   @@ -1,4 +1,4 @@
>>   -\documentclass{article}
>>   +\documentclass{mwart}^M
>>
>>    \usepackage[cp1250]{inputenc}
>>    \usepackage{polski}
>>
>> What gives?  Why is there this ^M tacked on the end of added lines,
>> while it is not present in deleted lines, nor in context lines?

Gah, I forgot that Git for Windows installed with default options
uses `core.autocrlf=true`, so file contents are stored in the repository
and in the index using the LF end-of-line convention -- that is why there
is no ^M in the pre-image (in removed lines).

> Perhaps it's trailing whitespace highlighting for added lines? You can
> add "cr-at-eol" to core.whitespace to suppress it.

Thanks! That solves the problem (or rather works around it).

> I suspect in the normal case that git is doing line-ending conversion,
> but it's suppressed when textconv is in use.

I would not consider this a bug if not for the fact that there is no ^M
without using iconv as textconv.

Compare (without textconv => no ^M, but mojibake):

  $ git diff test.txt
  diff --git a/test.txt b/test.txt
  index 029646e..38cd657 100644
  --- a/test.txt
  +++ b/test.txt
  @@ -1,9 +1,10 @@
  -\documentclass{article}
  +\documentclass{mwart}

   \usepackage[cp1250]{inputenc}
   \usepackage{polski}
   \begin{document}
  +Za<BF><F3><B3><E6> g<EA><9C>l<B9> ja<9F><F1>!
   \end{document}

with the following (with textconv => no gibberish, but ^M):

  $ git diff test.tex
  diff --git a/test.tex b/test.tex
  index 029646e..38cd657 100644
  --- a/test.tex
  +++ b/test.tex
  @@ -1,9 +1,10 @@
  -\documentclass{article}
  +\documentclass{mwart}^M

   \usepackage[cp1250]{inputenc}
   \usepackage{polski}
   \begin{document}
  +Zażółć gęślą jaźń!^M
   \end{document}

--
Jakub Narębski

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Jeff King @ 2017-04-01 6:08 UTC
To: Jakub Narębski; +Cc: git

On Fri, Mar 31, 2017 at 03:24:48PM +0200, Jakub Narębski wrote:

> > I suspect in the normal case that git is doing line-ending conversion,
> > but it's suppressed when textconv is in use.
>
> I would not consider this a bug if not for the fact that there is no ^M
> without using iconv as textconv.

I don't think it's a bug, though. You have told Git that you will
convert the contents (whatever their format) into the canonical format,
but your program to do so includes a CR.

We _could_ further process with other canonicalizations, but I'm not
sure that is a good idea (line-endings sound reasonably harmless, but
almost certainly we should not be doing clean/smudge filtering). And I'm
not sure if there would be any compatibility fallouts.

So I think the behavior is perhaps not what you want, but it's not an
unreasonable one. And the solution is to define your textconv such that
it produces clean LF-only output. Perhaps:

  [diff.whatever]
  textconv = "iconv ... | tr -d '\r'"

?

-Peff
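
One way to get LF-only output from a textconv is a small wrapper
script. Git runs the textconv program with a single argument, the name
of the file to convert, so the wrapper can hand that name to iconv and
pipe the result through tr. The following is only a sketch under
stated assumptions: the script name `cp1250-textconv` and the scratch
directory are made up for illustration, and the local iconv is assumed
to know the cp1250 encoding.

```shell
# Sketch only: the wrapper name and the scratch directory are made up.
dir=$(mktemp -d)

# Git runs the textconv program with one argument, the file to convert.
cat >"$dir/cp1250-textconv" <<'EOF'
#!/bin/sh
# Convert cp1250 to utf-8 and drop CRs so the diff sees LF-only text.
iconv -f cp1250 -t utf-8 "$1" | tr -d '\r'
EOF
chmod +x "$dir/cp1250-textconv"

# Try it on a CRLF sample file and dump the bytes that come out.
printf 'foo\r\n' >"$dir/sample.tex"
sh "$dir/cp1250-textconv" "$dir/sample.tex" | od -c
```

The diff driver would then name the script instead of a pipeline, for
example `textconv = /path/to/cp1250-textconv` in the driver section
(the path is a placeholder).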

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Jakub Narębski @ 2017-04-01 18:31 UTC
To: Jeff King; +Cc: git

On 01.04.2017 at 08:08, Jeff King wrote:
> On Fri, Mar 31, 2017 at 03:24:48PM +0200, Jakub Narębski wrote:
>
>>> I suspect in the normal case that git is doing line-ending conversion,
>>> but it's suppressed when textconv is in use.
>>
>> I would not consider this a bug if not for the fact that there is no ^M
>> without using iconv as textconv.
>
> I don't think it's a bug, though. You have told Git that you will
> convert the contents (whatever their format) into the canonical format,
> but your program to do so includes a CR.

Well, I have not declared the file binary with "binary = true" in the diff
driver definition, have I?

> We _could_ further process with other canonicalizations, but I'm not
> sure that is a good idea (line-endings sound reasonably harmless, but
> almost certainly we should not be doing clean/smudge filtering). And I'm
> not sure if there would be any compatibility fallouts.

Yes, gitattributes(5) defines the interaction between the 'text'/'eol',
'ident' and 'filter' attributes, but says nothing about 'diff' and
'text'/'eol'.

> So I think the behavior is perhaps not what you want, but it's not an
> unreasonable one. And the solution is to define your textconv such that
> it produces clean LF-only output. Perhaps:
>
>   [diff.whatever]
>   textconv = "iconv ... | tr -d '\r'"

Well, either this (or an equivalent using dos2unix), or using the
'whitespace' attribute (or 'core.whitespace') with cr-at-eol.

P.S. What do you think about Git supporting an 'encoding' attribute (or
'core.encoding' config) plus 'core.outputEncoding' in-core?

Best,
--
Jakub Narębski

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Jeff King @ 2017-04-02 7:45 UTC
To: Jakub Narębski; +Cc: git

On Sat, Apr 01, 2017 at 08:31:27PM +0200, Jakub Narębski wrote:

> On 01.04.2017 at 08:08, Jeff King wrote:
> > On Fri, Mar 31, 2017 at 03:24:48PM +0200, Jakub Narębski wrote:
> >
> >>> I suspect in the normal case that git is doing line-ending conversion,
> >>> but it's suppressed when textconv is in use.
> >>
> >> I would not consider this a bug if not for the fact that there is no ^M
> >> without using iconv as textconv.
> >
> > I don't think it's a bug, though. You have told Git that you will
> > convert the contents (whatever their format) into the canonical format,
> > but your program to do so includes a CR.
>
> Well, I have not declared the file binary with "binary = true" in the diff
> driver definition, have I?

I don't think binary has anything to do with it. A textconv filter takes
input (binary or not) and delivers a normalized representation to feed
to the diff algorithm. There's no further post-processing, and it's the
responsibility of the filter to deliver the bytes it wants diffed.

Like I said, I could see an argument for treating the filter output as
text to be pre-processed, but that's not how it works (and I don't think
it is a good idea to change it now, unless by adding an option to the
diff filter).

> P.S. What do you think about Git supporting an 'encoding' attribute (or
> 'core.encoding' config) plus 'core.outputEncoding' in-core?

Supporting an "encoding" attribute to normalize file encodings in diffs
seems reasonable to me. But it would have to be enabled only for
human-readable diffs, as the result could not be applied (so the same as
textconv).

I don't think core.outputEncoding is necessarily a good idea. We are not
really equipped to handle anything that isn't an ascii superset, as we
intermingle the bytes with ascii diff headers (though I think that is true
of the commitEncoding stuff; I assume everything breaks horribly if you
tried to set that to UTF-16, but I've never tried it).

-Peff
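
The "ascii superset" concern can be illustrated at the byte level. This
is only a sketch and does not exercise Git itself; it assumes an iconv
that accepts the encoding names ASCII, UTF-8 and UTF-16LE.

```shell
# The letter "a" stays the single byte 0x61 when converted to utf-8 ...
printf 'a' | iconv -f ASCII -t UTF-8 | od -An -tx1

# ... but becomes two bytes in UTF-16LE, one of them NUL, so such text
# cannot simply be interleaved with plain-ascii diff headers.
printf 'a' | iconv -f ASCII -t UTF-16LE | od -An -tx1
```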

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Jakub Narębski @ 2017-04-02 11:40 UTC
To: Jeff King; +Cc: git

On 02.04.2017 at 09:45, Jeff King wrote:
> On Sat, Apr 01, 2017 at 08:31:27PM +0200, Jakub Narębski wrote:
>
>> On 01.04.2017 at 08:08, Jeff King wrote:
>>> On Fri, Mar 31, 2017 at 03:24:48PM +0200, Jakub Narębski wrote:
>>>
>>>>> I suspect in the normal case that git is doing line-ending conversion,
>>>>> but it's suppressed when textconv is in use.
>>>>
>>>> I would not consider this a bug if not for the fact that there is no ^M
>>>> without using iconv as textconv.
>>>
>>> I don't think it's a bug, though. You have told Git that you will
>>> convert the contents (whatever their format) into the canonical format,
>>> but your program to do so includes a CR.
>>
>> Well, I have not declared the file binary with "binary = true" in the diff
>> driver definition, have I?
>
> I don't think binary has anything to do with it. A textconv filter takes
> input (binary or not) and delivers a normalized representation to feed
> to the diff algorithm. There's no further post-processing, and it's the
> responsibility of the filter to deliver the bytes it wants diffed.
>
> Like I said, I could see an argument for treating the filter output as
> text to be pre-processed, but that's not how it works (and I don't think
> it is a good idea to change it now, unless by adding an option to the
> diff filter).

I think that actually there is something wrong.  If textconv really gets
the normalized representation of the pre-image (the index version) and the
post-image (the filesystem version), as I think it should, both pre-image
lines ('-') and post-image lines ('+') should use CRLF, so there should
be no warning (i.e. ^M).

Or the textconv filter gets the normalized representation (it looks this
way when examining the diff result saved to a file with
`git diff test.tex >test.diff`; I was unable to use `tr '\r' 'Q'`: either
I got "fatal: bad config line" from Git, or "tr: extra operand" from tr),
and somehow Git mistakes what is happening and writes those ^M.

If I understand it correctly, if the pre-image, post-image and context all
use the same eol, there should be no warning, should there?

>> P.S. What do you think about Git supporting an 'encoding' attribute (or
>> 'core.encoding' config) plus 'core.outputEncoding' in-core?
>
> Supporting an "encoding" attribute to normalize file encodings in diffs
> seems reasonable to me. But it would have to be enabled only for
> human-readable diffs, as the result could not be applied (so the same as
> textconv).

I was thinking about human-readable diffs, and 'git show <blob>', same
as with textconv.

> I don't think core.outputEncoding is necessarily a good idea. We are not
> really equipped to handle anything that isn't an ascii superset, as we
> intermingle the bytes with ascii diff headers (though I think that is true
> of the commitEncoding stuff; I assume everything breaks horribly if you
> tried to set that to UTF-16, but I've never tried it).

Well, the understanding would be that the same limitation applies as for
i18n.logOutputEncoding (to be documented if it isn't): only encodings
that are ASCII compatible are supported.

--
Jakub Narębski

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Torsten Bögershausen @ 2017-03-31 12:38 UTC
To: Jakub Narębski, git

On 30.03.17 21:35, Jakub Narębski wrote:
> Hello,
>
> Recently I had to work on a project which uses legacy 8-bit encoding
> (namely cp1250 encoding) instead of utf-8 for text files (LaTeX
> documents).  My terminal, that is Git Bash from Git for Windows, is set
> up for utf-8.
>
> I wanted "git diff" and friends to return something sane on said
> utf-8 terminal, instead of mojibake.  There is the 'encoding'
> gitattribute... but it works only for the GUI ('git gui', that is).
>
> Therefore I have (ab)used the textconv facility to convert from the cp1250
> file encoding to the utf-8 encoding of the console.
>
> I have set the following in the .gitattributes file:
>
>   ## LaTeX documents in cp1250 encoding
>   *.tex text diff=mylatex
>
> The 'mylatex' driver is defined as:
>
>   [diff "mylatex"]
>         xfuncname = "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
>         wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
>         textconv  = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t utf-8
>         cachetextconv = true
>
> And everything would be all right... if not for the fact that Git appends
> a spurious ^M to added lines in the `git diff` output.  The files use the CRLF
> end-of-line convention (the native MS Windows one).
>
>   $ git diff test.tex
>   diff --git a/test.tex b/test.tex
>   index 029646e..250ab16 100644
>   --- a/test.tex
>   +++ b/test.tex
>   @@ -1,4 +1,4 @@
>   -\documentclass{article}
>   +\documentclass{mwart}^M
>
>    \usepackage[cp1250]{inputenc}
>    \usepackage{polski}
>
> What gives?  Why is there this ^M tacked on the end of added lines,
> while it is not present in deleted lines, nor in context lines?
>
> Puzzled.
>
> P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; pity
> that it doesn't support the `encoding` attribute in core, together with
> having `i18n.outputEncoding`.
> --
> Jakub Narębski

Is there a chance to give us a recipe for how to reproduce it?
A complete test script, or similar?
(I don't want to speculate whether the invocation of iconv is the problem,
where stdout is not in "binary mode", or however this is called under Windows)

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Jakub Narębski @ 2017-03-31 19:44 UTC
To: Torsten Bögershausen, git

On 31.03.2017 at 14:38, Torsten Bögershausen wrote:
> On 30.03.17 21:35, Jakub Narębski wrote:
>> Hello,
>>
>> Recently I had to work on a project which uses legacy 8-bit encoding
>> (namely cp1250 encoding) instead of utf-8 for text files (LaTeX
>> documents).  My terminal, that is Git Bash from Git for Windows, is set
>> up for utf-8.
>>
>> I wanted "git diff" and friends to return something sane on said
>> utf-8 terminal, instead of mojibake.  There is the 'encoding'
>> gitattribute... but it works only for the GUI ('git gui', that is).
>>
>> Therefore I have (ab)used the textconv facility to convert from the cp1250
>> file encoding to the utf-8 encoding of the console.
>>
>> I have set the following in the .gitattributes file:
>>
>>   ## LaTeX documents in cp1250 encoding
>>   *.tex text diff=mylatex
>>
>> The 'mylatex' driver is defined as:
>>
>>   [diff "mylatex"]
>>         xfuncname = "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
>>         wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
>>         textconv  = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t utf-8
>>         cachetextconv = true
>>
>> And everything would be all right... if not for the fact that Git appends
>> a spurious ^M to added lines in the `git diff` output.  The files use the CRLF
>> end-of-line convention (the native MS Windows one).
>>
>>   $ git diff test.tex
>>   diff --git a/test.tex b/test.tex
>>   index 029646e..250ab16 100644
>>   --- a/test.tex
>>   +++ b/test.tex
>>   @@ -1,4 +1,4 @@
>>   -\documentclass{article}
>>   +\documentclass{mwart}^M
>>
>>    \usepackage[cp1250]{inputenc}
>>    \usepackage{polski}
>>
>> What gives?  Why is there this ^M tacked on the end of added lines,
>> while it is not present in deleted lines, nor in context lines?
>>
>> Puzzled.
>>
>> P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; pity
>> that it doesn't support the `encoding` attribute in core, together with
>> having `i18n.outputEncoding`.
>
> Is there a chance to give us a recipe for how to reproduce it?
> A complete test script, or similar?
> (I don't want to speculate whether the invocation of iconv is the problem,
> where stdout is not in "binary mode", or however this is called under Windows)

I'm sorry, I thought I had posted the whole recipe, but I missed some
details in the above description of the case.

First, files are stored on the filesystem using CRLF eol (DOS end-of-line
convention).  Due to `core.autocrlf` they are converted to LF in blobs,
that is in the index and in the repository.

Second, a textconv with a filter preserving end-of-line needs to be
configured.  I have used `iconv`, but I suspect that the problem would
also happen with `cat`.

In the .gitattributes file, or .git/info/attributes add, for example:

  *.tex text diff=myconv

In the .git/config configure the textconv filter, for example:

  [diff "myconv"]
        textconv = iconv.exe -f cp1250 -t utf-8

Create a file whose filename matches the attribute line, and which
uses the CRLF end-of-line convention, and add it to Git (adding it to
the index):

  $ printf "foo\r\n" >foo.tex
  $ git add foo.tex

Modify the file (also with CRLF):

  $ printf "bar\r\n" >foo.tex

Check the difference:

  $ git diff foo.tex

HTH
--
Jakub Narębski
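
The recipe can be collected into one script, sketched below under
stated assumptions: a scratch repository created with mktemp stands in
for the real project, and `cat` replaces iconv.exe as the
end-of-line-preserving filter (the message only suspects that `cat`
behaves the same way). The result is not asserted here; the last step
merely makes any CR in the diff visible as Q.

```shell
# Sketch only: scratch repository; `cat` stands in for iconv.exe.
repo=$(mktemp -d)
git init -q "$repo"

# Same settings as in the recipe: CRLF on disk, LF in the index.
git -C "$repo" config core.autocrlf true
printf '*.tex text diff=myconv\n' >"$repo/.gitattributes"
git -C "$repo" config diff.myconv.textconv cat

# Add a CRLF file, then modify it (still CRLF).
printf 'foo\r\n' >"$repo/foo.tex"
git -C "$repo" add foo.tex 2>/dev/null
printf 'bar\r\n' >"$repo/foo.tex"

# Turn any CR into Q so it shows up without relying on the terminal.
out=$(git -C "$repo" diff foo.tex 2>/dev/null | tr '\r' Q)
printf '%s\n' "$out"
```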

* Re: [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
From: Torsten Bögershausen @ 2017-04-02 4:34 UTC
To: Jakub Narębski, git

On 2017-03-31 21:44, Jakub Narębski wrote:
> On 31.03.2017 at 14:38, Torsten Bögershausen wrote:
>> On 30.03.17 21:35, Jakub Narębski wrote:
>>> Hello,
>>>
>>> Recently I had to work on a project which uses legacy 8-bit encoding
>>> (namely cp1250 encoding) instead of utf-8 for text files (LaTeX
>>> documents).  My terminal, that is Git Bash from Git for Windows, is set
>>> up for utf-8.
>>>
>>> I wanted "git diff" and friends to return something sane on said
>>> utf-8 terminal, instead of mojibake.  There is the 'encoding'
>>> gitattribute... but it works only for the GUI ('git gui', that is).
>>>
>>> Therefore I have (ab)used the textconv facility to convert from the cp1250
>>> file encoding to the utf-8 encoding of the console.
>>>
>>> I have set the following in the .gitattributes file:
>>>
>>>   ## LaTeX documents in cp1250 encoding
>>>   *.tex text diff=mylatex
>>>
>>> The 'mylatex' driver is defined as:
>>>
>>>   [diff "mylatex"]
>>>         xfuncname = "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
>>>         wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
>>>         textconv  = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t utf-8
>>>         cachetextconv = true
>>>
>>> And everything would be all right... if not for the fact that Git appends
>>> a spurious ^M to added lines in the `git diff` output.  The files use the CRLF
>>> end-of-line convention (the native MS Windows one).
>>>
>>>   $ git diff test.tex
>>>   diff --git a/test.tex b/test.tex
>>>   index 029646e..250ab16 100644
>>>   --- a/test.tex
>>>   +++ b/test.tex
>>>   @@ -1,4 +1,4 @@
>>>   -\documentclass{article}
>>>   +\documentclass{mwart}^M
>>>
>>>    \usepackage[cp1250]{inputenc}
>>>    \usepackage{polski}
>>>
>>> What gives?  Why is there this ^M tacked on the end of added lines,
>>> while it is not present in deleted lines, nor in context lines?
>>>
>>> Puzzled.
>>>
>>> P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; pity
>>> that it doesn't support the `encoding` attribute in core, together with
>>> having `i18n.outputEncoding`.
>>
>> Is there a chance to give us a recipe for how to reproduce it?
>> A complete test script, or similar?
>> (I don't want to speculate whether the invocation of iconv is the problem,
>> where stdout is not in "binary mode", or however this is called under Windows)
>
> I'm sorry, I thought I had posted the whole recipe, but I missed some
> details in the above description of the case.
>
> First, files are stored on the filesystem using CRLF eol (DOS end-of-line
> convention).  Due to `core.autocrlf` they are converted to LF in blobs,
> that is in the index and in the repository.
>
> Second, a textconv with a filter preserving end-of-line needs to be
> configured.  I have used `iconv`, but I suspect that the problem would
> also happen with `cat`.
>
> In the .gitattributes file, or .git/info/attributes add, for example:
>
>   *.tex text diff=myconv
>
> In the .git/config configure the textconv filter, for example:
>
>   [diff "myconv"]
>         textconv = iconv.exe -f cp1250 -t utf-8
>
> Create a file whose filename matches the attribute line, and which
> uses the CRLF end-of-line convention, and add it to Git (adding it to
> the index):
>
>   $ printf "foo\r\n" >foo.tex
>   $ git add foo.tex
>
> Modify the file (also with CRLF):
>
>   $ printf "bar\r\n" >foo.tex
>
> Check the difference:
>
>   $ git diff foo.tex
>
> HTH
>

There seems to be a bug in Git when it comes to "git diff".
Before we feed the content of the working tree into the diff machinery,
a call to convert_to_git() should be made.
But it seems as if something is missing: the expected "+fox" becomes
"+foxQ".

#!/bin/sh

test_description='CRLF with diff filter'

. ./test-lib.sh

test_expect_success 'setup' '
	git config core.autocrlf input &&
	printf "foo\r\n" >foo.tex &&
	git add foo.tex &&
	echo >.gitattributes &&
	git checkout -b master &&
	git add .gitattributes &&
	git commit -m "Add foo.txt" &&
	cat >.git/config <<-\EOF
	[diff "myconv"]
	textconv = sed -e "s/f/g"
	EOF
'

test_expect_success 'check EOL in diff' '
	printf "fox\r\n" >foo.tex &&
	cat >expect <<-\EOF &&
	diff --git a/foo.tex b/foo.tex
	index 257cc56..88c2893 100644
	--- a/foo.tex
	+++ b/foo.tex
	@@ -1 +1 @@
	-foo
	+fox
	EOF
	git diff foo.tex | tr "\015" Q >actual &&
	test_cmp expect actual
'

test_done

Thread overview: 10+ messages

2017-03-30 19:35 [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows Jakub Narębski
2017-03-30 20:00 ` Jeff King
2017-03-31 13:24   ` Jakub Narębski
2017-04-01  6:08     ` Jeff King
2017-04-01 18:31       ` Jakub Narębski
2017-04-02  7:45         ` Jeff King
2017-04-02 11:40           ` Jakub Narębski
2017-03-31 12:38 ` Torsten Bögershausen
2017-03-31 19:44   ` Jakub Narębski
2017-04-02  4:34     ` Torsten Bögershausen