All of lore.kernel.org
 help / color / mirror / Atom feed
* [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows
@ 2017-03-30 19:35 Jakub Narębski
  2017-03-30 20:00 ` Jeff King
  2017-03-31 12:38 ` Torsten Bögershausen
  0 siblings, 2 replies; 10+ messages in thread
From: Jakub Narębski @ 2017-03-30 19:35 UTC (permalink / raw)
  To: git

Hello,

Recently I had to work on a project which uses legacy 8-bit encoding
(namely cp1250 encoding) instead of utf-8 for text files (LaTeX
documents).  My terminal, that is Git Bash from Git for Windows is set
up for utf-8.

I wanted for "git diff" and friends to return something sane on said
utf-8 terminal, instead of mojibake.  There is 'encoding'
gitattribute... but it works only for GUI ('git gui', that is).

Therefore I have (ab)used textconv facility to convert from cp1250 of
file encoding to utf-8 encoding of console.

I have set the following in .gitattributes file:

  ## LaTeX documents in cp1250 encoding
  *.tex text diff=mylatex

The 'mylatex' driver is defined as:

  [diff "mylatex"]
        xfuncname = "^(\\\\((sub)*section|chapter|part)\\*{0,1}\\{.*)$"
        wordRegex = "\\\\[a-zA-Z]+|[{}]|\\\\.|[^\\{}[:space:]]+"
        textconv  = \"C:/Program Files/Git/usr/bin/iconv.exe\" -f cp1250 -t utf-8
        cachetextconv = true

And everything would be all right... if not the fact that Git appends
spurious ^M to added lines in the `git diff` output.  Files use CRLF
end-of-line convention (the native MS Windows one).

  $ git diff test.tex
  diff --git a/test.tex b/test.tex
  index 029646e..250ab16 100644
  --- a/test.tex
  +++ b/test.tex
  @@ -1,4 +1,4 @@
  -\documentclass{article}
  +\documentclass{mwart}^M
  
   \usepackage[cp1250]{inputenc}
   \usepackage{polski}

What gives?  Why there is this ^M tacked on the end of added lines,
while it is not present in deleted lines, nor in content lines?

Puzzled.

P.S. Git has `i18n.commitEncoding` and `i18n.logOutputEncoding`; pity
that it doesn't supports in core `encoding` attribute together with
having `i18n.outputEncoding`.
--
Jakub Narębski



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2017-04-02 11:41 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-03-30 19:35 [BUG?] iconv used as textconv, and spurious ^M on added lines on Windows Jakub Narębski
2017-03-30 20:00 ` Jeff King
2017-03-31 13:24   ` Jakub Narębski
2017-04-01  6:08     ` Jeff King
2017-04-01 18:31       ` Jakub Narębski
2017-04-02  7:45         ` Jeff King
2017-04-02 11:40           ` Jakub Narębski
2017-03-31 12:38 ` Torsten Bögershausen
2017-03-31 19:44   ` Jakub Narębski
2017-04-02  4:34     ` Torsten Bögershausen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.