* [BUG] Mojibake in git gui and gitk for certain unicode chars
@ 2015-01-22 11:43 Tobias Getzner
  2015-01-22 12:18 ` Tobias Getzner
  0 siblings, 1 reply; 2+ messages in thread
From: Tobias Getzner @ 2015-01-22 11:43 UTC (permalink / raw)
  To: git

Hello,

I’ve noticed git gui and gitk seem to have problems decoding certain
unicode characters. E.g., when a commit contains the character «👍»
(thumbs up sign; U+1F44D) in UTF-8 encoding, this character will show
as «ðŸ‘» in gitk. git gui also displays it using the same sequence.
When trying to stage lines within the context of such characters, the
program will error out (corrupt patch).

The character sequence appears to be mojibake introduced by decoding
UTF-8 as ISO-8859-1. However, my locale is set to «en_US.utf8». git gui
is also set to assume UTF-8 encoding for files, and in the list menu
where this encoding is selected, it lists the UTF-8 option under
«system encoding», which suggests that my locale is correctly picked
up.

Are there perchance any heuristics in place which try decoding files
as UTF-8, with a fall-back to latin1? If so, then potentially the bug
could be due to U+1F44D tripping up the decoder, triggering a
fall-back, and rendering the characters as mojibake.
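
A minimal Python sketch of the kind of heuristic asked about above, purely
as an illustration. The function name decode_with_fallback is hypothetical
and is not taken from git gui or gitk; whether those programs do anything
like this is exactly the open question.

```python
def decode_with_fallback(raw: bytes) -> str:
    """Hypothetical heuristic: try strict UTF-8, fall back to Latin-1."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never fails,
        # but multi-byte UTF-8 sequences come out as mojibake.
        return raw.decode("latin-1")


thumbs_up = "\U0001F44D".encode("utf-8")  # the four bytes f0 9f 91 8d

# A conforming UTF-8 decoder accepts U+1F44D, so no fall-back happens here.
print(ascii(decode_with_fallback(thumbs_up)))

# Input the decoder rejects (here: a truncated sequence) triggers the
# fall-back and yields Latin-1 mojibake instead.
print(ascii(decode_with_fallback(thumbs_up[:-1])))
```

Note that the fall-back only fires when the UTF-8 decoder rejects the
input. For the symptom described here, that would require a decoder which
does not accept four-byte sequences, which is an assumption, not a finding.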

I’ve noticed a perhaps related glitch when the options dialog in git gui is
shown. My committer name contains the character «ß» (latin small letter
sharp s; U+00DF). The text field in the options dialog displays this as
«ÃŸ», which also seems to be UTF-8 to latin1 mojibake. Curiously, the
same character displays just fine when staging parts of files via git
gui, so the issue is not quite the same as the one described above.

Best regards,
Tobias

* Re: [BUG] Mojibake in git gui and gitk for certain unicode chars
  2015-01-22 11:43 [BUG] Mojibake in git gui and gitk for certain unicode chars Tobias Getzner
@ 2015-01-22 12:18 ` Tobias Getzner
  0 siblings, 0 replies; 2+ messages in thread
From: Tobias Getzner @ 2015-01-22 12:18 UTC (permalink / raw)
  To: git

On Thu, 2015-01-22 at 12:43 +0100, Tobias Getzner wrote:
> Hello,
> 
> I’ve noticed git gui and gitk seem to have problems decoding certain
> unicode characters. E.g., when a commit contains the character «👍»
> (thumbs up sign; U+1F44D) in UTF-8 encoding, this character will show
> as «ðŸ‘» in gitk. 

> I’ve noticed a perhaps related glitch when the options dialog in git gui is
> shown. My committer name contains the character «ß» (latin small letter
> sharp s; U+00DF). The text field in the options dialog displays this as
> «ÃŸ»,

I suppose that some of the mojibake characters in my message might
have been stripped out because they are control chars.
So, «👍» was rendered as «ð\x9f\x91\x8d». «ß» was rendered as «Ã\x9f».
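
The byte sequences above can be reproduced with a short Python sketch
that encodes each character as UTF-8 and then decodes the bytes as
ISO-8859-1. The helper names mojibake and repair are made up for this
illustration; the sketch only shows the encoding mix-up itself and says
nothing about where in git gui or gitk it happens.

```python
def mojibake(text: str) -> str:
    """Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo the mix-up: recover the original bytes, decode them as UTF-8."""
    return garbled.encode("latin-1").decode("utf-8")


for ch in ("\U0001F44D", "\u00DF"):  # thumbs up sign, sharp s
    garbled = mojibake(ch)
    # ascii() makes the invisible C1 control characters visible as escapes.
    print(ascii(ch), "->", ascii(garbled))
    assert repair(garbled) == ch
```

Because ISO-8859-1 maps every byte to a code point, the round trip is
lossless, which is why the garbled text can be converted back as long as
no characters were stripped along the way.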
