From: Peter Krefting <peter@softwolves.pp.se>
To: Esko Luontola <esko.luontola@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: Cross-Platform Version Control
Date: Thu, 14 May 2009 14:48:40 +0100 (CET)
Message-ID: <alpine.DEB.2.00.0905141441200.20117@perkele.intern.softwolves.pp.se>
In-Reply-To: <419AD153-53B4-4DAB-AF72-4127C17B1CA0@gmail.com>

Esko Luontola:

> A good start for making Git cross-platform, would be storing the text 
> encoding of every file name and commit message together with the commit.

Is it really necessary to store the encoding for every single file name?
Should it not be enough to store encoding information for all file names
at once (i.e., for the object that contains the list of file names and
their associated blobs)?

I did publish, as a request for comments, the beginnings of a patch that
would change the Windows version of Git to expect file names to be UTF-8
encoded. It drew some comments, in particular that I could not simply
assume that UTF-8 was the right encoding.

Perhaps it would be easier to do if we added some meta-data, maybe using
the same fall-back mechanism as for commit messages (i.e., assume UTF-8
unless otherwise specified).
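
To make the fall-back idea concrete, here is a minimal Python sketch. It is
not Git code: the function name decode_name and the optional "declared"
argument are hypothetical stand-ins for whatever meta-data would carry the
encoding. It only shows the rule "assume UTF-8 unless otherwise specified".

```python
from typing import Optional

# Assumption: UTF-8 is the default when no encoding meta-data is present.
DEFAULT_ENCODING = "utf-8"


def decode_name(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode a stored file name.

    `declared` is the (hypothetical) encoding recorded alongside the
    file names; when it is missing, fall back to UTF-8.
    """
    return raw.decode(declared or DEFAULT_ENCODING)
```

With this, decode_name(b"r\xc3\xa4k.txt") and
decode_name(b"r\xe4k.txt", "iso-8859-1") yield the same file name.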

On Windows, the file APIs allow you to use Unicode (UTF-16) to specify file 
names, and the file systems will handle any necessary conversion to whatever 
byte sequences are used to store the file names. UTF-16 and UTF-8 are 
trivial to convert between, and Windows does contain APIs to convert between 
other character encodings and UTF-16.
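
As an illustration of how mechanical the UTF-8/UTF-16 conversion is, here
is a small Python sketch using the standard codecs. The helper names are
made up for this example; a real Windows port would presumably call the
platform conversion APIs from C instead. Little-endian UTF-16 without a
byte order mark is assumed.

```python
def utf8_to_utf16(raw: bytes) -> bytes:
    """Convert a UTF-8 byte string to UTF-16 (little-endian, no BOM)."""
    return raw.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(wide: bytes) -> bytes:
    """Convert UTF-16 (little-endian, no BOM) back to UTF-8."""
    return wide.decode("utf-16-le").encode("utf-8")
```

The round trip is lossless for any valid input: converting a UTF-8 name to
UTF-16 and back gives the original bytes.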

On Mac OS X, I believe the file system APIs assume you use some kind of
normalized UTF-8. It should be possible to produce that as well,
converting back and forth between normalization forms if necessary.
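
A sketch of what converting between normalization forms looks like, using
Python's standard unicodedata module. Which form Mac OS X actually expects
is not established here; the example assumes the decomposed form (NFD) on
the file system side and the composed form (NFC) in the repository, purely
for illustration. The helper names are hypothetical.

```python
import unicodedata


def to_decomposed(name: str) -> str:
    """Return the name in NFD (base letter followed by combining marks)."""
    return unicodedata.normalize("NFD", name)


def to_composed(name: str) -> str:
    """Return the name in NFC (precomposed characters where they exist)."""
    return unicodedata.normalize("NFC", name)
```

For example, "\u00e4" (a with diaeresis) decomposes to "a\u0308", and
composing that again gives back the single code point.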

On Linux and other Unixes we could just use iconv() to convert from the 
repository file name encoding to whatever the current locale has set up. The 
trick here is to handle file names outside the current encoding. Some kind 
of escaping mechanism will probably need to be introduced.
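
To show what such an escaping mechanism might look like, here is a Python
sketch. It uses Python's codecs rather than iconv(), and the escape syntax
(%{XXXX} for a code point the target encoding cannot represent, %% for a
literal percent sign) is invented for this example, not an existing Git
convention. It also assumes the target encoding is ASCII-compatible and
stateless, so that characters can be encoded one at a time.

```python
import locale
from typing import Optional


def to_locale_name(name: str, target: Optional[str] = None) -> bytes:
    """Encode a file name for the current locale, escaping what does not fit."""
    # Default to the locale's preferred encoding when none is given.
    target = target or locale.getpreferredencoding(False)
    out = bytearray()
    for ch in name:
        if ch == "%":
            # Double the escape character so the result stays unambiguous.
            out += b"%%"
            continue
        try:
            out += ch.encode(target)
        except UnicodeEncodeError:
            # Hypothetical escape: hexadecimal code point inside %{...}.
            out += ("%{" + format(ord(ch), "04X") + "}").encode("ascii")
    return bytes(out)
```

With target "iso-8859-1", a name containing the euro sign (U+20AC) comes
out with %{20AC} in its place, while characters that exist in the target
encoding are converted normally.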

The best way would be to define this in the Git core once and for all, and
add support for all platforms in one go, instead of trying to hack around
the issue whenever it pops up on the various platforms.

My main use-case for Git on Windows has disappeared as my $dayjob went 
bankrupt, but I am happy to assist with whatever insight I may be able to 
bring.

-- 
\\// Peter - http://www.softwolves.pp.se/
