* [RFC 0/8] Antique UTF-8 filename support
@ 2009-05-12 22:50 Robin Rosenberg
  2009-05-12 22:50 ` [RFC 1/8] UTF helpers Robin Rosenberg
  0 siblings, 1 reply; 20+ messages in thread
From: Robin Rosenberg @ 2009-05-12 22:50 UTC
  To: git; +Cc: Robin Rosenberg

From: Robin Rosenberg <robin.rosenberg@gmail.com>

Since there is now some interest in the topic, I can republish my 2½-year-old
patches so there is some real code to comment on. They apply on top of
6dcfa306f2b67b733a7eb2d7ded1bc9987809edb. For completeness I am sending
all patches, but the interesting stuff is in patches 4 and 5. Beware of encoding
issues with the test cases.

They do not handle Windows UTF-16 at all, but I think that is just a matter of writing
Windows-specific wrappers for the filename and directory handling routines.

Feel free to revamp and steal ideas and add constructive criticism. Don't even
think of cherry-picking or rebasing; it's careful hand-picking with copy/paste at
best, but mostly it's fuel for discussion.

I admit some parts are quite kludgy and probably slow, as I was primarily
interested in seeing whether it was even feasible, which it was. However, there was
simply no interest, which meant there was no point in optimizing it. It was simply the
wrong problem at the time.

Disclaimer: A problem with this approach is that, although it does character
conversion, if you are in a non-UTF-8 locale it will not let you manage
an arbitrary repository. That is basically impossible and hence not the goal. It does
help people with the same (or closely related) languages to cooperate without enforcing
a common encoding, as long as they stick to the common characters, i.e. the ones
that can be converted between the locales involved.
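
To illustrate what "can be converted between the locales involved" means in
practice, here is a minimal sketch, assuming the local encoding is ISO-8859-1.
The function name utf8_to_latin1 is hypothetical and is not taken from the
patches; the series itself may well do this differently (for instance via
iconv). The point is only that a name stored as UTF-8 either maps cleanly onto
the local encoding or has to be rejected.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not from the patch series): convert a NUL-terminated
 * UTF-8 string to ISO-8859-1.
 *
 * Returns the number of bytes written (excluding the terminating NUL), or -1
 * if the input is malformed, contains a character outside U+0000..U+00FF,
 * or does not fit in the output buffer.
 */
int utf8_to_latin1(const char *in, char *out, size_t outsz)
{
	const unsigned char *p = (const unsigned char *)in;
	size_t n = 0;

	if (outsz == 0)
		return -1;

	while (*p) {
		unsigned int cp;

		if (*p < 0x80) {
			/* Plain ASCII is identical in both encodings. */
			cp = *p++;
		} else if ((*p == 0xC2 || *p == 0xC3) && (p[1] & 0xC0) == 0x80) {
			/* Two-byte sequence encoding U+0080..U+00FF. */
			cp = ((unsigned int)(*p & 0x1F) << 6) | (p[1] & 0x3F);
			p += 2;
		} else {
			/* Malformed, or a character Latin-1 cannot represent. */
			return -1;
		}

		/* Always leave room for the terminating NUL. */
		if (n + 1 >= outsz)
			return -1;
		out[n++] = (char)cp;
	}
	out[n] = '\0';
	return (int)n;
}
```

With this sketch, a name containing "å" (U+00E5) converts to the single byte
0xE5, while a name containing "€" (U+20AC) is rejected, which is the kind of
file a Latin-1 user could not manage in a shared repository.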

This is probably the most outdated patch series ever.

-- robin

Robin Rosenberg (8):
(mostly obsolete)
  UTF helpers
  Messages in locale.
  Extend tests to cover locale wrt to commit messages.

The interesting stuff (patches 4 & 5)
  UTF file names.
  Extend all tests to work on UTF-8 filenames.

old wip
  test of utf_locallinks
  Convert symlink dest in diff
  UTF-8 in non-SHA1-objects

 Makefile                            |    8 +-
 builtin-add.c                       |    5 +-
 builtin-cat-file.c                  |    6 +-
 builtin-checkout-index.c            |   46 +++-
 builtin-commit-tree.c               |    9 +-
 builtin-ls-files.c                  |   26 ++-
 builtin-ls-tree.c                   |   16 +-
 builtin-rev-parse.c                 |    7 +-
 builtin-update-index.c              |   18 +-
 builtin-write-tree.c                |    5 +-
 diff.c                              |  111 ++++++--
 dir.c                               |   22 +-
 git-commit.sh                       |    5 +
 git-compat-util.h                   |   43 +++
 git-rebase.sh                       |    1 +
 git.c                               |    9 +
 log-tree.c                          |    4 +-
 merge-index.c                       |   25 ++-
 read-cache.c                        |    8 +-
 refs.c                              |   11 +-
 setup.c                             |   28 ++-
 t/lib-read-tree-m-3way.sh           |   38 ++--
 t/t-utf-filenames.sh                |   95 +++++++
 t/t-utf-msg.sh                      |   43 +++
 t/t0000-basic.sh                    |  117 ++++----
 t/t0010-racy-git.sh                 |   10 +-
 t/t1000-read-tree-m-3way.sh         |  240 +++++++++---------
 t/t1001-read-tree-m-2way.sh         |   56 ++--
 t/t1020-subdirectory.sh             |   63 +++---
 t/t1100-commit-tree-options.sh      |   12 +-
 t/t1400-update-ref.sh               |   10 +-
 t/t2000-checkout-cache-clash.sh     |   18 +-
 t/t2001-checkout-cache-clash.sh     |   30 +-
 t/t2002-checkout-cache-u.sh         |    8 +-
 t/t2003-checkout-cache-mkdir.sh     |  118 ++++----
 t/t2004-checkout-cache-temp.sh      |  144 +++++-----
 t/t2100-update-cache-badpath.sh     |   48 ++--
 t/t2101-update-index-reupdate.sh    |   56 ++--
 t/t3000-ls-files-others.sh          |   36 ++--
 t/t3002-ls-files-dashpath.sh        |   24 +-
 t/t3010-ls-files-killed-modified.sh |  104 ++++----
 t/t3020-ls-files-error-unmatch.sh   |   10 +-
 t/t3100-ls-tree-restrict.sh         |  122 +++++-----
 t/t3101-ls-tree-dirname.sh          |   88 +++---
 t/t3400-rebase.sh                   |   18 +-
 t/t3401-rebase-partial.sh           |   24 +-
 t/t3402-rebase-merge.sh             |   17 +-
 t/t3403-rebase-skip.sh              |   10 +-
 t/t3500-cherry.sh                   |   26 +-
 t/t3600-rm.sh                       |   28 +-
 t/t3700-add.sh                      |   30 +-
 t/t4000-diff-format.sh              |   26 +-
 t/t4001-diff-rename.sh              |   20 +-
 t/t4002-diff-basic.sh               |  160 ++++++------
 t/t4003-diff-rename-1.sh            |   66 +++---
 t/t4004-diff-rename-symlink.sh      |   40 ++--
 t/t4005-diff-rename-2.sh            |   54 ++--
 t/t4006-diff-mode.sh                |   14 +-
 t/t4008-diff-break-rewrite.sh       |  100 ++++----
 t/t4009-diff-rename-4.sh            |   63 +++---
 t/t4011-diff-symlink.sh             |   38 ++--
 t/t4012-diff-binary.sh              |   16 +-
 t/t7301-rev-parse.sh                |   20 ++
 t/test-lib.sh                       |   13 +-
 test-utf.c                          |   61 +++++
 utf.c                               |  501 +++++++++++++++++++++++++++++++++++
 utf.h                               |   27 ++
 67 files changed, 2133 insertions(+), 1142 deletions(-)
 create mode 100755 t/t-utf-filenames.sh
 create mode 100755 t/t-utf-msg.sh
 create mode 100755 t/t7301-rev-parse.sh
 create mode 100644 test-utf.c
 create mode 100644 utf.c
 create mode 100644 utf.h

Thread overview: 20+ messages
2009-05-12 22:50 [RFC 0/8] Antique UTF-8 filename support Robin Rosenberg
2009-05-12 22:50 ` [RFC 1/8] UTF helpers Robin Rosenberg
2009-05-12 22:50   ` [RFC 2/8] Messages in locale Robin Rosenberg
2009-05-12 22:50     ` [RFC 3/8] Extend tests to cover locale wrt to commit messages Robin Rosenberg
2009-05-12 22:50       ` [RFC 4/8] UTF file names Robin Rosenberg
     [not found]         ` <1242168631-30753-6-git-send-email-robin.rosenberg@dewire.com>
2009-05-12 22:50           ` [RFC 6/8] test of utf_locallinks Robin Rosenberg
2009-05-12 22:50             ` [RFC 7/8] Convert symlink dest in diff Robin Rosenberg
2009-05-12 22:50               ` [RFC 8/8] UTF-8 in non-SHA1-objects Robin Rosenberg
2009-05-13  0:20   ` [RFC 1/8] UTF helpers Johannes Schindelin
2009-05-13  5:24     ` Robin Rosenberg
2009-05-13  9:24       ` Esko Luontola
2009-05-13 10:02         ` Andreas Ericsson
2009-05-13 10:21           ` Esko Luontola
2009-05-13 11:44             ` Alex Riesen
2009-05-13 18:48         ` Junio C Hamano
2009-05-13 19:31           ` Esko Luontola
2009-05-13 20:10             ` Junio C Hamano
2009-05-13 10:14       ` Johannes Schindelin
2009-05-14  4:38       ` Junio C Hamano
2009-05-14 13:57         ` Jay Soffian
