linux-ext4.vger.kernel.org archive mirror
* [PATCH v4 00/23] Ext4 Encoding and Case-insensitive support
@ 2018-12-06 23:08 Gabriel Krisman Bertazi
  2018-12-06 23:08 ` [PATCH v4 01/23] nls: Wrap uni2char/char2uni callers Gabriel Krisman Bertazi
                   ` (24 more replies)
  0 siblings, 25 replies; 39+ messages in thread
From: Gabriel Krisman Bertazi @ 2018-12-06 23:08 UTC
  To: tytso; +Cc: linux-fsdevel, kernel, linux-ext4, Gabriel Krisman Bertazi

Hi,

[Resending to include fsdevel, as requested by Dave Chinner]

Following the e2fsprogs changes, these are the corresponding kernel-side
modifications to support the fname_encoding feature.

The patches are split into two parts. The first 14 patches are refactoring
and improvements to the NLS code, including the utf8 normalization
support.  The final patches implement the fname_encoding feature in ext4.

To test this feature, you need to use the tip of the e2fsprogs branch,
which already includes support for enabling this feature.

As usual, the ucd files are not included in this email because they are
too large, and would actually cause the email message to bounce.

There are two test files for this in a private xfstests branch, which I
plan to submit upstream once we get this series merged:

  https://gitlab.collabora.com/krisman/xfstests.git -b encoding_v4

I also tested this with the xfstests smoke tests using two scenarios:
(1) a non-encoding TEST_DEV; (2) a utf8-enabled TEST_DEV.  In both
cases, no unrelated regressions were observed.  With my xfstests branch
above, which fixes some related tests, I didn't observe any
regressions.
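
For anyone wanting to reproduce the two scenarios, a minimal sketch of
the xfstests configuration might look like the following.  The device
paths, mount points and the write_local_config helper are hypothetical
and not part of this series; the only assumption about xfstests is that
it reads FSTYP, TEST_DEV, TEST_DIR, SCRATCH_DEV and SCRATCH_MNT from
local.config.  TEST_DEV has to be formatted beforehand, with or without
the encoding feature, since xfstests does not re-create it.

```shell
#!/bin/sh
# Sketch only: all device paths and mount points are placeholders.
# Writes an xfstests local.config for the given, already formatted, TEST_DEV.
write_local_config() {
    test_dev=$1   # block device holding the pre-formatted test filesystem
    out=$2        # path of the config file to write
    cat > "$out" <<EOF
export FSTYP=ext4
export TEST_DEV=$test_dev
export TEST_DIR=/mnt/test
export SCRATCH_DEV=/dev/vdc
export SCRATCH_MNT=/mnt/scratch
EOF
}

# Example, from the top of the xfstests tree:
#   write_local_config /dev/vdb local.config
#   ./check -g quick
```

Running this once against a device formatted without encoding support
and once against a utf8-enabled one would cover both scenarios.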

Gabriel Krisman Bertazi (19):
  nls: Wrap uni2char/char2uni callers
  nls: Wrap charset field access
  nls: Wrap charset hooks in ops structure
  nls: Split default charset from NLS core
  nls: Split struct nls_charset from struct nls_table
  nls: Add support for multiple versions of an encoding
  nls: Implement NLS_STRICT_MODE flag
  nls: Let charsets define the behavior of tolower/toupper
  nls: Add new interface for string comparisons
  nls: Add optional normalization and casefold hooks
  nls: ascii: Support validation and normalization operations
  nls: utf8: Move nls-utf8{,-core}.c
  nls: utf8: Integrate utf8 normalization code with utf8 charset
  nls: utf8: Introduce test module for normalized utf8 implementation
  ext4: Reserve superblock fields for encoding information
  ext4: Include encoding information in the superblock
  ext4: Support encoding-aware file name lookups
  ext4: Implement EXT4_CASEFOLD_FL flag
  docs: ext4.rst: Document encoding and case-insensitive

Olaf Weber (4):
  nls: utf8: Add unicode character database files
  scripts: add trie generator for UTF-8
  nls: utf8: Introduce code for UTF-8 normalization
  nls: utf8n: reduce the size of utf8data[]

 Documentation/admin-guide/ext4.rst   |   29 +
 fs/befs/linuxvfs.c                   |    8 +-
 fs/cifs/cifs_unicode.c               |   15 +-
 fs/cifs/cifsfs.c                     |    2 +-
 fs/cifs/connect.c                    |    2 +-
 fs/cifs/dir.c                        |    7 +-
 fs/ext4/dir.c                        |   59 +
 fs/ext4/ext4.h                       |   33 +-
 fs/ext4/hash.c                       |   38 +-
 fs/ext4/ialloc.c                     |    2 +-
 fs/ext4/inline.c                     |    2 +-
 fs/ext4/inode.c                      |    4 +-
 fs/ext4/ioctl.c                      |   18 +
 fs/ext4/namei.c                      |   85 +-
 fs/ext4/super.c                      |   83 +
 fs/fat/dir.c                         |   13 +-
 fs/fat/inode.c                       |    6 +-
 fs/fat/namei_vfat.c                  |    6 +-
 fs/hfs/super.c                       |    6 +-
 fs/hfs/trans.c                       |    9 +-
 fs/hfsplus/options.c                 |    2 +-
 fs/hfsplus/unicode.c                 |    6 +-
 fs/isofs/inode.c                     |    5 +-
 fs/isofs/joliet.c                    |    3 +-
 fs/jfs/jfs_unicode.c                 |    9 +-
 fs/jfs/super.c                       |    3 +-
 fs/nls/Kconfig                       |   15 +
 fs/nls/Makefile                      |   20 +
 fs/nls/mac-celtic.c                  |   34 +-
 fs/nls/mac-centeuro.c                |   34 +-
 fs/nls/mac-croatian.c                |   34 +-
 fs/nls/mac-cyrillic.c                |   34 +-
 fs/nls/mac-gaelic.c                  |   34 +-
 fs/nls/mac-greek.c                   |   34 +-
 fs/nls/mac-iceland.c                 |   34 +-
 fs/nls/mac-inuit.c                   |   34 +-
 fs/nls/mac-roman.c                   |   34 +-
 fs/nls/mac-romanian.c                |   34 +-
 fs/nls/mac-turkish.c                 |   34 +-
 fs/nls/nls_ascii.c                   |   84 +-
 fs/nls/nls_core.c                    |  163 ++
 fs/nls/nls_cp1250.c                  |   34 +-
 fs/nls/nls_cp1251.c                  |   34 +-
 fs/nls/nls_cp1255.c                  |   36 +-
 fs/nls/nls_cp437.c                   |   34 +-
 fs/nls/nls_cp737.c                   |   34 +-
 fs/nls/nls_cp775.c                   |   34 +-
 fs/nls/nls_cp850.c                   |   34 +-
 fs/nls/nls_cp852.c                   |   34 +-
 fs/nls/nls_cp855.c                   |   34 +-
 fs/nls/nls_cp857.c                   |   34 +-
 fs/nls/nls_cp860.c                   |   34 +-
 fs/nls/nls_cp861.c                   |   34 +-
 fs/nls/nls_cp862.c                   |   34 +-
 fs/nls/nls_cp863.c                   |   34 +-
 fs/nls/nls_cp864.c                   |   34 +-
 fs/nls/nls_cp865.c                   |   34 +-
 fs/nls/nls_cp866.c                   |   34 +-
 fs/nls/nls_cp869.c                   |   34 +-
 fs/nls/nls_cp874.c                   |   36 +-
 fs/nls/nls_cp932.c                   |   36 +-
 fs/nls/nls_cp936.c                   |   36 +-
 fs/nls/nls_cp949.c                   |   36 +-
 fs/nls/nls_cp950.c                   |   36 +-
 fs/nls/{nls_base.c => nls_default.c} |  124 +-
 fs/nls/nls_euc-jp.c                  |   29 +-
 fs/nls/nls_iso8859-1.c               |   34 +-
 fs/nls/nls_iso8859-13.c              |   34 +-
 fs/nls/nls_iso8859-14.c              |   34 +-
 fs/nls/nls_iso8859-15.c              |   34 +-
 fs/nls/nls_iso8859-2.c               |   34 +-
 fs/nls/nls_iso8859-3.c               |   34 +-
 fs/nls/nls_iso8859-4.c               |   34 +-
 fs/nls/nls_iso8859-5.c               |   34 +-
 fs/nls/nls_iso8859-6.c               |   34 +-
 fs/nls/nls_iso8859-7.c               |   34 +-
 fs/nls/nls_iso8859-9.c               |   34 +-
 fs/nls/nls_koi8-r.c                  |   34 +-
 fs/nls/nls_koi8-ru.c                 |   30 +-
 fs/nls/nls_koi8-u.c                  |   34 +-
 fs/nls/nls_utf8-core.c               |  328 +++
 fs/nls/nls_utf8-norm.c               |  797 ++++++
 fs/nls/nls_utf8-selftest.c           |  316 +++
 fs/nls/nls_utf8.c                    |   67 -
 fs/nls/ucd/README                    |   34 +
 fs/nls/utf8n.h                       |  117 +
 fs/ntfs/inode.c                      |    2 +-
 fs/ntfs/super.c                      |    6 +-
 fs/ntfs/unistr.c                     |   13 +-
 fs/udf/super.c                       |    3 +-
 fs/udf/unicode.c                     |    4 +-
 include/linux/fs.h                   |    2 +
 include/linux/nls.h                  |  293 ++-
 scripts/Makefile                     |    1 +
 scripts/mkutf8data.c                 | 3392 ++++++++++++++++++++++++++
 95 files changed, 7287 insertions(+), 618 deletions(-)
 create mode 100644 fs/nls/nls_core.c
 rename fs/nls/{nls_base.c => nls_default.c} (89%)
 create mode 100644 fs/nls/nls_utf8-core.c
 create mode 100644 fs/nls/nls_utf8-norm.c
 create mode 100644 fs/nls/nls_utf8-selftest.c
 delete mode 100644 fs/nls/nls_utf8.c
 create mode 100644 fs/nls/ucd/README
 create mode 100644 fs/nls/utf8n.h
 create mode 100644 scripts/mkutf8data.c

-- 
2.20.0.rc2

* [PATCH v4 00/23] Ext4 Encoding and Case-insensitive support
@ 2018-12-06 22:04 Gabriel Krisman Bertazi
  2018-12-06 22:50 ` Dave Chinner
  0 siblings, 1 reply; 39+ messages in thread
From: Gabriel Krisman Bertazi @ 2018-12-06 22:04 UTC
  To: tytso; +Cc: kernel, linux-ext4, Gabriel Krisman Bertazi


End of thread; newest message: 2018-12-10 19:35 UTC

Thread overview: 39+ messages
2018-12-06 23:08 [PATCH v4 00/23] Ext4 Encoding and Case-insensitive support Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 01/23] nls: Wrap uni2char/char2uni callers Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 02/23] nls: Wrap charset field access Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 03/23] nls: Wrap charset hooks in ops structure Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 04/23] nls: Split default charset from NLS core Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 05/23] nls: Split struct nls_charset from struct nls_table Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 06/23] nls: Add support for multiple versions of an encoding Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 07/23] nls: Implement NLS_STRICT_MODE flag Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 08/23] nls: Let charsets define the behavior of tolower/toupper Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 09/23] nls: Add new interface for string comparisons Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 10/23] nls: Add optional normalization and casefold hooks Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 11/23] nls: ascii: Support validation and normalization operations Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 12/23] nls: utf8: Add unicode character database files Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 13/23] scripts: add trie generator for UTF-8 Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 14/23] nls: utf8: Move nls-utf8{,-core}.c Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 15/23] nls: utf8: Introduce code for UTF-8 normalization Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 16/23] nls: utf8n: reduce the size of utf8data[] Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 17/23] nls: utf8: Integrate utf8 normalization code with utf8 charset Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 18/23] nls: utf8: Introduce test module for normalized utf8 implementation Gabriel Krisman Bertazi
2018-12-06 23:08 ` [PATCH v4 19/23] ext4: Reserve superblock fields for encoding information Gabriel Krisman Bertazi
2018-12-06 23:09 ` [PATCH v4 20/23] ext4: Include encoding information in the superblock Gabriel Krisman Bertazi
2018-12-06 23:09 ` [PATCH v4 21/23] ext4: Support encoding-aware file name lookups Gabriel Krisman Bertazi
2018-12-06 23:09 ` [PATCH v4 22/23] ext4: Implement EXT4_CASEFOLD_FL flag Gabriel Krisman Bertazi
2018-12-06 23:09 ` [PATCH v4 23/23] docs: ext4.rst: Document encoding and case-insensitive Gabriel Krisman Bertazi
2018-12-07 18:41 ` [PATCH v4 00/23] Ext4 Encoding and Case-insensitive support Randy Dunlap
     [not found] ` <20181208194128.GE20708@thunk.org>
2018-12-08 21:48   ` Linus Torvalds
2018-12-08 21:58     ` Linus Torvalds
2018-12-08 22:59       ` Linus Torvalds
2018-12-09  0:46         ` Andreas Dilger
     [not found]       ` <20181209050326.GA28659@mit.edu>
2018-12-09 17:41         ` Linus Torvalds
2018-12-09 20:10           ` Theodore Y. Ts'o
2018-12-09 20:54             ` Linus Torvalds
2018-12-10  0:08               ` Theodore Y. Ts'o
2018-12-10 19:35                 ` Linus Torvalds
2018-12-09 20:53           ` Gabriel Krisman Bertazi
2018-12-09 21:05             ` Linus Torvalds
  -- strict thread matches above, loose matches on Subject: below --
2018-12-06 22:04 Gabriel Krisman Bertazi
2018-12-06 22:50 ` Dave Chinner
2018-12-06 23:09   ` Gabriel Krisman Bertazi
