From: "Pali Rohár" <pali@kernel.org>
To: linux-fsdevel@vger.kernel.org,
	linux-ntfs-dev@lists.sourceforge.net, linux-cifs@vger.kernel.org,
	jfs-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org,
	Alexander Viro <viro@zeniv.linux.org.uk>, Jan Kara <jack@suse.cz>,
	"Theodore Y . Ts'o" <tytso@mit.edu>,
	Anton Altaparmakov <anton@tuxera.com>,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
	Luis de Bethencourt <luisbg@kernel.org>,
	Salah Triki <salah.triki@gmail.com>,
	Steve French <sfrench@samba.org>, Paulo Alcantara <pc@cjr.nz>,
	Ronnie Sahlberg <lsahlber@redhat.com>,
	Shyam Prasad N <sprasad@microsoft.com>,
	Tom Talpey <tom@talpey.com>, Dave Kleikamp <shaggy@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Pavel Machek <pavel@ucw.cz>,
	Christoph Hellwig <hch@infradead.org>,
	Kari Argillander <kari.argillander@gmail.com>,
	Viacheslav Dubeyko <slava@dubeyko.com>
Subject: [RFC PATCH v2 00/18] fs: Remove usage of broken nls_utf8 and drop it
Date: Mon, 26 Dec 2022 15:21:32 +0100
Message-ID: <20221226142150.13324-1-pali@kernel.org>

Module nls_utf8 is broken in several ways. Despite its name, it does not
support full UTF-8: it cannot handle 4-byte UTF-8 sequences, and the
tolower/toupper tables are not implemented at all. This means that it is
not suitable for use in case-insensitive filesystems or UTF-16
filesystems (e.g. because UTF-16 surrogate pair processing is missing).
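
To illustrate what 4-byte sequences and surrogate pairs involve, here is a
minimal userspace sketch. It is not the kernel implementation and not code
from this series; the names demo_utf8_decode() and demo_utf16_encode() are
hypothetical and chosen only for this example. It decodes one UTF-8
sequence (up to 4 bytes) into a code point and then encodes that code
point as UTF-16, which requires a surrogate pair for anything above
U+FFFF.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper (illustration only): decode one UTF-8 sequence
 * from s (at most len bytes) into *cp. Returns the number of bytes
 * consumed, or -1 if the input is truncated or malformed.
 */
static int demo_utf8_decode(const unsigned char *s, size_t len, uint32_t *cp)
{
	/* Smallest code point that legitimately needs n bytes. */
	static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
	uint32_t c;
	int n, i;

	if (len == 0)
		return -1;

	if (s[0] < 0x80) {
		n = 1; c = s[0];
	} else if ((s[0] & 0xE0) == 0xC0) {
		n = 2; c = s[0] & 0x1F;
	} else if ((s[0] & 0xF0) == 0xE0) {
		n = 3; c = s[0] & 0x0F;
	} else if ((s[0] & 0xF8) == 0xF0) {
		n = 4; c = s[0] & 0x07;	/* the 4-byte case */
	} else {
		return -1;
	}

	if ((size_t)n > len)
		return -1;

	for (i = 1; i < n; i++) {
		if ((s[i] & 0xC0) != 0x80)
			return -1;	/* not a continuation byte */
		c = (c << 6) | (s[i] & 0x3F);
	}

	/* Reject overlong forms, surrogates and values above U+10FFFF. */
	if (c < min_cp[n] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
		return -1;

	*cp = c;
	return n;
}

/*
 * Hypothetical helper (illustration only): encode cp as UTF-16.
 * Returns the number of 16-bit units written (1 or 2), or 0 if cp is
 * not a valid Unicode scalar value.
 */
static int demo_utf16_encode(uint32_t cp, uint16_t out[2])
{
	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return 0;

	if (cp < 0x10000) {
		out[0] = (uint16_t)cp;
		return 1;
	}

	/* Code points above U+FFFF are split into a surrogate pair. */
	cp -= 0x10000;
	out[0] = (uint16_t)(0xD800 | (cp >> 10));	/* high surrogate */
	out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));	/* low surrogate */
	return 2;
}
```

For example, the 4-byte sequence F0 9F 98 80 decodes to U+1F600, which in
UTF-16 is the surrogate pair D83D DE00. A converter limited to 3-byte
sequences cannot represent such a name at all.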

This is RFC v2 of the patch series, which unifies and fixes the
iocharset=utf8 mount option in all fs drivers and converts all remaining
fs drivers to use the utf8s_to_utf16s(), utf16s_to_utf8s(),
utf8_to_utf32() and utf32_to_utf8() functions for implementing UTF-8
support instead of nls_utf8.

In the end, this allows the broken nls_utf8 module to be dropped
completely.

For more details, see the email thread where the fs unification was discussed:
https://lore.kernel.org/linux-fsdevel/20200102211855.gg62r7jshp742d6i@pali/t/#u

This patch series is mostly untested and is presented as an RFC. Please
let me know what you think about it and whether this is the correct way
to fix broken UTF-8 support in fs drivers. As explained in the email
thread above, I think it does not make sense to try fixing the whole NLS
framework; it is easier to just drop the nls_utf8 module.

Note: this patch series does not address the UTF-8 fat case-sensitivity issue:
https://lore.kernel.org/linux-fsdevel/20200119221455.bac7dc55g56q2l4r@pali/

Changes since RFC v1:
* Dropped already merged udf and isofs patches
* Addressed review comments:
  - updated documentation
  - usage of seq_puts
  - some code moved to local variables
  - usage of true/false instead of 1/0
  - rebased on top of master branch

Link to RFC v1:
https://lore.kernel.org/linux-fsdevel/20210808162453.1653-1-pali@kernel.org/

Pali Rohár (18):
  fat: Fix iocharset=utf8 mount option
  hfsplus: Add iocharset= mount option as alias for nls=
  ntfs: Undeprecate iocharset= mount option
  ntfs: Fix error processing when load_nls() fails
  befs: Fix printing iocharset= mount option
  befs: Rename enum value Opt_charset to Opt_iocharset to match mount
    option
  befs: Fix error processing when load_nls() fails
  befs: Allow to use native UTF-8 mode
  hfs: Explicitly set hsb->nls_disk when hsb->nls_io is set
  hfs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
  hfsplus: Do not use broken utf8 NLS table for iocharset=utf8 mount
    option
  jfs: Remove custom iso8859-1 implementation
  jfs: Fix buffer overflow in jfs_strfromUCS_le() function
  jfs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
  ntfs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
  cifs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
  cifs: Remove usage of load_nls_default() calls
  nls: Drop broken nls_utf8 module

 Documentation/filesystems/hfsplus.rst |   3 +
 Documentation/filesystems/ntfs.rst    |   5 +-
 Documentation/filesystems/vfat.rst    |  13 +--
 fs/befs/linuxvfs.c                    |  24 +++--
 fs/cifs/cifs_unicode.c                | 128 +++++++++++++++++---------
 fs/cifs/cifs_unicode.h                |   2 +-
 fs/cifs/cifsfs.c                      |   2 +
 fs/cifs/cifssmb.c                     |   8 +-
 fs/cifs/connect.c                     |   8 +-
 fs/cifs/dfs_cache.c                   |  24 ++---
 fs/cifs/dir.c                         |  28 ++++--
 fs/cifs/smb2pdu.c                     |  18 +---
 fs/cifs/winucase.c                    |  14 ++-
 fs/fat/Kconfig                        |  19 +---
 fs/fat/dir.c                          |  17 ++--
 fs/fat/fat.h                          |  22 +++++
 fs/fat/inode.c                        |  28 +++---
 fs/fat/namei_vfat.c                   |  26 ++++--
 fs/hfs/super.c                        |  62 +++++++++++--
 fs/hfs/trans.c                        |  62 +++++++------
 fs/hfsplus/dir.c                      |   7 +-
 fs/hfsplus/options.c                  |  39 +++++---
 fs/hfsplus/super.c                    |   7 +-
 fs/hfsplus/unicode.c                  |  31 ++++++-
 fs/hfsplus/xattr.c                    |  20 ++--
 fs/hfsplus/xattr_security.c           |   6 +-
 fs/jfs/jfs_dtree.c                    |  13 ++-
 fs/jfs/jfs_unicode.c                  |  35 +++----
 fs/jfs/jfs_unicode.h                  |   2 +-
 fs/jfs/super.c                        |  29 ++++--
 fs/nls/Kconfig                        |   9 --
 fs/nls/Makefile                       |   1 -
 fs/nls/nls_utf8.c                     |  67 --------------
 fs/ntfs/dir.c                         |   6 +-
 fs/ntfs/inode.c                       |   5 +-
 fs/ntfs/super.c                       |  60 ++++++------
 fs/ntfs/unistr.c                      |  29 +++++-
 37 files changed, 493 insertions(+), 386 deletions(-)
 delete mode 100644 fs/nls/nls_utf8.c

-- 
2.20.1

