linux-kernel.vger.kernel.org archive mirror
From: Kari Argillander <kari.argillander@gmail.com>
To: "Pali Rohár" <pali@kernel.org>
Cc: linux-fsdevel@vger.kernel.org,
	linux-ntfs-dev@lists.sourceforge.net, linux-cifs@vger.kernel.org,
	jfs-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Jan Kara" <jack@suse.cz>,
	"OGAWA Hirofumi" <hirofumi@mail.parknet.co.jp>,
	"Theodore Y . Ts'o" <tytso@mit.edu>,
	"Luis de Bethencourt" <luisbg@kernel.org>,
	"Salah Triki" <salah.triki@gmail.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Dave Kleikamp" <shaggy@kernel.org>,
	"Anton Altaparmakov" <anton@tuxera.com>,
	"Pavel Machek" <pavel@ucw.cz>, "Marek Behún" <marek.behun@nic.cz>,
	"Christoph Hellwig" <hch@infradead.org>
Subject: Re: [RFC PATCH 00/20] fs: Remove usage of broken nls_utf8 and drop it
Date: Sat, 4 Sep 2021 00:26:16 +0300
Message-ID: <20210903212616.xbi5tz5ier5xcpas@kari-VirtualBox>
In-Reply-To: <20210808162453.1653-1-pali@kernel.org>

On Sun, Aug 08, 2021 at 06:24:33PM +0200, Pali Rohár wrote:
> Module nls_utf8 is broken in several ways. It does not support (full)
> UTF-8, despite its name. It cannot handle 4-byte UTF-8 sequences and the
> tolower/toupper table is not implemented at all, which means that it is
> not suitable for use in case-insensitive filesystems or UTF-16
> filesystems (because of e.g. missing UTF-16 surrogate pair processing).
> 
> This is an RFC patch series which unifies and fixes the iocharset=utf8 mount
> option in all fs drivers and converts all remaining fs drivers to use the
> utf8s_to_utf16s(), utf16s_to_utf8s(), utf8_to_utf32() and utf32_to_utf8()
> functions for implementing UTF-8 support instead of nls_utf8.
> 
> So in the end it allows dropping this broken nls_utf8 module completely.
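
To illustrate the 4-byte / surrogate pair point quoted above, here is a
minimal user-space sketch. It is not the kernel's utf8_to_utf32() or
utf8s_to_utf16s(); the sketch_* names are made up for this example and
only show why a code point above U+FFFF needs four UTF-8 bytes and two
UTF-16 code units.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one UTF-8 sequence from s (at most len bytes) into *cp.
 * Returns the number of bytes consumed, or -1 on malformed input.
 * Illustrative helper only, not the kernel implementation.
 */
static int sketch_utf8_decode(const uint8_t *s, size_t len, uint32_t *cp)
{
	uint32_t c, min;
	int n, i;

	assert(s && cp);
	if (len == 0)
		return -1;

	if (s[0] < 0x80) {			/* 1 byte: plain ASCII */
		*cp = s[0];
		return 1;
	} else if ((s[0] & 0xE0) == 0xC0) {	/* 2 bytes */
		n = 2; c = s[0] & 0x1F; min = 0x80;
	} else if ((s[0] & 0xF0) == 0xE0) {	/* 3 bytes */
		n = 3; c = s[0] & 0x0F; min = 0x800;
	} else if ((s[0] & 0xF8) == 0xF0) {	/* 4 bytes: above U+FFFF */
		n = 4; c = s[0] & 0x07; min = 0x10000;
	} else {
		return -1;
	}

	if (len < (size_t)n)
		return -1;
	for (i = 1; i < n; i++) {
		if ((s[i] & 0xC0) != 0x80)	/* must be a continuation byte */
			return -1;
		c = (c << 6) | (s[i] & 0x3F);
	}
	/* Reject overlong forms, out-of-range values and encoded surrogates. */
	if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
		return -1;
	*cp = c;
	return n;
}

/*
 * Encode one code point as UTF-16. Returns the number of 16-bit units
 * written to out (1 or 2), or -1 if cp is not a valid scalar value.
 */
static int sketch_utf16_encode(uint32_t cp, uint16_t out[2])
{
	assert(out);
	if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
		return -1;
	if (cp < 0x10000) {
		out[0] = (uint16_t)cp;
		return 1;
	}
	cp -= 0x10000;				/* 20 bits remain */
	out[0] = (uint16_t)(0xD800 | (cp >> 10));	/* high surrogate */
	out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF));	/* low surrogate */
	return 2;
}
```

A table that maps one 16-bit unit at a time cannot express the second
branch of sketch_utf16_encode(), which is the limitation described in
the quoted text.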

Now that every filesystem will support nls=NULL, is it possible to just
drop default_table completely? Then the default has to be utf8, but is that a
problem?

I was also thinking that every nls "codepage module" could have in its
Kconfig entry
	select HAVE_NLS

HAVE_NLS would tell whether we can get anything other than nls=NULL. This way
a fs can drop some functions if it wants to. It would also be nice to
make the nls module as small as possible, because acpi, pci and usb also
select it. Many other drivers also seem to depend on it without even
selecting it. At a quick glance, everything other than filesystems seems to
need only the utf conversions. Another option is to separate
nls and utf, but I'm not a fan of this idea, at least not yet.
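
A rough user-space sketch of what I mean, under assumptions:
CONFIG_HAVE_NLS is the proposed symbol and does not exist today, and
struct sketch_nls_table, sketch_load_nls() and sketch_parse_iocharset()
are made-up stand-ins, not the real struct nls_table or load_nls(). When
no codepage module is selected, the lookup collapses to a stub and only
the native UTF-8 case (nls == NULL) remains.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for the kernel's struct nls_table; only what the sketch needs. */
struct sketch_nls_table {
	const char *charset;
};

/*
 * CONFIG_HAVE_NLS is the hypothetical symbol proposed above. Build with
 * -DCONFIG_HAVE_NLS to model a kernel with at least one codepage module.
 */
#ifdef CONFIG_HAVE_NLS
static struct sketch_nls_table sketch_cp437 = { .charset = "cp437" };

/* Look up a codepage by name; NULL if it is not available. */
static struct sketch_nls_table *sketch_load_nls(const char *name)
{
	assert(name);
	return strcmp(name, sketch_cp437.charset) == 0 ? &sketch_cp437 : NULL;
}
#else
/* No codepage modules built in: the lookup compiles down to a stub. */
static inline struct sketch_nls_table *sketch_load_nls(const char *name)
{
	assert(name);
	return NULL;
}
#endif

/*
 * Parse an iocharset= value. On success returns 0 and stores the table
 * in *nls, where NULL means native UTF-8. Returns -1 if the requested
 * charset is not available in this configuration.
 */
static int sketch_parse_iocharset(const char *name,
				  struct sketch_nls_table **nls)
{
	assert(name && nls);
	if (strcmp(name, "utf8") == 0) {
		*nls = NULL;		/* no table needed for UTF-8 */
		return 0;
	}
	*nls = sketch_load_nls(name);
	return *nls ? 0 : -1;
}
```

With this shape, iocharset=utf8 works in every configuration, and any
other charset fails cleanly when HAVE_NLS is not set.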

The whole point is to help small Linux systems and embedded devices a little.
I'm happy to do this, but it all really depends on whether utf8 can be the
default, and that is something we can surely think through beforehand.

  Argillander

> For more details, see the email thread where fs unification was discussed:
> https://lore.kernel.org/linux-fsdevel/20200102211855.gg62r7jshp742d6i@pali/t/#u
> 
> This patch series is mostly untested and presented as an RFC. Please let me
> know what you think about it and whether it is the correct way to fix
> broken UTF-8 support in fs drivers. As explained in the above email thread, I
> think it does not make sense to try fixing the whole NLS framework and it is
> easier to just drop this nls_utf8 module.
> 
> Note: this patch series does not address the UTF-8 fat case-sensitivity issue:
> https://lore.kernel.org/linux-fsdevel/20200119221455.bac7dc55g56q2l4r@pali/
> 
> Pali Rohár (20):
>   fat: Fix iocharset=utf8 mount option
>   hfsplus: Add iocharset= mount option as alias for nls=
>   udf: Fix iocharset=utf8 mount option
>   isofs: joliet: Fix iocharset=utf8 mount option
>   ntfs: Undeprecate iocharset= mount option
>   ntfs: Fix error processing when load_nls() fails
>   befs: Fix printing iocharset= mount option
>   befs: Rename enum value Opt_charset to Opt_iocharset to match mount
>     option
>   befs: Fix error processing when load_nls() fails
>   befs: Allow to use native UTF-8 mode
>   hfs: Explicitly set hsb->nls_disk when hsb->nls_io is set
>   hfs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
>   hfsplus: Do not use broken utf8 NLS table for iocharset=utf8 mount
>     option
>   jfs: Remove custom iso8859-1 implementation
>   jfs: Fix buffer overflow in jfs_strfromUCS_le() function
>   jfs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
>   ntfs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
>   cifs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
>   cifs: Remove usage of load_nls_default() calls
>   nls: Drop broken nls_utf8 module
> 
>  fs/befs/linuxvfs.c          |  22 ++++---
>  fs/cifs/cifs_unicode.c      | 128 +++++++++++++++++++++++-------------
>  fs/cifs/cifs_unicode.h      |   2 +-
>  fs/cifs/cifsfs.c            |   2 +
>  fs/cifs/cifssmb.c           |   8 +--
>  fs/cifs/connect.c           |   8 ++-
>  fs/cifs/dfs_cache.c         |  24 +++----
>  fs/cifs/dir.c               |  28 ++++++--
>  fs/cifs/smb2pdu.c           |  17 ++---
>  fs/cifs/winucase.c          |  14 ++--
>  fs/fat/Kconfig              |  15 -----
>  fs/fat/dir.c                |  17 ++---
>  fs/fat/fat.h                |  22 +++++++
>  fs/fat/inode.c              |  28 ++++----
>  fs/fat/namei_vfat.c         |  26 ++++++--
>  fs/hfs/super.c              |  62 ++++++++++++++---
>  fs/hfs/trans.c              |  62 +++++++++--------
>  fs/hfsplus/dir.c            |   6 +-
>  fs/hfsplus/options.c        |  39 ++++++-----
>  fs/hfsplus/super.c          |   7 +-
>  fs/hfsplus/unicode.c        |  31 ++++++++-
>  fs/hfsplus/xattr.c          |  14 ++--
>  fs/hfsplus/xattr_security.c |   3 +-
>  fs/isofs/inode.c            |  27 ++++----
>  fs/isofs/isofs.h            |   1 -
>  fs/isofs/joliet.c           |   4 +-
>  fs/jfs/jfs_dtree.c          |  13 +++-
>  fs/jfs/jfs_unicode.c        |  35 +++++-----
>  fs/jfs/jfs_unicode.h        |   2 +-
>  fs/jfs/super.c              |  29 ++++++--
>  fs/nls/Kconfig              |   9 ---
>  fs/nls/Makefile             |   1 -
>  fs/nls/nls_utf8.c           |  67 -------------------
>  fs/ntfs/dir.c               |   6 +-
>  fs/ntfs/inode.c             |   5 +-
>  fs/ntfs/super.c             |  60 ++++++++---------
>  fs/ntfs/unistr.c            |  28 +++++++-
>  fs/udf/super.c              |  50 ++++++--------
>  fs/udf/udf_sb.h             |   2 -
>  fs/udf/unicode.c            |   4 +-
>  40 files changed, 510 insertions(+), 418 deletions(-)
>  delete mode 100644 fs/nls/nls_utf8.c
> 
> -- 
> 2.20.1
> 
