linux-cifs.vger.kernel.org archive mirror
From: "Pali Rohár" <pali@kernel.org>
To: Viacheslav Dubeyko <slava@dubeyko.com>
Cc: "Linux FS Devel" <linux-fsdevel@vger.kernel.org>,
	linux-ntfs-dev@lists.sourceforge.net, linux-cifs@vger.kernel.org,
	jfs-discussion@lists.sourceforge.net,
	linux-kernel@vger.kernel.org,
	"Alexander Viro" <viro@zeniv.linux.org.uk>,
	"Jan Kara" <jack@suse.cz>,
	"OGAWA Hirofumi" <hirofumi@mail.parknet.co.jp>,
	"Theodore Y . Ts'o" <tytso@mit.edu>,
	"Luis de Bethencourt" <luisbg@kernel.org>,
	"Salah Triki" <salah.triki@gmail.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"Dave Kleikamp" <shaggy@kernel.org>,
	"Anton Altaparmakov" <anton@tuxera.com>,
	"Pavel Machek" <pavel@ucw.cz>, "Marek Behún" <marek.behun@nic.cz>,
	"Christoph Hellwig" <hch@infradead.org>
Subject: Re: [RFC PATCH 12/20] hfs: Do not use broken utf8 NLS table for iocharset=utf8 mount option
Date: Sun, 25 Sep 2022 14:06:46 +0200
Message-ID: <20220925120646.dfkofrka74blwrwb@pali>
In-Reply-To: <4B1987C7-F6D9-4493-ACD0-846B92F86037@dubeyko.com>

Hello! Sorry for the long delay. My comments are below.

On Monday 09 August 2021 10:49:34 Viacheslav Dubeyko wrote:
> > On Aug 8, 2021, at 9:24 AM, Pali Rohár <pali@kernel.org> wrote:
> > 
> > NLS table for utf8 is broken and cannot be fixed.
> > 
> > So instead of the broken utf8 nls functions char2uni() and uni2char(), use
> > the functions utf8_to_utf32() and utf32_to_utf8(), which implement correct
> > encoding and decoding between Unicode code points and UTF-8 sequences.
> > 
> > When iocharset=utf8 is used, set hsb->nls_io to NULL and use that to
> > distinguish whether the NLS table or the native UTF-8 functions should
> > be used.
> > 
> > Signed-off-by: Pali Rohár <pali@kernel.org>
> > ---
> > fs/hfs/super.c | 33 ++++++++++++++++++++++-----------
> > fs/hfs/trans.c | 24 ++++++++++++++++++++----
> > 2 files changed, 42 insertions(+), 15 deletions(-)
> > 
> > diff --git a/fs/hfs/super.c b/fs/hfs/super.c
> > index 86bc46746c7f..076308df41cf 100644
> > --- a/fs/hfs/super.c
> > +++ b/fs/hfs/super.c
> > @@ -149,10 +149,13 @@ static int hfs_show_options(struct seq_file *seq, struct dentry *root)
> > 		seq_printf(seq, ",part=%u", sbi->part);
> > 	if (sbi->session >= 0)
> > 		seq_printf(seq, ",session=%u", sbi->session);
> > -	if (sbi->nls_disk)
> > +	if (sbi->nls_disk) {
> > 		seq_printf(seq, ",codepage=%s", sbi->nls_disk->charset);
> 
> Maybe, I am missing something. But where is the closing “}”?

See below...

> 
> > -	if (sbi->nls_io)
> > -		seq_printf(seq, ",iocharset=%s", sbi->nls_io->charset);
> > +		if (sbi->nls_io)
> > +			seq_printf(seq, ",iocharset=%s", sbi->nls_io->charset);
> > +		else
> > +			seq_puts(seq, ",iocharset=utf8");
> > +	}

        ^
... Closing "}" is marked above.

> > 	if (sbi->s_quiet)
> > 		seq_printf(seq, ",quiet");
> > 	return 0;
> > @@ -225,6 +228,7 @@ static int parse_options(char *options, struct hfs_sb_info *hsb)
> > 	char *p;
> > 	substring_t args[MAX_OPT_ARGS];
> > 	int tmp, token;
> > +	int have_iocharset;
> 
> What about a boolean type?

OK, no problem, I can use the "bool" type. I was just under the impression
that the code style of this driver is to use "int" for booleans as well.
The same applies to "false" and "true", as you mention below.

> > 
> > 	/* initialize the sb with defaults */
> > 	hsb->s_uid = current_uid();
> > @@ -239,6 +243,8 @@ static int parse_options(char *options, struct hfs_sb_info *hsb)
> > 	if (!options)
> > 		return 1;
> > 
> > +	have_iocharset = 0;
> 
> What about false here?
> 
> > +
> > 	while ((p = strsep(&options, ",")) != NULL) {
> > 		if (!*p)
> > 			continue;
> > @@ -332,18 +338,22 @@ static int parse_options(char *options, struct hfs_sb_info *hsb)
> > 			kfree(p);
> > 			break;
> > 		case opt_iocharset:
> > -			if (hsb->nls_io) {
> > +			if (have_iocharset) {
> > 				pr_err("unable to change iocharset\n");
> > 				return 0;
> > 			}
> > 			p = match_strdup(&args[0]);
> > -			if (p)
> > -				hsb->nls_io = load_nls(p);
> > -			if (!hsb->nls_io) {
> > -				pr_err("unable to load iocharset \"%s\"\n", p);
> > -				kfree(p);
> > +			if (!p)
> > 				return 0;
> > +			if (strcmp(p, "utf8") != 0) {
> > +				hsb->nls_io = load_nls(p);
> > +				if (!hsb->nls_io) {
> > +					pr_err("unable to load iocharset \"%s\"\n", p);
> > +					kfree(p);
> > +					return 0;
> > +				}
> > 			}
> > +			have_iocharset = 1;
> 
> What about true here?
> 
> > 			kfree(p);
> > 			break;
> > 		default:
> > @@ -351,7 +361,7 @@ static int parse_options(char *options, struct hfs_sb_info *hsb)
> > 		}
> > 	}
> > 
> > -	if (hsb->nls_io && !hsb->nls_disk) {
> > +	if (have_iocharset && !hsb->nls_disk) {
> > 		/*
> > 		 * Previous version of hfs driver did something unexpected:
> > 		 * When codepage was not defined but iocharset was then
> > @@ -382,7 +392,8 @@ static int parse_options(char *options, struct hfs_sb_info *hsb)
> > 			return 0;
> > 		}
> > 	}
> > -	if (hsb->nls_disk && !hsb->nls_io) {
> > +	if (hsb->nls_disk &&
> > +	    !have_iocharset && strcmp(CONFIG_NLS_DEFAULT, "utf8") != 0) {
> 
> Maybe introduce a variable to hold the boolean value here? Then the if statement will look much cleaner.

I'm not sure how to do that in a way that makes the code look cleaner.

Currently there is:

if (hsb->nls_disk &&
    !have_iocharset && strcmp(CONFIG_NLS_DEFAULT, "utf8") != 0) {
    hsb->nls_io = load_nls_default();
    ...
}

I could replace it with, e.g.:

bool need_to_load_nls;
...
if (hsb->nls_disk &&
    !have_iocharset && strcmp(CONFIG_NLS_DEFAULT, "utf8") != 0)
    need_to_load_nls = true;
else
    need_to_load_nls = false;

if (need_to_load_nls) {
    hsb->nls_io = load_nls_default();
    ...
}

But that is just longer: the condition is still there and it requires one
additional variable, which to me is less readable.
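
A possible middle ground, sketched here only as an illustration and not
taken from the patch, is to move the condition into a small helper so the
"if" stays short without the if/else assignment. The helper name
need_default_nls() and its parameters are hypothetical; in the driver the
last argument would be CONFIG_NLS_DEFAULT.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the hfs driver): returns true when the
 * default NLS table has to be loaded, i.e. a codepage was given, no
 * iocharset option was seen, and the configured default is not "utf8".
 */
static bool need_default_nls(bool have_nls_disk, bool have_iocharset,
			     const char *nls_default)
{
	return have_nls_disk && !have_iocharset &&
	       strcmp(nls_default, "utf8") != 0;
}
```

The caller would then read "if (need_default_nls(...)) { ... }", which
keeps the condition in one place without a separate assignment block.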

> > 		hsb->nls_io = load_nls_default();
> > 		if (!hsb->nls_io) {
> > 			pr_err("unable to load default iocharset\n");
> > diff --git a/fs/hfs/trans.c b/fs/hfs/trans.c
> > index c75682c61b06..bff8e54003ab 100644
> > --- a/fs/hfs/trans.c
> > +++ b/fs/hfs/trans.c
> > @@ -44,7 +44,7 @@ int hfs_mac2asc(struct super_block *sb, char *out, const struct hfs_name *in)
> > 		srclen = HFS_NAMELEN;
> > 	dst = out;
> > 	dstlen = HFS_MAX_NAMELEN;
> > -	if (nls_io) {
> > +	if (nls_disk) {
> > 		wchar_t ch;
> > 
> 
> I could be missing something here. But what about the closing “}”?

The closing "}" is in the same location as before. The "if" line had an
opening "{" before my change and still has one after it, so the opening
"{" and the closing "}" are both there and match.

> Thanks,
> Slava.
> 
> > 		while (srclen > 0) {
> > @@ -57,7 +57,12 @@ int hfs_mac2asc(struct super_block *sb, char *out, const struct hfs_name *in)
> > 			srclen -= size;
> > 			if (ch == '/')
> > 				ch = ':';
> > -			size = nls_io->uni2char(ch, dst, dstlen);
> > +			if (nls_io)
> > +				size = nls_io->uni2char(ch, dst, dstlen);
> > +			else if (dstlen > 0)
> > +				size = utf32_to_utf8(ch, dst, dstlen);
> > +			else
> > +				size = -ENAMETOOLONG;
> > 			if (size < 0) {
> > 				if (size == -ENAMETOOLONG)
> > 					goto out;
> > @@ -101,11 +106,22 @@ void hfs_asc2mac(struct super_block *sb, struct hfs_name *out, const struct qstr
> > 	srclen = in->len;
> > 	dst = out->name;
> > 	dstlen = HFS_NAMELEN;
> > -	if (nls_io) {
> > +	if (nls_disk) {
> > 		wchar_t ch;
> > +		unicode_t u;
> > 
> > 		while (srclen > 0) {
> > -			size = nls_io->char2uni(src, srclen, &ch);
> > +			if (nls_io)
> > +				size = nls_io->char2uni(src, srclen, &ch);
> > +			else {
> > +				size = utf8_to_utf32(src, srclen, &u);
> > +				if (size >= 0) {
> > +					if (u <= MAX_WCHAR_T)
> > +						ch = u;
> > +					else
> > +						size = -EINVAL;
> > +				}
> > +			}
> > 			if (size < 0) {
> > 				ch = '?';
> > 				size = 1;
> > -- 
> > 2.20.1
> > 
> 
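
To make the hfs_asc2mac() fallback quoted above easier to follow, here is
a minimal userspace sketch of the same idea: decode one UTF-8 sequence,
reject code points that do not fit into a 16-bit wchar_t, and substitute
'?' while consuming one byte on any failure. The names
sketch_utf8_decode(), sketch_next_char() and SKETCH_MAX_WCHAR are
hypothetical stand-ins; the kernel code uses utf8_to_utf32() and
MAX_WCHAR_T, and this decoder is not the kernel's implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the kernel's 16-bit wchar_t limit (MAX_WCHAR_T). */
#define SKETCH_MAX_WCHAR 0xffffu

/* Decode one UTF-8 sequence; returns bytes consumed, or -1 if invalid. */
static int sketch_utf8_decode(const unsigned char *s, size_t len, uint32_t *out)
{
	uint32_t cp, min;
	int need, i;

	if (len == 0)
		return -1;
	if (s[0] < 0x80) {
		*out = s[0];
		return 1;
	} else if ((s[0] & 0xe0) == 0xc0) {
		cp = s[0] & 0x1f; need = 1; min = 0x80;
	} else if ((s[0] & 0xf0) == 0xe0) {
		cp = s[0] & 0x0f; need = 2; min = 0x800;
	} else if ((s[0] & 0xf8) == 0xf0) {
		cp = s[0] & 0x07; need = 3; min = 0x10000;
	} else {
		return -1;
	}
	if (len < (size_t)need + 1)
		return -1;		/* truncated sequence */
	for (i = 1; i <= need; i++) {
		if ((s[i] & 0xc0) != 0x80)
			return -1;	/* not a continuation byte */
		cp = (cp << 6) | (s[i] & 0x3f);
	}
	/* Reject overlong forms, surrogates and values above U+10FFFF. */
	if (cp < min || cp > 0x10ffff || (cp >= 0xd800 && cp <= 0xdfff))
		return -1;
	*out = cp;
	return need + 1;
}

/* Same fallback shape as the patch: on any error emit '?' and skip one byte. */
static int sketch_next_char(const unsigned char *s, size_t len, uint16_t *ch)
{
	uint32_t u;
	int size = sketch_utf8_decode(s, len, &u);

	if (size >= 0 && u > SKETCH_MAX_WCHAR)
		size = -1;		/* does not fit into 16 bits */
	if (size < 0) {
		*ch = '?';
		return 1;
	}
	*ch = (uint16_t)u;
	return size;
}
```

With this sketch, a two-byte sequence such as C3 A9 yields U+00E9 and
consumes two bytes, while a four-byte sequence for a code point above
U+FFFF is replaced by '?' and only its first byte is consumed.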
