From: Al Viro <viro@zeniv.linux.org.uk>
To: "Pali Rohár" <pali.rohar@gmail.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
"Theodore Y. Ts'o" <tytso@mit.edu>,
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
Namjae Jeon <linkinjeon@gmail.com>,
Gabriel Krisman Bertazi <krisman@collabora.com>
Subject: Re: vfat: Broken case-insensitive support for UTF-8
Date: Sun, 19 Jan 2020 23:08:09 +0000
Message-ID: <20200119230809.GW8904@ZenIV.linux.org.uk>
In-Reply-To: <20200119221455.bac7dc55g56q2l4r@pali>

On Sun, Jan 19, 2020 at 11:14:55PM +0100, Pali Rohár wrote:
> So when UTF-8 on VFS for VFAT is enabled, the VFS <--> VFAT conversion
> uses the utf16s_to_utf8s() and utf8s_to_utf16s() functions. But
> fat_name_match(), vfat_hashi() and vfat_cmpi() use the NLS table
> (default iso8859-1) with nls_strnicmp() and nls_tolower().
>
> Which means that fat_name_match(), vfat_hashi() and vfat_cmpi() are
> broken for vfat in UTF-8 mode.
>
> I was thinking about how to fix it, and the only possible way is to
> write a uni_tolower() function which takes one Unicode code point and
> returns the lowercase form of that code point. We cannot do any Unicode
> normalization, as the VFAT specification does not say anything about it
> and the MS reference fastfat.sys implementation does not do it either.

Then how can that possibly be broken?  If it matches the native behaviour,
that's it.

> As you can see, lowercase 'd' and uppercase 'D' are the same, but
> lowercase 'č' and uppercase 'Č' are not. This is because 'č' is the
> two-byte sequence 0xc4 0x8d and the comparison is done with the Latin1
> table. 0xc4 is 'Ä' in Latin1, which is already uppercase. 0x8d is a
> control char, so it is not changed by the tolower/toupper functions.

Again, who the hell cares?  Does the behaviour match how Windows handles
that thing?  "Case" is not something well-defined; the only definition
is "whatever weird crap the native implementation chooses to do".
That's the only reason to support that garbage at all...