linux-kernel.vger.kernel.org archive mirror
From: "Pali Rohár" <pali.rohar@gmail.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	"Theodore Y. Ts'o" <tytso@mit.edu>,
	OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>,
	Namjae Jeon <linkinjeon@gmail.com>,
	Gabriel Krisman Bertazi <krisman@collabora.com>
Subject: Re: vfat: Broken case-insensitive support for UTF-8
Date: Mon, 20 Jan 2020 12:19:16 +0100
Message-ID: <20200120111916.pc2ml2farnga3yen@pali>
In-Reply-To: <20200120000931.GX8904@ZenIV.linux.org.uk>

On Monday 20 January 2020 00:09:31 Al Viro wrote:
> On Mon, Jan 20, 2020 at 12:33:48AM +0100, Pali Rohár wrote:
> 
> > > Does the behaviour match how Windows handles that thing?
> > 
> > Linux behavior does not match Windows behavior.
> > 
> > On Windows is FAT32 (fastfat.sys) case insensitive and file names "č"
> > and "Č" are treated as same file. Windows does not allow you to create
> > both files. It says that file already exists.
> 
> So how is the mapping specified in their implementation?  That's
> obviously the mapping we have to match.

The FAT specification (fatgen103.doc) is just a parody of a specification.
E.g. it requires you to use pencil and paper during implementation...

About case insensitivity, I found these parts in the specification:

"The UNICODE name passed to the file system is converted to upper case."

"UNICODE solves the case mapping problem prevalent in some OEM code
pages by always providing a translation for lower case characters to a
single, unique upper case character."

Which basically says nothing... I can deduce from it that the mapping
table should be taken from the Unicode standard.

But we already know that there are mistakes in that specification. What
is relevant is the Microsoft FAT implementation (fastfat.sys). It is now
open source on GitHub, so we can inspect how it implements upper case
conversion.
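
To make the "single, unique upper case character" wording concrete, here
is a minimal sketch of a one-to-one upper case mapping applied to a whole
name. It is only an illustration: the helper names simple_upcase() and
same_name() are made up, and Python's built-in Unicode tables stand in for
the real mapping table, which may well differ from what fastfat.sys uses.

```python
def simple_upcase(name: str) -> str:
    """Upper-case each character on its own, keeping only 1:1 mappings."""
    out = []
    for ch in name:
        up = ch.upper()
        # Full Unicode case mapping may expand one character into several
        # (e.g. "ß" -> "SS"). Keep the original character in that case so
        # the mapping stays one character to one character.
        out.append(up if len(up) == 1 else ch)
    return "".join(out)


def same_name(a: str, b: str) -> bool:
    """Two names refer to the same file if their upcased forms are equal."""
    return simple_upcase(a) == simple_upcase(b)


print(same_name("č", "Č"))  # True
```

With such a mapping "č" and "Č" compare equal, which is the behavior
described for Windows earlier in this thread.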

> > > That's the only reason to support that garbage at all...
> > 
> > What do you mean by garbage?
> 
> Case-insensitive anything... the only reason to have that crap at all
> is that native implementations are basically forcing it as fs
> image correctness issue.

You are right. But we need to deal with it.

> It's worthless on its own merits, but
> we can't do something that amounts to corrupting fs image when
> we access it for write.

If we implement the same upper case conversion as the reference
implementation (fastfat.sys), then we prevent "corrupting fs".
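
As a rough sketch of what that means for creating files: a directory
keyed by the upcased name refuses a second entry that differs only in
case. The Directory class below is a hypothetical toy, not kernel code,
and str.upper() is only a placeholder for whatever table fastfat.sys
really uses.

```python
class Directory:
    """Toy in-memory directory; entries are keyed by their upcased name."""

    def __init__(self, upcase):
        self._upcase = upcase   # mapping function supplied by the caller
        self._entries = {}      # upcased name -> name as it was created

    def create(self, name: str) -> None:
        key = self._upcase(name)
        if key in self._entries:
            # Same file as far as a case-insensitive reader is concerned.
            raise FileExistsError(
                f"{name!r} collides with {self._entries[key]!r}")
        self._entries[key] = name


# Placeholder mapping: Python's str.upper(), NOT the fastfat.sys table.
d = Directory(str.upper)
d.create("č")
try:
    d.create("Č")
except FileExistsError as err:
    print("rejected:", err)
```

The mapping is passed in as a parameter on purpose: the check itself is
trivial, the whole question is which table gets plugged in.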

-- 
Pali Rohár
pali.rohar@gmail.com
