linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Theodore Ts'o" <tytso@mit.edu>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
	Gabriel Krisman Bertazi <krisman@suse.de>,
	viro@zeniv.linux.org.uk, linux-f2fs-devel@lists.sourceforge.net,
	ebiggers@kernel.org, linux-fsdevel@vger.kernel.org,
	jaegeuk@kernel.org, linux-ext4@vger.kernel.org
Subject: Re: [f2fs-dev] [PATCH v6 0/9] Support negative dentries on case-insensitive ext4 and f2fs
Date: Mon, 20 Nov 2023 21:02:54 -0500	[thread overview]
Message-ID: <20231121020254.GB291888@mit.edu> (raw)
In-Reply-To: <CAHk-=whTCWwfmSzv3uVLN286_WZ6coN-GNw=4DWja7NZzp5ytg@mail.gmail.com>

On Mon, Nov 20, 2023 at 10:07:51AM -0800, Linus Torvalds wrote:
> Of course, "do it in shared generic code" doesn't tend to really fix
> the braindamage, but at least it's now shared braindamage and not
> spread out all over. I'm looking at things like
> generic_ci_d_compare(), and it hurts to see the mindless "let's do
> lookups and compares one utf8 character at a time". What a disgrace.
> Somebody either *really* didn't care, or was a Unicode person who
> didn't understand the point of UTF-8.

This isn't because of case-folding brain damage, but rather Unicode
brain damage.  We compare one character at a time because it's
possible for some character like é to either be encoded as 0x0089 (aka
"Latin Small Letter E with Acute") OR as 0x0065 0x0301 ("Latin Small
Letter E" plus "Combining Acute Accent").

Typically, we pretend that UTF-8 means that we can just encode é, or
0x0089 as 0xC3 0xA9 and then call it a day and just use strcmp(3) on
the sucker.  But Unicode is a lot more insane than that.  Technically,
0x65 0xCC 0x81 is the same character as 0xC3 0xA9.

> Oh well. I guess people went "this is going to suck anyway, so let's
> make sure it *really* sucks".

It's more like, "this is going to suck, but if it's going to suck
anyway, let's implement the full Unicode spec in all its gory^H^H^H^H
glory, whether or not it's sane".


					- Ted

  reply	other threads:[~2023-11-21  2:03 UTC|newest]

Thread overview: 65+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-16  5:07 [PATCH v6 0/9] Support negative dentries on case-insensitive ext4 and f2fs Gabriel Krisman Bertazi
2023-08-16  5:07 ` [PATCH v6 1/9] ecryptfs: Reject casefold directory inodes Gabriel Krisman Bertazi
2023-08-16  5:07 ` [PATCH v6 2/9] 9p: Split ->weak_revalidate from ->revalidate Gabriel Krisman Bertazi
2023-08-16  5:07 ` [PATCH v6 3/9] fs: Expose name under lookup to d_revalidate hooks Gabriel Krisman Bertazi
2023-11-22 20:59   ` Al Viro
2023-08-16  5:07 ` [PATCH v6 4/9] fs: Add DCACHE_CASEFOLDED_NAME flag Gabriel Krisman Bertazi
2023-11-22 20:32   ` Al Viro
2023-08-16  5:07 ` [PATCH v6 5/9] libfs: Validate negative dentries in case-insensitive directories Gabriel Krisman Bertazi
2023-11-22 20:20   ` Al Viro
2023-08-16  5:08 ` [PATCH v6 6/9] libfs: Chain encryption checks after case-insensitive revalidation Gabriel Krisman Bertazi
2023-08-16  5:08 ` [PATCH v6 7/9] libfs: Merge encrypted_ci_dentry_ops and ci_dentry_ops Gabriel Krisman Bertazi
2023-08-16  5:08 ` [PATCH v6 8/9] ext4: Enable negative dentries on case-insensitive lookup Gabriel Krisman Bertazi
2023-08-16  5:08 ` [PATCH v6 9/9] f2fs: " Gabriel Krisman Bertazi
2023-08-17 17:06 ` [PATCH v6 0/9] Support negative dentries on case-insensitive ext4 and f2fs Eric Biggers
2023-08-21 15:52   ` Christian Brauner
2023-08-21 18:53     ` Gabriel Krisman Bertazi
2023-08-22  9:03       ` Christian Brauner
2023-10-24 22:20         ` Gabriel Krisman Bertazi
2023-10-25 13:32 ` Christian Brauner
2023-10-25 15:19   ` Gabriel Krisman Bertazi
2023-11-19 23:11   ` [f2fs-dev] " Gabriel Krisman Bertazi
     [not found]   ` <655a9634.630a0220.d50d7.5063SMTPIN_ADDED_BROKEN@mx.google.com>
2023-11-20 15:06     ` Christian Brauner
2023-11-20 16:59       ` Gabriel Krisman Bertazi
2023-11-20 18:07       ` Linus Torvalds
2023-11-21  2:02         ` Theodore Ts'o [this message]
2023-11-21  2:29           ` Linus Torvalds
2023-11-21  3:03             ` Linus Torvalds
2023-11-21  5:12               ` Theodore Ts'o
2023-11-22 21:04                 ` Al Viro
2023-11-21  2:27         ` Al Viro
2023-11-22 21:19           ` Al Viro
2023-11-23  0:18             ` Linus Torvalds
2023-11-23  5:09               ` Al Viro
2023-11-23 15:57               ` Gabriel Krisman Bertazi
2023-11-23 17:12                 ` Al Viro
2023-11-23 17:37                   ` Gabriel Krisman Bertazi
2023-11-23 18:24                     ` Al Viro
2023-11-23 19:06                       ` Gabriel Krisman Bertazi
2023-11-23 19:53                         ` Al Viro
2023-11-23 20:15                           ` Al Viro
2023-11-24 15:20                           ` Gabriel Krisman Bertazi
2023-11-28  0:02                             ` Al Viro
2023-11-23 21:52                         ` Al Viro
2023-11-24 15:22                           ` Gabriel Krisman Bertazi
2023-11-25 22:01                             ` Al Viro
2023-11-26  4:52                               ` Al Viro
2023-11-26 18:41                                 ` fun with d_invalidate() vs. d_splice_alias() was " Al Viro
2023-11-27  6:38                                   ` Al Viro
2023-11-27 15:47                                     ` Eric W. Biederman
2023-11-27 16:01                                       ` Eric W. Biederman
2023-11-27 17:25                                         ` Al Viro
2023-11-27 18:26                                           ` Al Viro
2023-11-27 16:03                                       ` Al Viro
2023-11-27 16:14                                         ` Al Viro
2023-11-27 18:19                                           ` Eric W. Biederman
2023-11-27 18:43                                             ` Al Viro
2023-11-27 16:33                                     ` Christian Brauner
2023-11-29  4:53                                     ` Al Viro
2023-11-29 10:21                                       ` Christian Brauner
2023-11-29 15:19                                       ` Eric W. Biederman
     [not found]               ` <655f7665.df0a0220.58a21.e84fSMTPIN_ADDED_BROKEN@mx.google.com>
2023-11-23 16:41                 ` Linus Torvalds
2023-11-23  1:12             ` Al Viro
2023-11-23  1:22               ` Al Viro
2023-11-22  3:30         ` Gabriel Krisman Bertazi
2024-01-16 19:02 ` patchwork-bot+f2fs

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20231121020254.GB291888@mit.edu \
    --to=tytso@mit.edu \
    --cc=brauner@kernel.org \
    --cc=ebiggers@kernel.org \
    --cc=jaegeuk@kernel.org \
    --cc=krisman@suse.de \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-f2fs-devel@lists.sourceforge.net \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).