From: Gabriel Krisman Bertazi <krisman@suse.de>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: ebiggers@kernel.org, viro@zeniv.linux.org.uk, tytso@mit.edu,
linux-fsdevel@vger.kernel.org, jaegeuk@kernel.org
Subject: [PATCH v4] libfs: Attempt exact-match comparison first during casefolded lookup
Date: Wed, 24 Jan 2024 15:13:40 -0300 [thread overview]
Message-ID: <87ttn2sip7.fsf_-_@mailhost.krisman.be> (raw)
In-Reply-To: <CAHk-=wh+4Msg7RKv6mvKz2LfNwK24zKFhnLUyxsrKzsXqni+Kg@mail.gmail.com> (Linus Torvalds's message of "Sun, 21 Jan 2024 11:09:01 -0800")
Linus Torvalds <torvalds@linux-foundation.org> writes:
> On Sun, 21 Jan 2024 at 08:34, Gabriel Krisman Bertazi <krisman@suse.de> wrote:
>>
>> Are you ok with the earlier v2 of this patchset as-is, and I'll send a
>> new series proposing this change?
>
> Yeah, possibly updating just the commit log to mention how
> __d_lookup_rcu_op_compare() takes care of the data race.
Just for completeness, below the version I intend to apply to
unicode/for-next , which is the v2, plus the comments you and Eric
requested. That is, unless something else comes up.
> I do think that just doing the exact check in
> __d_lookup_rcu_op_compare() would then avoid things like the indirect
> call for that (presumably common) case, but it's not a big deal.
I will also follow up with a new patch for this shortly.
Thanks for the reviews.
-- >8 --
Subject: [PATCH v4] libfs: Attempt exact-match comparison first during
casefolded lookup
Casefolded comparisons are (obviously) way more costly than a simple
memcmp. Try the case-sensitive comparison first, falling-back to the
case-insensitive lookup only when needed. This allows any exact-match
lookup to complete without having to walk the utf8 trie.
Note that, for strict mode, generic_ci_d_compare used to reject an
invalid UTF-8 string, which would now be considered valid if it
exact-matches the disk-name. But, if that is the case, the filesystem
is corrupt. More than that, it really doesn't matter in practice,
because the name-under-lookup will have already been rejected by
generic_ci_d_hash and we won't even get here.
The memcmp is safe under RCU because we are operating on str/len instead
of dentry->d_name directly, and the caller guarantees their consistency
between each other in __d_lookup_rcu_op_compare.
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Gabriel Krisman Bertazi <krisman@suse.de>
---
fs/libfs.c | 40 +++++++++++++++++++++++-----------------
1 file changed, 23 insertions(+), 17 deletions(-)
diff --git a/fs/libfs.c b/fs/libfs.c
index eec6031b0155..306a0510b7dc 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1704,16 +1704,28 @@ bool is_empty_dir_inode(struct inode *inode)
static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
const char *str, const struct qstr *name)
{
- const struct dentry *parent = READ_ONCE(dentry->d_parent);
- const struct inode *dir = READ_ONCE(parent->d_inode);
- const struct super_block *sb = dentry->d_sb;
- const struct unicode_map *um = sb->s_encoding;
- struct qstr qstr = QSTR_INIT(str, len);
+ const struct dentry *parent;
+ const struct inode *dir;
char strbuf[DNAME_INLINE_LEN];
- int ret;
+ struct qstr qstr;
+
+ /*
+ * Attempt a case-sensitive match first. It is cheaper and
+ * should cover most lookups, including all the sane
+ * applications that expect a case-sensitive filesystem.
+ *
+ * This comparison is safe under RCU because the caller
+ * guarantees the consistency between str and len. See
+ * __d_lookup_rcu_op_compare() for details.
+ */
+ if (len == name->len && !memcmp(str, name->name, len))
+ return 0;
+ parent = READ_ONCE(dentry->d_parent);
+ dir = READ_ONCE(parent->d_inode);
if (!dir || !IS_CASEFOLDED(dir))
- goto fallback;
+ return 1;
+
/*
* If the dentry name is stored in-line, then it may be concurrently
* modified by a rename. If this happens, the VFS will eventually retry
@@ -1724,20 +1736,14 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
if (len <= DNAME_INLINE_LEN - 1) {
memcpy(strbuf, str, len);
strbuf[len] = 0;
- qstr.name = strbuf;
+ str = strbuf;
/* prevent compiler from optimizing out the temporary buffer */
barrier();
}
- ret = utf8_strncasecmp(um, name, &qstr);
- if (ret >= 0)
- return ret;
+ qstr.len = len;
+ qstr.name = str;
- if (sb_has_strict_encoding(sb))
- return -EINVAL;
-fallback:
- if (len != name->len)
- return 1;
- return !!memcmp(str, name->name, len);
+ return utf8_strncasecmp(dentry->d_sb->s_encoding, name, &qstr);
}
/**
--
2.43.0
next prev parent reply other threads:[~2024-01-24 18:13 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-01-19 20:25 [PATCH v3 0/2] Try exact-match comparison ahead of case-insensitive match Gabriel Krisman Bertazi
2024-01-19 20:25 ` [PATCH v3 1/2] dcache: Expose dentry_string_cmp outside of dcache Gabriel Krisman Bertazi
2024-01-19 20:48 ` Linus Torvalds
2024-01-21 16:34 ` Gabriel Krisman Bertazi
2024-01-21 19:09 ` Linus Torvalds
2024-01-24 18:13 ` Gabriel Krisman Bertazi [this message]
2024-01-24 18:42 ` [PATCH v4] libfs: Attempt exact-match comparison first during casefolded lookup Linus Torvalds
2024-01-24 22:44 ` Eric Biggers
2024-01-19 20:25 ` [PATCH v3 2/2] libfs: Attempt exact-match comparison first during casefold lookup Gabriel Krisman Bertazi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=87ttn2sip7.fsf_-_@mailhost.krisman.be \
--to=krisman@suse.de \
--cc=ebiggers@kernel.org \
--cc=jaegeuk@kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=torvalds@linux-foundation.org \
--cc=tytso@mit.edu \
--cc=viro@zeniv.linux.org.uk \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).