From: Gabriel Krisman Bertazi <krisman@collabora.com>
To: tytso@mit.edu
Cc: linux-ext4@vger.kernel.org,
	Gabriel Krisman Bertazi <krisman@collabora.com>
Subject: [PATCH e2fsprogs 04/11] ext2fs: Implement faster CI comparison of strings
Date: Wed, 25 Mar 2020 17:18:04 -0400
Message-ID: <20200325211812.2971787-5-krisman@collabora.com>
In-Reply-To: <20200325211812.2971787-1-krisman@collabora.com>

Instead of calling casefold twice and running memcmp on the results,
which requires allocating temporary buffers for the casefolded strings,
add a strcasecmp-like method that compares the two strings byte by byte
during the casefold itself.

The method is exported through ext2fs.h because e2fsck needs to call it
directly.

Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
---
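
For readers unfamiliar with the cursor API, here is a minimal,
self-contained sketch of the same compare-while-folding loop. It is not
the code in this patch: struct toy_cursor, toy_cursor_init(), toy_byte()
and toy_casefold_cmp() are hypothetical stand-ins for struct utf8cursor,
utf8ncursor(), utf8byte() and utf8_casefold_cmp(), and the "folding" is
plain ASCII tolower() rather than UTF-8 NFD casefolding. The point is
only that no temporary buffer is needed.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical stand-in for struct utf8cursor: a length-bounded walk
 * over the input that yields one folded byte per call. */
struct toy_cursor {
	const unsigned char *s;
	size_t len;
};

static void toy_cursor_init(struct toy_cursor *cur,
			    const unsigned char *s, size_t len)
{
	cur->s = s;
	cur->len = len;
}

/* Return the next folded byte, or 0 once the input is exhausted.
 * ASCII-only: tolower() stands in for real casefolding. */
static int toy_byte(struct toy_cursor *cur)
{
	if (cur->len == 0)
		return 0;
	cur->len--;
	return tolower(*cur->s++);
}

/* Return 0 if the two names are equal under folding, non-zero
 * otherwise. Nothing is allocated; folding and comparing happen in
 * the same pass. */
static int toy_casefold_cmp(const unsigned char *str1, size_t len1,
			    const unsigned char *str2, size_t len2)
{
	struct toy_cursor cur1, cur2;
	int c1, c2;

	toy_cursor_init(&cur1, str1, len1);
	toy_cursor_init(&cur2, str2, len2);

	do {
		c1 = toy_byte(&cur1);
		c2 = toy_byte(&cur2);
		if (c1 != c2)
			return c1 - c2;
	} while (c1);

	return 0;
}
```

With this sketch, toy_casefold_cmp() on "README" and "readme" (both of
length 6) returns 0. In the real utf8_casefold_cmp() below, -1 is also
the error return, so a result of -1 can mean either a failure or a
mismatch where c1 - c2 happens to be -1; callers can only rely on zero
versus non-zero.
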
 lib/ext2fs/ext2fs.h   |  4 ++++
 lib/ext2fs/ext2fsP.h  |  4 ++++
 lib/ext2fs/nls_utf8.c | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+)

diff --git a/lib/ext2fs/ext2fs.h b/lib/ext2fs/ext2fs.h
index bf54130f4edb..c5815c37bbb6 100644
--- a/lib/ext2fs/ext2fs.h
+++ b/lib/ext2fs/ext2fs.h
@@ -1613,6 +1613,10 @@ extern errcode_t ext2fs_new_dir_inline_data(ext2_filsys fs, ext2_ino_t dir_ino,
 extern const struct ext2fs_nls_table *ext2fs_load_nls_table(int encoding);
 extern int ext2fs_check_encoded_name(const struct ext2fs_nls_table *table,
 				     char *s, size_t len, char **pos);
+extern int ext2fs_casefold_cmp(const struct ext2fs_nls_table *table,
+			       const unsigned char *str1, size_t len1,
+			       const unsigned char *str2, size_t len2);
+
 
 /* mkdir.c */
 extern errcode_t ext2fs_mkdir(ext2_filsys fs, ext2_ino_t parent, ext2_ino_t inum,
diff --git a/lib/ext2fs/ext2fsP.h b/lib/ext2fs/ext2fsP.h
index 30564ded1e2b..99239be007f2 100644
--- a/lib/ext2fs/ext2fsP.h
+++ b/lib/ext2fs/ext2fsP.h
@@ -106,6 +106,10 @@ struct ext2fs_nls_ops {
 			unsigned char *dest, size_t dlen);
 	int (*validate)(const struct ext2fs_nls_table *table,
 			char *s, size_t len, char **pos);
+	int (*casefold_cmp)(const struct ext2fs_nls_table *table,
+			    const unsigned char *str1, size_t len1,
+			    const unsigned char *str2, size_t len2);
+
 };
 
 /* Function prototypes */
diff --git a/lib/ext2fs/nls_utf8.c b/lib/ext2fs/nls_utf8.c
index f59484142e19..f85b8e77e47b 100644
--- a/lib/ext2fs/nls_utf8.c
+++ b/lib/ext2fs/nls_utf8.c
@@ -949,9 +949,36 @@ static int utf8_validate(const struct ext2fs_nls_table *table,
 	return 0;
 }
 
+static int utf8_casefold_cmp(const struct ext2fs_nls_table *table,
+			     const unsigned char *str1, size_t len1,
+			     const unsigned char *str2, size_t len2)
+{
+	const struct utf8data *data = utf8nfdicf(table->version);
+	int c1, c2;
+	struct utf8cursor cur1, cur2;
+
+	if (utf8ncursor(&cur1, data, (const char *) str1, len1) < 0)
+		return -1;
+	if (utf8ncursor(&cur2, data, (const char *) str2, len2) < 0)
+		return -1;
+
+	do {
+		c1 = utf8byte(&cur1);
+		c2 = utf8byte(&cur2);
+
+		if (c1 < 0 || c2 < 0)
+			return -1;
+		if (c1 != c2)
+			return c1 - c2;
+	} while (c1);
+
+	return 0;
+}
+
 static const struct ext2fs_nls_ops utf8_ops = {
 	.casefold = utf8_casefold,
 	.validate = utf8_validate,
+	.casefold_cmp = utf8_casefold_cmp,
 };
 
 static const struct ext2fs_nls_table nls_utf8 = {
@@ -972,3 +999,9 @@ int ext2fs_check_encoded_name(const struct ext2fs_nls_table *table,
 {
 	return table->ops->validate(table, name, len, pos);
 }
+int ext2fs_casefold_cmp(const struct ext2fs_nls_table *table,
+			const unsigned char *str1, size_t len1,
+			const unsigned char *str2, size_t len2)
+{
+	return table->ops->casefold_cmp(table, str1, len1, str2, len2);
+}
-- 
2.25.0

