* [PATCH v3 0/4] Make UTF-8 encoding loadable
@ 2021-03-23 18:31 ` Shreeya Patel
  0 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 18:31 UTC (permalink / raw)
  To: tytso, adilger.kernel, jaegeuk, chao, krisman, ebiggers, drosen,
	ebiggers, yuchao0
  Cc: linux-ext4, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	kernel, andre.almeida

utf8data.h_shipped contains a large, auto-generated decoding trie used by
the Unicode normalization functions, and it is not necessary to carry this
large table built into the kernel.
The goal is to make the UTF-8 encoding loadable by converting it into a
module and adding a layer between the filesystems and the utf8 module; the
layer loads the module whenever a filesystem that needs Unicode is mounted.

The 1st patch in the series resolves the warning reported by the kernel
test robot, and the 2nd patch fixes the incorrect use of utf8_unload() in
the ext4 and f2fs filesystems.

Unicode is the subsystem and utf8 is a character encoding for the
subsystem; hence the 3rd and 4th patches in the series rename the functions
and the file name to unicode, to make the difference between the UTF-8
module and the unicode layer easier to understand.

The last patch in the series adds the layer and the utf8 module, and also
uses static_call(), which was introduced to help prevent speculative
execution attacks.

---
Changes in v3
  - Add a patch which checks if utf8 is loaded before calling utf8_unload()
    in ext4 and f2fs filesystems
  - Return error if strscpy() returns value < 0
  - Correct the conditions to prevent NULL pointer dereference while
    accessing functions via utf8_ops variable.
  - Add spinlock to avoid race conditions.
  - Use static_call() for preventing speculative execution attacks.

Changes in v2
  - Remove the duplicate file from the last patch.
  - Make the wrapper functions inline.
  - Remove msleep and use try_module_get() and module_put()
    for ensuring that module is loaded correctly and also
    doesn't get unloaded while in use.
  - Resolve the warning reported by kernel test robot.
  - Resolve all the checkpatch.pl warnings.
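
For readers who want a picture of the layering described above, here is a
minimal userspace sketch, not the code from this series. It assumes an
operations table that the encoding module fills in on load and a wrapper
that checks the table before calling through it. All demo_* names, the
-ENODEV return value and the ASCII-only backend are hypothetical, and the
sketch leaves out the locking, module reference counting and static_call()
use mentioned in the change notes.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical operations table that the encoding module would provide. */
struct demo_utf8_ops {
	int (*validate)(const char *str, size_t len);
};

/* Stays NULL until the (simulated) module registers itself. */
const struct demo_utf8_ops *demo_ops;

void demo_register(const struct demo_utf8_ops *ops)
{
	demo_ops = ops;
}

void demo_unregister(void)
{
	demo_ops = NULL;
}

/*
 * Layer entry point used by a filesystem. It never dereferences a NULL
 * table; -ENODEV is an arbitrary choice for "encoding not loaded".
 */
int demo_unicode_validate(const char *str, size_t len)
{
	if (!demo_ops || !demo_ops->validate)
		return -ENODEV;
	return demo_ops->validate(str, len);
}

/* Toy backend: accepts 7-bit ASCII only, standing in for real UTF-8 checks. */
int demo_ascii_validate(const char *str, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if ((unsigned char)str[i] > 0x7f)
			return -1;
	return 0;
}

const struct demo_utf8_ops demo_backend = {
	.validate = demo_ascii_validate,
};
```

Before demo_register(&demo_backend) is called, demo_unicode_validate()
returns -ENODEV instead of crashing; afterwards it forwards to the backend.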

Shreeya Patel (4):
  fs: unicode: Use strscpy() instead of strncpy()
  fs: Check if utf8 encoding is loaded before calling utf8_unload()
  fs: unicode: Rename function names from utf8 to unicode
  fs: unicode: Rename utf8-core file to unicode-core

 fs/ext4/hash.c                             |  2 +-
 fs/ext4/namei.c                            | 12 ++---
 fs/ext4/super.c                            |  8 +--
 fs/f2fs/dir.c                              | 12 ++---
 fs/f2fs/super.c                            | 11 ++--
 fs/libfs.c                                 |  6 +--
 fs/unicode/Makefile                        |  2 +-
 fs/unicode/{utf8-core.c => unicode-core.c} | 62 +++++++++++-----------
 fs/unicode/utf8-selftest.c                 |  8 +--
 include/linux/unicode.h                    | 32 +++++------
 10 files changed, 81 insertions(+), 74 deletions(-)
 rename fs/unicode/{utf8-core.c => unicode-core.c} (72%)

-- 
2.30.1



* [PATCH v3 1/5] fs: unicode: Use strscpy() instead of strncpy()
  2021-03-23 18:31 ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 18:31   ` Shreeya Patel
  -1 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 18:31 UTC (permalink / raw)
  To: tytso, adilger.kernel, jaegeuk, chao, krisman, ebiggers, drosen,
	ebiggers, yuchao0
  Cc: linux-ext4, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	kernel, andre.almeida, kernel test robot

The following warning was reported by the kernel test robot.

In function 'utf8_parse_version',
inlined from 'utf8_load' at fs/unicode/utf8mod.c:195:7:
>> fs/unicode/utf8mod.c:175:2: warning: 'strncpy' specified bound 12 equals
destination size [-Wstringop-truncation]
175 |  strncpy(version_string, version, sizeof(version_string));
    |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The -Wstringop-truncation warning highlights unintended uses of
strncpy() that can leave the destination without a terminating NUL
character when the source string is at least as long as the bound.
Unlike strncpy(), strscpy() always NUL-terminates the destination string
and returns a negative error code on truncation, hence use strscpy()
instead of strncpy().

Fixes: 9d53690f0d4e5 ("unicode: implement higher level API for string handling")
Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
Reported-by: kernel test robot <lkp@intel.com>
---

Changes in v3
  - Return error if strscpy() returns value < 0

Changes in v2
  - Resolve warning of -Wstringop-truncation reported by
    kernel test robot.
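
To illustrate the behaviour this patch relies on, here is a small userspace
sketch. demo_strscpy() is a hypothetical stand-in that mimics the documented
semantics of the kernel's strscpy() (always NUL-terminate, return -E2BIG on
truncation); it is not the kernel implementation, and demo_copy_version() is
likewise only an approximation of the patched call site.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/*
 * Userspace stand-in for strscpy() (hypothetical name): copies at most
 * size - 1 bytes, always NUL-terminates when size > 0, and returns the
 * number of bytes copied or -E2BIG if the source did not fit.
 */
long demo_strscpy(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -E2BIG;

	len = strlen(src);
	if (len >= size) {
		/* Truncate, but keep the destination a valid C string. */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len + 1);	/* includes the terminating NUL */
	return (long)len;
}

/* Approximates the patched call site: fail if the version does not fit. */
int demo_copy_version(const char *version, char *out, size_t out_size)
{
	long ret = demo_strscpy(out, version, out_size);

	if (ret < 0)
		return (int)ret;
	return 0;
}
```

With a 12-byte buffer, "12.1.0" is copied and 0 is returned, while a
12-character string is truncated to 11 characters plus NUL and -E2BIG is
returned. strncpy() with the same bound would have copied all 12 characters
and left the buffer unterminated.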

 fs/unicode/utf8-core.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
index dc25823bf..706f086bb 100644
--- a/fs/unicode/utf8-core.c
+++ b/fs/unicode/utf8-core.c
@@ -180,7 +180,10 @@ static int utf8_parse_version(const char *version, unsigned int *maj,
 		{0, NULL}
 	};
 
-	strncpy(version_string, version, sizeof(version_string));
+	int ret = strscpy(version_string, version, sizeof(version_string));
+
+	if (ret < 0)
+		return ret;
 
 	if (match_token(version_string, token, args) != 1)
 		return -EINVAL;
-- 
2.24.3 (Apple Git-128)



* [PATCH v3 2/5] fs: Check if utf8 encoding is loaded before calling utf8_unload()
  2021-03-23 18:31 ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 18:31   ` Shreeya Patel
  -1 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 18:31 UTC (permalink / raw)
  To: tytso, adilger.kernel, jaegeuk, chao, krisman, ebiggers, drosen,
	ebiggers, yuchao0
  Cc: linux-ext4, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	kernel, andre.almeida

utf8_unload() is called whenever CONFIG_UNICODE is enabled.
The ifdef block doesn't check whether the utf8 encoding has actually been
loaded before calling utf8_unload().
This is not the expected behavior, since it can lead to unloading utf8
before it was ever loaded.
Hence, add a condition which checks that sb->s_encoding is not NULL
before calling utf8_unload().

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
---

Changes in v3
  - Add this patch to the series which checks if utf8 encoding
    was loaded before calling utf8_unload().
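
A minimal userspace sketch of why the guard matters, under one assumption:
that unload will eventually drop a module reference taken by load, as the
try_module_get()/module_put() note in the cover letter suggests. All demo_*
names and the counter are hypothetical and only model that pairing.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins; these are not the kernel's types. */
struct demo_map {
	int version;
};

struct demo_sb {
	struct demo_map *s_encoding;
};

/* Models a module reference count: load takes one, unload drops one. */
int demo_refcount;

struct demo_map *demo_load(int version)
{
	struct demo_map *um = malloc(sizeof(*um));

	if (!um)
		return NULL;
	um->version = version;
	demo_refcount++;
	return um;
}

void demo_unload(struct demo_map *um)
{
	demo_refcount--;	/* would underflow if nothing was loaded */
	free(um);
}

/* Teardown guarded the same way as in this patch. */
void demo_put_super(struct demo_sb *sb)
{
	if (sb->s_encoding) {
		demo_unload(sb->s_encoding);
		sb->s_encoding = NULL;
	}
}
```

Tearing down a superblock that never loaded an encoding leaves the counter
at 0; without the NULL check it would drop to -1.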
 
 fs/ext4/super.c | 6 ++++--
 fs/f2fs/super.c | 9 ++++++---
 2 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index ad34a3727..e438d14f9 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1259,7 +1259,8 @@ static void ext4_put_super(struct super_block *sb)
 	fs_put_dax(sbi->s_daxdev);
 	fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
 #ifdef CONFIG_UNICODE
-	utf8_unload(sb->s_encoding);
+	if (sb->s_encoding)
+		utf8_unload(sb->s_encoding);
 #endif
 	kfree(sbi);
 }
@@ -5165,7 +5166,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		crypto_free_shash(sbi->s_chksum_driver);
 
 #ifdef CONFIG_UNICODE
-	utf8_unload(sb->s_encoding);
+	if (sb->s_encoding)
+		utf8_unload(sb->s_encoding);
 #endif
 
 #ifdef CONFIG_QUOTA
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 706979375..0a04983c2 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1430,7 +1430,8 @@ static void f2fs_put_super(struct super_block *sb)
 	for (i = 0; i < NR_PAGE_TYPE; i++)
 		kvfree(sbi->write_io[i]);
 #ifdef CONFIG_UNICODE
-	utf8_unload(sb->s_encoding);
+	if (sb->s_encoding)
+		utf8_unload(sb->s_encoding);
 #endif
 	kfree(sbi);
 }
@@ -4073,8 +4074,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 		kvfree(sbi->write_io[i]);
 
 #ifdef CONFIG_UNICODE
-	utf8_unload(sb->s_encoding);
-	sb->s_encoding = NULL;
+	if (sb->s_encoding) {
+		utf8_unload(sb->s_encoding);
+		sb->s_encoding = NULL;
+	}
 #endif
 free_options:
 #ifdef CONFIG_QUOTA
-- 
2.24.3 (Apple Git-128)



* [PATCH v3 3/5] fs: unicode: Rename function names from utf8 to unicode
  2021-03-23 18:31 ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 18:31   ` Shreeya Patel
  -1 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 18:31 UTC (permalink / raw)
  To: tytso, adilger.kernel, jaegeuk, chao, krisman, ebiggers, drosen,
	ebiggers, yuchao0
  Cc: linux-ext4, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	kernel, andre.almeida

Rename the functions from utf8_* to unicode_* as a first step towards
transforming the utf8-core file into the unicode subsystem layer file.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
---
 fs/ext4/hash.c             |  2 +-
 fs/ext4/namei.c            | 12 ++++----
 fs/ext4/super.c            |  6 ++--
 fs/f2fs/dir.c              | 12 ++++----
 fs/f2fs/super.c            |  6 ++--
 fs/libfs.c                 |  6 ++--
 fs/unicode/utf8-core.c     | 57 +++++++++++++++++++-------------------
 fs/unicode/utf8-selftest.c |  8 +++---
 include/linux/unicode.h    | 32 ++++++++++-----------
 9 files changed, 70 insertions(+), 71 deletions(-)

diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index a92eb79de..8890a76ab 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -285,7 +285,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		if (!buff)
 			return -ENOMEM;
 
-		dlen = utf8_casefold(um, &qstr, buff, PATH_MAX);
+		dlen = unicode_casefold(um, &qstr, buff, PATH_MAX);
 		if (dlen < 0) {
 			kfree(buff);
 			goto opaque_seq;
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 686bf982c..dde5ce795 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1290,9 +1290,9 @@ int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
 	int ret;
 
 	if (quick)
-		ret = utf8_strncasecmp_folded(um, name, entry);
+		ret = unicode_strncasecmp_folded(um, name, entry);
 	else
-		ret = utf8_strncasecmp(um, name, entry);
+		ret = unicode_strncasecmp(um, name, entry);
 
 	if (ret < 0) {
 		/* Handle invalid character sequence as either an error
@@ -1324,9 +1324,9 @@ void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
 	if (!cf_name->name)
 		return;
 
-	len = utf8_casefold(dir->i_sb->s_encoding,
-			    iname, cf_name->name,
-			    EXT4_NAME_LEN);
+	len = unicode_casefold(dir->i_sb->s_encoding,
+			       iname, cf_name->name,
+			       EXT4_NAME_LEN);
 	if (len <= 0) {
 		kfree(cf_name->name);
 		cf_name->name = NULL;
@@ -2201,7 +2201,7 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
 
 #ifdef CONFIG_UNICODE
 	if (sb_has_strict_encoding(sb) && IS_CASEFOLDED(dir) &&
-	    sb->s_encoding && utf8_validate(sb->s_encoding, &dentry->d_name))
+	    sb->s_encoding && unicode_validate(sb->s_encoding, &dentry->d_name))
 		return -EINVAL;
 #endif
 
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index e438d14f9..853aeb294 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1260,7 +1260,7 @@ static void ext4_put_super(struct super_block *sb)
 	fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
 #ifdef CONFIG_UNICODE
 	if (sb->s_encoding)
-		utf8_unload(sb->s_encoding);
+		unicode_unload(sb->s_encoding);
 #endif
 	kfree(sbi);
 }
@@ -4305,7 +4305,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 			goto failed_mount;
 		}
 
-		encoding = utf8_load(encoding_info->version);
+		encoding = unicode_load(encoding_info->version);
 		if (IS_ERR(encoding)) {
 			ext4_msg(sb, KERN_ERR,
 				 "can't mount with superblock charset: %s-%s "
@@ -5167,7 +5167,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 
 #ifdef CONFIG_UNICODE
 	if (sb->s_encoding)
-		utf8_unload(sb->s_encoding);
+		unicode_unload(sb->s_encoding);
 #endif
 
 #ifdef CONFIG_QUOTA
diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index e6270a867..f160f9dd6 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -84,10 +84,10 @@ int f2fs_init_casefolded_name(const struct inode *dir,
 						   GFP_NOFS);
 		if (!fname->cf_name.name)
 			return -ENOMEM;
-		fname->cf_name.len = utf8_casefold(sb->s_encoding,
-						   fname->usr_fname,
-						   fname->cf_name.name,
-						   F2FS_NAME_LEN);
+		fname->cf_name.len = unicode_casefold(sb->s_encoding,
+						      fname->usr_fname,
+						      fname->cf_name.name,
+						      F2FS_NAME_LEN);
 		if ((int)fname->cf_name.len <= 0) {
 			kfree(fname->cf_name.name);
 			fname->cf_name.name = NULL;
@@ -237,7 +237,7 @@ static int f2fs_match_ci_name(const struct inode *dir, const struct qstr *name,
 		entry.len = decrypted_name.len;
 	}
 
-	res = utf8_strncasecmp_folded(um, name, &entry);
+	res = unicode_strncasecmp_folded(um, name, &entry);
 	/*
 	 * In strict mode, ignore invalid names.  In non-strict mode,
 	 * fall back to treating them as opaque byte sequences.
@@ -246,7 +246,7 @@ static int f2fs_match_ci_name(const struct inode *dir, const struct qstr *name,
 		res = name->len == entry.len &&
 				memcmp(name->name, entry.name, name->len) == 0;
 	} else {
-		/* utf8_strncasecmp_folded returns 0 on match */
+		/* unicode_strncasecmp_folded returns 0 on match */
 		res = (res == 0);
 	}
 out:
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0a04983c2..a0cd9bfa4 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1431,7 +1431,7 @@ static void f2fs_put_super(struct super_block *sb)
 		kvfree(sbi->write_io[i]);
 #ifdef CONFIG_UNICODE
 	if (sb->s_encoding)
-		utf8_unload(sb->s_encoding);
+		unicode_unload(sb->s_encoding);
 #endif
 	kfree(sbi);
 }
@@ -3561,7 +3561,7 @@ static int f2fs_setup_casefold(struct f2fs_sb_info *sbi)
 			return -EINVAL;
 		}
 
-		encoding = utf8_load(encoding_info->version);
+		encoding = unicode_load(encoding_info->version);
 		if (IS_ERR(encoding)) {
 			f2fs_err(sbi,
 				 "can't mount with superblock charset: %s-%s "
@@ -4075,7 +4075,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 
 #ifdef CONFIG_UNICODE
 	if (sb->s_encoding) {
-		utf8_unload(sb->s_encoding);
+		unicode_unload(sb->s_encoding);
 		sb->s_encoding = NULL;
 	}
 #endif
diff --git a/fs/libfs.c b/fs/libfs.c
index e2de5401a..766556165 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1404,7 +1404,7 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
 	 * If the dentry name is stored in-line, then it may be concurrently
 	 * modified by a rename.  If this happens, the VFS will eventually retry
 	 * the lookup, so it doesn't matter what ->d_compare() returns.
-	 * However, it's unsafe to call utf8_strncasecmp() with an unstable
+	 * However, it's unsafe to call unicode_strncasecmp() with an unstable
 	 * string.  Therefore, we have to copy the name into a temporary buffer.
 	 */
 	if (len <= DNAME_INLINE_LEN - 1) {
@@ -1414,7 +1414,7 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
 		/* prevent compiler from optimizing out the temporary buffer */
 		barrier();
 	}
-	ret = utf8_strncasecmp(um, name, &qstr);
+	ret = unicode_strncasecmp(um, name, &qstr);
 	if (ret >= 0)
 		return ret;
 
@@ -1443,7 +1443,7 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
 	if (!dir || !needs_casefold(dir))
 		return 0;
 
-	ret = utf8_casefold_hash(um, dentry, str);
+	ret = unicode_casefold_hash(um, dentry, str);
 	if (ret < 0 && sb_has_strict_encoding(sb))
 		return -EINVAL;
 	return 0;
diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
index 706f086bb..686e95e90 100644
--- a/fs/unicode/utf8-core.c
+++ b/fs/unicode/utf8-core.c
@@ -10,7 +10,7 @@
 
 #include "utf8n.h"
 
-int utf8_validate(const struct unicode_map *um, const struct qstr *str)
+int unicode_validate(const struct unicode_map *um, const struct qstr *str)
 {
 	const struct utf8data *data = utf8nfdi(um->version);
 
@@ -18,10 +18,10 @@ int utf8_validate(const struct unicode_map *um, const struct qstr *str)
 		return -1;
 	return 0;
 }
-EXPORT_SYMBOL(utf8_validate);
+EXPORT_SYMBOL(unicode_validate);
 
-int utf8_strncmp(const struct unicode_map *um,
-		 const struct qstr *s1, const struct qstr *s2)
+int unicode_strncmp(const struct unicode_map *um,
+		    const struct qstr *s1, const struct qstr *s2)
 {
 	const struct utf8data *data = utf8nfdi(um->version);
 	struct utf8cursor cur1, cur2;
@@ -45,10 +45,10 @@ int utf8_strncmp(const struct unicode_map *um,
 
 	return 0;
 }
-EXPORT_SYMBOL(utf8_strncmp);
+EXPORT_SYMBOL(unicode_strncmp);
 
-int utf8_strncasecmp(const struct unicode_map *um,
-		     const struct qstr *s1, const struct qstr *s2)
+int unicode_strncasecmp(const struct unicode_map *um,
+			const struct qstr *s1, const struct qstr *s2)
 {
 	const struct utf8data *data = utf8nfdicf(um->version);
 	struct utf8cursor cur1, cur2;
@@ -72,14 +72,14 @@ int utf8_strncasecmp(const struct unicode_map *um,
 
 	return 0;
 }
-EXPORT_SYMBOL(utf8_strncasecmp);
+EXPORT_SYMBOL(unicode_strncasecmp);
 
 /* String cf is expected to be a valid UTF-8 casefolded
  * string.
  */
-int utf8_strncasecmp_folded(const struct unicode_map *um,
-			    const struct qstr *cf,
-			    const struct qstr *s1)
+int unicode_strncasecmp_folded(const struct unicode_map *um,
+			       const struct qstr *cf,
+			       const struct qstr *s1)
 {
 	const struct utf8data *data = utf8nfdicf(um->version);
 	struct utf8cursor cur1;
@@ -100,10 +100,10 @@ int utf8_strncasecmp_folded(const struct unicode_map *um,
 
 	return 0;
 }
-EXPORT_SYMBOL(utf8_strncasecmp_folded);
+EXPORT_SYMBOL(unicode_strncasecmp_folded);
 
-int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
-		  unsigned char *dest, size_t dlen)
+int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
+		     unsigned char *dest, size_t dlen)
 {
 	const struct utf8data *data = utf8nfdicf(um->version);
 	struct utf8cursor cur;
@@ -123,10 +123,10 @@ int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
 	}
 	return -EINVAL;
 }
-EXPORT_SYMBOL(utf8_casefold);
+EXPORT_SYMBOL(unicode_casefold);
 
-int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
-		       struct qstr *str)
+int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
+			  struct qstr *str)
 {
 	const struct utf8data *data = utf8nfdicf(um->version);
 	struct utf8cursor cur;
@@ -144,10 +144,10 @@ int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
 	str->hash = end_name_hash(hash);
 	return 0;
 }
-EXPORT_SYMBOL(utf8_casefold_hash);
+EXPORT_SYMBOL(unicode_casefold_hash);
 
-int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
-		   unsigned char *dest, size_t dlen)
+int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
+		      unsigned char *dest, size_t dlen)
 {
 	const struct utf8data *data = utf8nfdi(um->version);
 	struct utf8cursor cur;
@@ -167,11 +167,10 @@ int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
 	}
 	return -EINVAL;
 }
+EXPORT_SYMBOL(unicode_normalize);
 
-EXPORT_SYMBOL(utf8_normalize);
-
-static int utf8_parse_version(const char *version, unsigned int *maj,
-			      unsigned int *min, unsigned int *rev)
+static int unicode_parse_version(const char *version, unsigned int *maj,
+				 unsigned int *min, unsigned int *rev)
 {
 	substring_t args[3];
 	char version_string[12];
@@ -195,7 +194,7 @@ static int utf8_parse_version(const char *version, unsigned int *maj,
 	return 0;
 }
 
-struct unicode_map *utf8_load(const char *version)
+struct unicode_map *unicode_load(const char *version)
 {
 	struct unicode_map *um = NULL;
 	int unicode_version;
@@ -203,7 +202,7 @@ struct unicode_map *utf8_load(const char *version)
 	if (version) {
 		unsigned int maj, min, rev;
 
-		if (utf8_parse_version(version, &maj, &min, &rev) < 0)
+		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
 			return ERR_PTR(-EINVAL);
 
 		if (!utf8version_is_supported(maj, min, rev))
@@ -228,12 +227,12 @@ struct unicode_map *utf8_load(const char *version)
 
 	return um;
 }
-EXPORT_SYMBOL(utf8_load);
+EXPORT_SYMBOL(unicode_load);
 
-void utf8_unload(struct unicode_map *um)
+void unicode_unload(struct unicode_map *um)
 {
 	kfree(um);
 }
-EXPORT_SYMBOL(utf8_unload);
+EXPORT_SYMBOL(unicode_unload);
 
 MODULE_LICENSE("GPL v2");
diff --git a/fs/unicode/utf8-selftest.c b/fs/unicode/utf8-selftest.c
index 6fe8af7ed..796c1ed92 100644
--- a/fs/unicode/utf8-selftest.c
+++ b/fs/unicode/utf8-selftest.c
@@ -235,7 +235,7 @@ static void check_utf8_nfdicf(void)
 static void check_utf8_comparisons(void)
 {
 	int i;
-	struct unicode_map *table = utf8_load("12.1.0");
+	struct unicode_map *table = unicode_load("12.1.0");
 
 	if (IS_ERR(table)) {
 		pr_err("%s: Unable to load utf8 %d.%d.%d. Skipping.\n",
@@ -249,7 +249,7 @@ static void check_utf8_comparisons(void)
 		const struct qstr s2 = {.name = nfdi_test_data[i].dec,
 					.len = sizeof(nfdi_test_data[i].dec)};
 
-		test_f(!utf8_strncmp(table, &s1, &s2),
+		test_f(!unicode_strncmp(table, &s1, &s2),
 		       "%s %s comparison mismatch\n", s1.name, s2.name);
 	}
 
@@ -259,11 +259,11 @@ static void check_utf8_comparisons(void)
 		const struct qstr s2 = {.name = nfdicf_test_data[i].ncf,
 					.len = sizeof(nfdicf_test_data[i].ncf)};
 
-		test_f(!utf8_strncasecmp(table, &s1, &s2),
+		test_f(!unicode_strncasecmp(table, &s1, &s2),
 		       "%s %s comparison mismatch\n", s1.name, s2.name);
 	}
 
-	utf8_unload(table);
+	unicode_unload(table);
 }
 
 static void check_supported_versions(void)
diff --git a/include/linux/unicode.h b/include/linux/unicode.h
index 74484d44c..de23f9ee7 100644
--- a/include/linux/unicode.h
+++ b/include/linux/unicode.h
@@ -10,27 +10,27 @@ struct unicode_map {
 	int version;
 };
 
-int utf8_validate(const struct unicode_map *um, const struct qstr *str);
+int unicode_validate(const struct unicode_map *um, const struct qstr *str);
 
-int utf8_strncmp(const struct unicode_map *um,
-		 const struct qstr *s1, const struct qstr *s2);
+int unicode_strncmp(const struct unicode_map *um,
+		    const struct qstr *s1, const struct qstr *s2);
 
-int utf8_strncasecmp(const struct unicode_map *um,
-		 const struct qstr *s1, const struct qstr *s2);
-int utf8_strncasecmp_folded(const struct unicode_map *um,
-			    const struct qstr *cf,
-			    const struct qstr *s1);
+int unicode_strncasecmp(const struct unicode_map *um,
+			const struct qstr *s1, const struct qstr *s2);
+int unicode_strncasecmp_folded(const struct unicode_map *um,
+			       const struct qstr *cf,
+			       const struct qstr *s1);
 
-int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
-		   unsigned char *dest, size_t dlen);
+int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
+		      unsigned char *dest, size_t dlen);
 
-int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
-		  unsigned char *dest, size_t dlen);
+int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
+		     unsigned char *dest, size_t dlen);
 
-int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
-		       struct qstr *str);
+int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
+			  struct qstr *str);
 
-struct unicode_map *utf8_load(const char *version);
-void utf8_unload(struct unicode_map *um);
+struct unicode_map *unicode_load(const char *version);
+void unicode_unload(struct unicode_map *um);
 
 #endif /* _LINUX_UNICODE_H */
-- 
2.24.3 (Apple Git-128)



* [f2fs-dev] [PATCH v3 3/5] fs: unicode: Rename function names from utf8 to unicode
@ 2021-03-23 18:31   ` Shreeya Patel
  0 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 18:31 UTC (permalink / raw)
  To: tytso, adilger.kernel, jaegeuk, chao, krisman, ebiggers, drosen,
	ebiggers, yuchao0
  Cc: kernel, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	andre.almeida, linux-ext4

Rename the functions from utf8 to unicode as the first step towards
transforming the utf8-core file into the unicode subsystem layer file.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
---
 fs/ext4/hash.c             |  2 +-
 fs/ext4/namei.c            | 12 ++++----
 fs/ext4/super.c            |  6 ++--
 fs/f2fs/dir.c              | 12 ++++----
 fs/f2fs/super.c            |  6 ++--
 fs/libfs.c                 |  6 ++--
 fs/unicode/utf8-core.c     | 57 +++++++++++++++++++-------------------
 fs/unicode/utf8-selftest.c |  8 +++---
 include/linux/unicode.h    | 32 ++++++++++-----------
 9 files changed, 70 insertions(+), 71 deletions(-)

diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index a92eb79de..8890a76ab 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -285,7 +285,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		if (!buff)
 			return -ENOMEM;
 
-		dlen = utf8_casefold(um, &qstr, buff, PATH_MAX);
+		dlen = unicode_casefold(um, &qstr, buff, PATH_MAX);
 		if (dlen < 0) {
 			kfree(buff);
 			goto opaque_seq;
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 686bf982c..dde5ce795 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1290,9 +1290,9 @@ int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
 	int ret;
 
 	if (quick)
-		ret = utf8_strncasecmp_folded(um, name, entry);
+		ret = unicode_strncasecmp_folded(um, name, entry);
 	else
-		ret = utf8_strncasecmp(um, name, entry);
+		ret = unicode_strncasecmp(um, name, entry);
 
 	if (ret < 0) {
 		/* Handle invalid character sequence as either an error
@@ -1324,9 +1324,9 @@ void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
 	if (!cf_name->name)
 		return;
 
-	len = utf8_casefold(dir->i_sb->s_encoding,
-			    iname, cf_name->name,
-			    EXT4_NAME_LEN);
+	len = unicode_casefold(dir->i_sb->s_encoding,
+			       iname, cf_name->name,
+			       EXT4_NAME_LEN);
 	if (len <= 0) {
 		kfree(cf_name->name);
 		cf_name->name = NULL;
@@ -2201,7 +2201,7 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
 
 #ifdef CONFIG_UNICODE
 	if (sb_has_strict_encoding(sb) && IS_CASEFOLDED(dir) &&
-	    sb->s_encoding && utf8_validate(sb->s_encoding, &dentry->d_name))
+	    sb->s_encoding && unicode_validate(sb->s_encoding, &dentry->d_name))
 		return -EINVAL;
 #endif
 
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index e438d14f9..853aeb294 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1260,7 +1260,7 @@ static void ext4_put_super(struct super_block *sb)
 	fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
 #ifdef CONFIG_UNICODE
 	if (sb->s_encoding)
-		utf8_unload(sb->s_encoding);
+		unicode_unload(sb->s_encoding);
 #endif
 	kfree(sbi);
 }
@@ -4305,7 +4305,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 			goto failed_mount;
 		}
 
-		encoding = utf8_load(encoding_info->version);
+		encoding = unicode_load(encoding_info->version);
 		if (IS_ERR(encoding)) {
 			ext4_msg(sb, KERN_ERR,
 				 "can't mount with superblock charset: %s-%s "
@@ -5167,7 +5167,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 
 #ifdef CONFIG_UNICODE
 	if (sb->s_encoding)
-		utf8_unload(sb->s_encoding);
+		unicode_unload(sb->s_encoding);
 #endif
 
 #ifdef CONFIG_QUOTA
diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index e6270a867..f160f9dd6 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -84,10 +84,10 @@ int f2fs_init_casefolded_name(const struct inode *dir,
 						   GFP_NOFS);
 		if (!fname->cf_name.name)
 			return -ENOMEM;
-		fname->cf_name.len = utf8_casefold(sb->s_encoding,
-						   fname->usr_fname,
-						   fname->cf_name.name,
-						   F2FS_NAME_LEN);
+		fname->cf_name.len = unicode_casefold(sb->s_encoding,
+						      fname->usr_fname,
+						      fname->cf_name.name,
+						      F2FS_NAME_LEN);
 		if ((int)fname->cf_name.len <= 0) {
 			kfree(fname->cf_name.name);
 			fname->cf_name.name = NULL;
@@ -237,7 +237,7 @@ static int f2fs_match_ci_name(const struct inode *dir, const struct qstr *name,
 		entry.len = decrypted_name.len;
 	}
 
-	res = utf8_strncasecmp_folded(um, name, &entry);
+	res = unicode_strncasecmp_folded(um, name, &entry);
 	/*
 	 * In strict mode, ignore invalid names.  In non-strict mode,
 	 * fall back to treating them as opaque byte sequences.
@@ -246,7 +246,7 @@ static int f2fs_match_ci_name(const struct inode *dir, const struct qstr *name,
 		res = name->len == entry.len &&
 				memcmp(name->name, entry.name, name->len) == 0;
 	} else {
-		/* utf8_strncasecmp_folded returns 0 on match */
+		/* unicode_strncasecmp_folded returns 0 on match */
 		res = (res == 0);
 	}
 out:
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 0a04983c2..a0cd9bfa4 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1431,7 +1431,7 @@ static void f2fs_put_super(struct super_block *sb)
 		kvfree(sbi->write_io[i]);
 #ifdef CONFIG_UNICODE
 	if (sb->s_encoding)
-		utf8_unload(sb->s_encoding);
+		unicode_unload(sb->s_encoding);
 #endif
 	kfree(sbi);
 }
@@ -3561,7 +3561,7 @@ static int f2fs_setup_casefold(struct f2fs_sb_info *sbi)
 			return -EINVAL;
 		}
 
-		encoding = utf8_load(encoding_info->version);
+		encoding = unicode_load(encoding_info->version);
 		if (IS_ERR(encoding)) {
 			f2fs_err(sbi,
 				 "can't mount with superblock charset: %s-%s "
@@ -4075,7 +4075,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 
 #ifdef CONFIG_UNICODE
 	if (sb->s_encoding) {
-		utf8_unload(sb->s_encoding);
+		unicode_unload(sb->s_encoding);
 		sb->s_encoding = NULL;
 	}
 #endif
diff --git a/fs/libfs.c b/fs/libfs.c
index e2de5401a..766556165 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -1404,7 +1404,7 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
 	 * If the dentry name is stored in-line, then it may be concurrently
 	 * modified by a rename.  If this happens, the VFS will eventually retry
 	 * the lookup, so it doesn't matter what ->d_compare() returns.
-	 * However, it's unsafe to call utf8_strncasecmp() with an unstable
+	 * However, it's unsafe to call unicode_strncasecmp() with an unstable
 	 * string.  Therefore, we have to copy the name into a temporary buffer.
 	 */
 	if (len <= DNAME_INLINE_LEN - 1) {
@@ -1414,7 +1414,7 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
 		/* prevent compiler from optimizing out the temporary buffer */
 		barrier();
 	}
-	ret = utf8_strncasecmp(um, name, &qstr);
+	ret = unicode_strncasecmp(um, name, &qstr);
 	if (ret >= 0)
 		return ret;
 
@@ -1443,7 +1443,7 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
 	if (!dir || !needs_casefold(dir))
 		return 0;
 
-	ret = utf8_casefold_hash(um, dentry, str);
+	ret = unicode_casefold_hash(um, dentry, str);
 	if (ret < 0 && sb_has_strict_encoding(sb))
 		return -EINVAL;
 	return 0;
diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
index 706f086bb..686e95e90 100644
--- a/fs/unicode/utf8-core.c
+++ b/fs/unicode/utf8-core.c
@@ -10,7 +10,7 @@
 
 #include "utf8n.h"
 
-int utf8_validate(const struct unicode_map *um, const struct qstr *str)
+int unicode_validate(const struct unicode_map *um, const struct qstr *str)
 {
 	const struct utf8data *data = utf8nfdi(um->version);
 
@@ -18,10 +18,10 @@ int utf8_validate(const struct unicode_map *um, const struct qstr *str)
 		return -1;
 	return 0;
 }
-EXPORT_SYMBOL(utf8_validate);
+EXPORT_SYMBOL(unicode_validate);
 
-int utf8_strncmp(const struct unicode_map *um,
-		 const struct qstr *s1, const struct qstr *s2)
+int unicode_strncmp(const struct unicode_map *um,
+		    const struct qstr *s1, const struct qstr *s2)
 {
 	const struct utf8data *data = utf8nfdi(um->version);
 	struct utf8cursor cur1, cur2;
@@ -45,10 +45,10 @@ int utf8_strncmp(const struct unicode_map *um,
 
 	return 0;
 }
-EXPORT_SYMBOL(utf8_strncmp);
+EXPORT_SYMBOL(unicode_strncmp);
 
-int utf8_strncasecmp(const struct unicode_map *um,
-		     const struct qstr *s1, const struct qstr *s2)
+int unicode_strncasecmp(const struct unicode_map *um,
+			const struct qstr *s1, const struct qstr *s2)
 {
 	const struct utf8data *data = utf8nfdicf(um->version);
 	struct utf8cursor cur1, cur2;
@@ -72,14 +72,14 @@ int utf8_strncasecmp(const struct unicode_map *um,
 
 	return 0;
 }
-EXPORT_SYMBOL(utf8_strncasecmp);
+EXPORT_SYMBOL(unicode_strncasecmp);
 
 /* String cf is expected to be a valid UTF-8 casefolded
  * string.
  */
-int utf8_strncasecmp_folded(const struct unicode_map *um,
-			    const struct qstr *cf,
-			    const struct qstr *s1)
+int unicode_strncasecmp_folded(const struct unicode_map *um,
+			       const struct qstr *cf,
+			       const struct qstr *s1)
 {
 	const struct utf8data *data = utf8nfdicf(um->version);
 	struct utf8cursor cur1;
@@ -100,10 +100,10 @@ int utf8_strncasecmp_folded(const struct unicode_map *um,
 
 	return 0;
 }
-EXPORT_SYMBOL(utf8_strncasecmp_folded);
+EXPORT_SYMBOL(unicode_strncasecmp_folded);
 
-int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
-		  unsigned char *dest, size_t dlen)
+int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
+		     unsigned char *dest, size_t dlen)
 {
 	const struct utf8data *data = utf8nfdicf(um->version);
 	struct utf8cursor cur;
@@ -123,10 +123,10 @@ int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
 	}
 	return -EINVAL;
 }
-EXPORT_SYMBOL(utf8_casefold);
+EXPORT_SYMBOL(unicode_casefold);
 
-int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
-		       struct qstr *str)
+int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
+			  struct qstr *str)
 {
 	const struct utf8data *data = utf8nfdicf(um->version);
 	struct utf8cursor cur;
@@ -144,10 +144,10 @@ int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
 	str->hash = end_name_hash(hash);
 	return 0;
 }
-EXPORT_SYMBOL(utf8_casefold_hash);
+EXPORT_SYMBOL(unicode_casefold_hash);
 
-int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
-		   unsigned char *dest, size_t dlen)
+int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
+		      unsigned char *dest, size_t dlen)
 {
 	const struct utf8data *data = utf8nfdi(um->version);
 	struct utf8cursor cur;
@@ -167,11 +167,10 @@ int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
 	}
 	return -EINVAL;
 }
+EXPORT_SYMBOL(unicode_normalize);
 
-EXPORT_SYMBOL(utf8_normalize);
-
-static int utf8_parse_version(const char *version, unsigned int *maj,
-			      unsigned int *min, unsigned int *rev)
+static int unicode_parse_version(const char *version, unsigned int *maj,
+				 unsigned int *min, unsigned int *rev)
 {
 	substring_t args[3];
 	char version_string[12];
@@ -195,7 +194,7 @@ static int utf8_parse_version(const char *version, unsigned int *maj,
 	return 0;
 }
 
-struct unicode_map *utf8_load(const char *version)
+struct unicode_map *unicode_load(const char *version)
 {
 	struct unicode_map *um = NULL;
 	int unicode_version;
@@ -203,7 +202,7 @@ struct unicode_map *utf8_load(const char *version)
 	if (version) {
 		unsigned int maj, min, rev;
 
-		if (utf8_parse_version(version, &maj, &min, &rev) < 0)
+		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
 			return ERR_PTR(-EINVAL);
 
 		if (!utf8version_is_supported(maj, min, rev))
@@ -228,12 +227,12 @@ struct unicode_map *utf8_load(const char *version)
 
 	return um;
 }
-EXPORT_SYMBOL(utf8_load);
+EXPORT_SYMBOL(unicode_load);
 
-void utf8_unload(struct unicode_map *um)
+void unicode_unload(struct unicode_map *um)
 {
 	kfree(um);
 }
-EXPORT_SYMBOL(utf8_unload);
+EXPORT_SYMBOL(unicode_unload);
 
 MODULE_LICENSE("GPL v2");
diff --git a/fs/unicode/utf8-selftest.c b/fs/unicode/utf8-selftest.c
index 6fe8af7ed..796c1ed92 100644
--- a/fs/unicode/utf8-selftest.c
+++ b/fs/unicode/utf8-selftest.c
@@ -235,7 +235,7 @@ static void check_utf8_nfdicf(void)
 static void check_utf8_comparisons(void)
 {
 	int i;
-	struct unicode_map *table = utf8_load("12.1.0");
+	struct unicode_map *table = unicode_load("12.1.0");
 
 	if (IS_ERR(table)) {
 		pr_err("%s: Unable to load utf8 %d.%d.%d. Skipping.\n",
@@ -249,7 +249,7 @@ static void check_utf8_comparisons(void)
 		const struct qstr s2 = {.name = nfdi_test_data[i].dec,
 					.len = sizeof(nfdi_test_data[i].dec)};
 
-		test_f(!utf8_strncmp(table, &s1, &s2),
+		test_f(!unicode_strncmp(table, &s1, &s2),
 		       "%s %s comparison mismatch\n", s1.name, s2.name);
 	}
 
@@ -259,11 +259,11 @@ static void check_utf8_comparisons(void)
 		const struct qstr s2 = {.name = nfdicf_test_data[i].ncf,
 					.len = sizeof(nfdicf_test_data[i].ncf)};
 
-		test_f(!utf8_strncasecmp(table, &s1, &s2),
+		test_f(!unicode_strncasecmp(table, &s1, &s2),
 		       "%s %s comparison mismatch\n", s1.name, s2.name);
 	}
 
-	utf8_unload(table);
+	unicode_unload(table);
 }
 
 static void check_supported_versions(void)
diff --git a/include/linux/unicode.h b/include/linux/unicode.h
index 74484d44c..de23f9ee7 100644
--- a/include/linux/unicode.h
+++ b/include/linux/unicode.h
@@ -10,27 +10,27 @@ struct unicode_map {
 	int version;
 };
 
-int utf8_validate(const struct unicode_map *um, const struct qstr *str);
+int unicode_validate(const struct unicode_map *um, const struct qstr *str);
 
-int utf8_strncmp(const struct unicode_map *um,
-		 const struct qstr *s1, const struct qstr *s2);
+int unicode_strncmp(const struct unicode_map *um,
+		    const struct qstr *s1, const struct qstr *s2);
 
-int utf8_strncasecmp(const struct unicode_map *um,
-		 const struct qstr *s1, const struct qstr *s2);
-int utf8_strncasecmp_folded(const struct unicode_map *um,
-			    const struct qstr *cf,
-			    const struct qstr *s1);
+int unicode_strncasecmp(const struct unicode_map *um,
+			const struct qstr *s1, const struct qstr *s2);
+int unicode_strncasecmp_folded(const struct unicode_map *um,
+			       const struct qstr *cf,
+			       const struct qstr *s1);
 
-int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
-		   unsigned char *dest, size_t dlen);
+int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
+		      unsigned char *dest, size_t dlen);
 
-int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
-		  unsigned char *dest, size_t dlen);
+int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
+		     unsigned char *dest, size_t dlen);
 
-int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
-		       struct qstr *str);
+int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
+			  struct qstr *str);
 
-struct unicode_map *utf8_load(const char *version);
-void utf8_unload(struct unicode_map *um);
+struct unicode_map *unicode_load(const char *version);
+void unicode_unload(struct unicode_map *um);
 
 #endif /* _LINUX_UNICODE_H */
-- 
2.24.3 (Apple Git-128)



* [PATCH v3 4/5] fs: unicode: Rename utf8-core file to unicode-core
  2021-03-23 18:31 ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 18:32   ` Shreeya Patel
  -1 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 18:32 UTC (permalink / raw)
  To: tytso, adilger.kernel, jaegeuk, chao, krisman, ebiggers, drosen,
	ebiggers, yuchao0
  Cc: linux-ext4, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	kernel, andre.almeida

Rename the file from utf8-core to unicode-core as part of transforming
it into the unicode subsystem layer file, and to make its role clearer.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
---
 fs/unicode/Makefile                        | 2 +-
 fs/unicode/{utf8-core.c => unicode-core.c} | 0
 2 files changed, 1 insertion(+), 1 deletion(-)
 rename fs/unicode/{utf8-core.c => unicode-core.c} (100%)

diff --git a/fs/unicode/Makefile b/fs/unicode/Makefile
index b88aecc86..fbf9a629e 100644
--- a/fs/unicode/Makefile
+++ b/fs/unicode/Makefile
@@ -3,7 +3,7 @@
 obj-$(CONFIG_UNICODE) += unicode.o
 obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
 
-unicode-y := utf8-norm.o utf8-core.o
+unicode-y := utf8-norm.o unicode-core.o
 
 $(obj)/utf8-norm.o: $(obj)/utf8data.h
 
diff --git a/fs/unicode/utf8-core.c b/fs/unicode/unicode-core.c
similarity index 100%
rename from fs/unicode/utf8-core.c
rename to fs/unicode/unicode-core.c
-- 
2.24.3 (Apple Git-128)


* [PATCH v3 5/5] fs: unicode: Add utf8 module and a unicode layer
  2021-03-23 18:31 ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 18:32   ` Shreeya Patel
  -1 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 18:32 UTC (permalink / raw)
  To: tytso, adilger.kernel, jaegeuk, chao, krisman, ebiggers, drosen,
	ebiggers, yuchao0
  Cc: linux-ext4, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	kernel, andre.almeida

utf8data.h_shipped has a large database table which is an auto-generated
decodification trie for the unicode normalization functions.
It is not necessary to load this large table in the kernel if no
file system is using it, hence make UTF-8 encoding loadable by converting
it into a module.
Modify the unicode-core file so that it acts as a layer for the unicode
subsystem. It loads the UTF-8 module and accesses its functions
whenever any filesystem that needs unicode is mounted.
Also, indirect calls through function pointers can be exploited by
speculative execution attacks, hence use static_call() in unicode.h and
unicode-core.c in order to prevent these attacks by making direct
calls, which also performs better than calling through function pointers.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
---

Changes in v3
  - Correct the conditions to prevent NULL pointer dereference while
    accessing functions via utf8_ops variable.
  - Add spinlock to avoid race conditions that could occur if the module
    is deregistered after checking utf8_ops and before doing the
    try_module_get() in the following if condition:
    if (!utf8_ops || !try_module_get(utf8_ops->owner))
  - Use static_call() for preventing speculative execution attacks.
  - WARN_ON in case utf8_ops is NULL in unicode_unload().
  - Rename module file from utf8mod to unicode-utf8.
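
The race that the spinlock closes can be modelled in userspace. The sketch
below is only an illustration under stated assumptions, not the code from
this patch: fake_module, fake_ops, fake_try_get() and the demo_*() helpers
are hypothetical stand-ins for struct module, struct unicode_ops and
try_module_get(), and a C11 atomic_flag stands in for the kernel spinlock.
The point it shows is that the NULL check and the reference grab happen
under one lock, so an unregister cannot slip in between them.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module: a liveness flag and a count. */
struct fake_module {
	bool live;
	int refcount;
};

/* Hypothetical stand-in for struct unicode_ops. */
struct fake_ops {
	struct fake_module *owner;
};

/* Userspace stand-in for the spinlock protecting the ops pointer. */
static atomic_flag ops_lock = ATOMIC_FLAG_INIT;
static struct fake_ops *registered_ops;

static void ops_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&ops_lock, memory_order_acquire))
		; /* spin until the holder clears the flag */
}

static void ops_lock_release(void)
{
	atomic_flag_clear_explicit(&ops_lock, memory_order_release);
}

/* Models try_module_get(): fails once the module is no longer live. */
static bool fake_try_get(struct fake_module *mod)
{
	if (!mod->live)
		return false;
	mod->refcount++;
	return true;
}

/* Check the pointer and pin its owner as one step under the lock. */
static bool demo_pin_ops(void)
{
	bool pinned;

	ops_lock_acquire();
	pinned = registered_ops && fake_try_get(registered_ops->owner);
	ops_lock_release();
	return pinned;
}

static void demo_register(struct fake_ops *ops)
{
	ops_lock_acquire();
	registered_ops = ops;
	ops_lock_release();
}

static void demo_unregister(void)
{
	ops_lock_acquire();
	registered_ops = NULL;
	ops_lock_release();
}

/* One module instance and its ops table, used for the walk-through. */
static struct fake_module demo_module = { .live = true, .refcount = 0 };
static struct fake_ops demo_ops = { .owner = &demo_module };
```

Pinning fails before registration and again after unregistration, and the
reference count only changes while the ops pointer is set.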

Changes in v2
  - Remove the duplicate file utf8-core.c
  - Make the wrapper functions inline.
  - Remove msleep and use try_module_get() and module_put()
    for ensuring that module is loaded correctly and also
    doesn't get unloaded while in use.
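
For readers unfamiliar with the dispatch pattern used here, the following
userspace sketch may help. It is a simplified model under stated
assumptions, not the kernel implementation: validate_slot is a plain
function pointer standing in for a static call site (in the kernel,
static_call_update() retargets the call itself on architectures with
native support), and demo_ops, demo_register(), demo_validate() and the
ASCII-only validator are hypothetical names with simplified signatures.

```c
#include <stddef.h>

/* Hypothetical, simplified signature; the real ops take a
 * struct unicode_map and struct qstr arguments. */
typedef int (*validate_fn)(const char *name, size_t len);

/* Default target, used until an encoding registers itself. */
static int default_validate(const char *name, size_t len)
{
	(void)name;
	(void)len;
	return 0;
}

/* Stand-in for a static call site: starts out at the default stub. */
static validate_fn validate_slot = default_validate;

struct demo_ops {
	validate_fn validate;
};

/* Models registration: retarget the call site to the module's function. */
static void demo_register(const struct demo_ops *ops)
{
	validate_slot = ops->validate;
}

/* Wrapper a filesystem would call; it never touches the ops table. */
static int demo_validate(const char *name, size_t len)
{
	return validate_slot(name, len);
}

/* Toy "encoding" for the demo: rejects any byte with the high bit set. */
static int ascii_only_validate(const char *name, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if ((unsigned char)name[i] & 0x80)
			return -1;
	return 0;
}

static const struct demo_ops ascii_ops = { .validate = ascii_only_validate };
```

Until demo_register() runs, every call lands in the default stub and
returns 0; afterwards the same wrapper reaches the registered function.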

 fs/unicode/Kconfig        |  11 +-
 fs/unicode/Makefile       |   5 +-
 fs/unicode/unicode-core.c | 268 +++++++++++++-------------------------
 fs/unicode/unicode-utf8.c | 255 ++++++++++++++++++++++++++++++++++++
 include/linux/unicode.h   |  99 ++++++++++++--
 5 files changed, 441 insertions(+), 197 deletions(-)
 create mode 100644 fs/unicode/unicode-utf8.c

diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig
index 2c27b9a5c..2961b0206 100644
--- a/fs/unicode/Kconfig
+++ b/fs/unicode/Kconfig
@@ -8,7 +8,16 @@ config UNICODE
 	  Say Y here to enable UTF-8 NFD normalization and NFD+CF casefolding
 	  support.
 
+# UTF-8 encoding can be built as a module using the UNICODE_UTF8 option.
+# Building it as a module keeps the large database table from
+# utf8data.h_shipped out of the kernel image; the table is loaded
+# only when a filesystem requires it.
+config UNICODE_UTF8
+	tristate "UTF-8 module"
+	depends on UNICODE
+	default m
+
 config UNICODE_NORMALIZATION_SELFTEST
 	tristate "Test UTF-8 normalization support"
-	depends on UNICODE
+	depends on UNICODE_UTF8
 	default n
--- a/fs/unicode/Makefile
+++ b/fs/unicode/Makefile
@@ -1,11 +1,14 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_UNICODE) += unicode.o
+obj-$(CONFIG_UNICODE_UTF8) += utf8.o
 obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
 
-unicode-y := utf8-norm.o unicode-core.o
+unicode-y := unicode-core.o
+utf8-y := unicode-utf8.o utf8-norm.o
 
 $(obj)/utf8-norm.o: $(obj)/utf8data.h
+$(obj)/unicode-utf8.o: $(obj)/utf8-norm.o
 
 # In the normal build, the checked-in utf8data.h is just shipped.
 #
--- a/fs/unicode/unicode-core.c
+++ b/fs/unicode/unicode-core.c
@@ -1,238 +1,144 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/string.h>
 #include <linux/slab.h>
-#include <linux/parser.h>
 #include <linux/errno.h>
 #include <linux/unicode.h>
-#include <linux/stringhash.h>
+#include <linux/spinlock.h>
 
-#include "utf8n.h"
+DEFINE_SPINLOCK(utf8ops_lock);
 
-int unicode_validate(const struct unicode_map *um, const struct qstr *str)
-{
-	const struct utf8data *data = utf8nfdi(um->version);
-
-	if (utf8nlen(data, str->name, str->len) < 0)
-		return -1;
-	return 0;
-}
+struct unicode_ops *utf8_ops;
+EXPORT_SYMBOL(utf8_ops);
+
+int _utf8_validate(const struct unicode_map *um, const struct qstr *str)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_validate);
 
-int unicode_strncmp(const struct unicode_map *um,
-		    const struct qstr *s1, const struct qstr *s2)
-{
-	const struct utf8data *data = utf8nfdi(um->version);
-	struct utf8cursor cur1, cur2;
-	int c1, c2;
-
-	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
-		return -EINVAL;
-
-	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
-		return -EINVAL;
-
-	do {
-		c1 = utf8byte(&cur1);
-		c2 = utf8byte(&cur2);
-
-		if (c1 < 0 || c2 < 0)
-			return -EINVAL;
-		if (c1 != c2)
-			return 1;
-	} while (c1);
-
-	return 0;
-}
+int _utf8_strncmp(const struct unicode_map *um, const struct qstr *s1,
+		  const struct qstr *s2)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_strncmp);
 
-int unicode_strncasecmp(const struct unicode_map *um,
-			const struct qstr *s1, const struct qstr *s2)
-{
-	const struct utf8data *data = utf8nfdicf(um->version);
-	struct utf8cursor cur1, cur2;
-	int c1, c2;
-
-	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
-		return -EINVAL;
-
-	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
-		return -EINVAL;
-
-	do {
-		c1 = utf8byte(&cur1);
-		c2 = utf8byte(&cur2);
-
-		if (c1 < 0 || c2 < 0)
-			return -EINVAL;
-		if (c1 != c2)
-			return 1;
-	} while (c1);
-
-	return 0;
-}
+int _utf8_strncasecmp(const struct unicode_map *um, const struct qstr *s1,
+		      const struct qstr *s2)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_strncasecmp);
 
-/* String cf is expected to be a valid UTF-8 casefolded
- * string.
- */
-int unicode_strncasecmp_folded(const struct unicode_map *um,
-			       const struct qstr *cf,
-			       const struct qstr *s1)
-{
-	const struct utf8data *data = utf8nfdicf(um->version);
-	struct utf8cursor cur1;
-	int c1, c2;
-	int i = 0;
-
-	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
-		return -EINVAL;
-
-	do {
-		c1 = utf8byte(&cur1);
-		c2 = cf->name[i++];
-		if (c1 < 0)
-			return -EINVAL;
-		if (c1 != c2)
-			return 1;
-	} while (c1);
-
-	return 0;
-}
+int _utf8_strncasecmp_folded(const struct unicode_map *um,
+			     const struct qstr *cf, const struct qstr *s1)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_strncasecmp_folded);
 
-int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
-		     unsigned char *dest, size_t dlen)
-{
-	const struct utf8data *data = utf8nfdicf(um->version);
-	struct utf8cursor cur;
-	size_t nlen = 0;
-
-	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
-		return -EINVAL;
-
-	for (nlen = 0; nlen < dlen; nlen++) {
-		int c = utf8byte(&cur);
-
-		dest[nlen] = c;
-		if (!c)
-			return nlen;
-		if (c == -1)
-			break;
-	}
-	return -EINVAL;
-}
+int _utf8_normalize(const struct unicode_map *um, const struct qstr *str,
+		    unsigned char *dest, size_t dlen)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_casefold);
 
-int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
-			  struct qstr *str)
-{
-	const struct utf8data *data = utf8nfdicf(um->version);
-	struct utf8cursor cur;
-	int c;
-	unsigned long hash = init_name_hash(salt);
-
-	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
-		return -EINVAL;
-
-	while ((c = utf8byte(&cur))) {
-		if (c < 0)
-			return -EINVAL;
-		hash = partial_name_hash((unsigned char)c, hash);
-	}
-	str->hash = end_name_hash(hash);
-	return 0;
-}
+int _utf8_casefold(const struct unicode_map *um, const struct qstr *str,
+		   unsigned char *dest, size_t dlen)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_casefold_hash);
 
-int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
-		      unsigned char *dest, size_t dlen)
-{
-	const struct utf8data *data = utf8nfdi(um->version);
-	struct utf8cursor cur;
-	ssize_t nlen = 0;
-
-	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
-		return -EINVAL;
-
-	for (nlen = 0; nlen < dlen; nlen++) {
-		int c = utf8byte(&cur);
-
-		dest[nlen] = c;
-		if (!c)
-			return nlen;
-		if (c == -1)
-			break;
-	}
-	return -EINVAL;
-}
+int _utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+			struct qstr *str)
+{
+	return 0;
+}
+
+struct unicode_map *_utf8_load(const char *version)
+{
+	return NULL;
+}
-EXPORT_SYMBOL(unicode_normalize);
 
-static int unicode_parse_version(const char *version, unsigned int *maj,
-				 unsigned int *min, unsigned int *rev)
-{
-	substring_t args[3];
-	char version_string[12];
-	static const struct match_token token[] = {
-		{1, "%d.%d.%d"},
-		{0, NULL}
-	};
-
-	int ret = strscpy(version_string, version, sizeof(version_string));
-
-	if (ret < 0)
-		return ret;
-
-	if (match_token(version_string, token, args) != 1)
-		return -EINVAL;
-
-	if (match_int(&args[0], maj) || match_int(&args[1], min) ||
-	    match_int(&args[2], rev))
-		return -EINVAL;
-
-	return 0;
-}
+void _utf8_unload(struct unicode_map *um)
+{
+	return;
+}
+
+DEFINE_STATIC_CALL(utf8_validate, _utf8_validate);
+DEFINE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
+DEFINE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
+DEFINE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
+DEFINE_STATIC_CALL(utf8_normalize, _utf8_normalize);
+DEFINE_STATIC_CALL(utf8_casefold, _utf8_casefold);
+DEFINE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
+DEFINE_STATIC_CALL(utf8_load, _utf8_load);
+DEFINE_STATIC_CALL_NULL(utf8_unload, _utf8_unload);
+EXPORT_STATIC_CALL(utf8_strncmp);
+EXPORT_STATIC_CALL(utf8_strncasecmp);
+EXPORT_STATIC_CALL(utf8_strncasecmp_folded);
+
+static int unicode_load_module(void)
+{
+	int ret = request_module("utf8");
+
+	if (ret) {
+		pr_err("Failed to load UTF-8 module\n");
+		return ret;
+	}
+	return 0;
+}
 
 struct unicode_map *unicode_load(const char *version)
-{
-	struct unicode_map *um = NULL;
-	int unicode_version;
-
-	if (version) {
-		unsigned int maj, min, rev;
-
-		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
-			return ERR_PTR(-EINVAL);
-
-		if (!utf8version_is_supported(maj, min, rev))
-			return ERR_PTR(-EINVAL);
-
-		unicode_version = UNICODE_AGE(maj, min, rev);
-	} else {
-		unicode_version = utf8version_latest();
-		printk(KERN_WARNING"UTF-8 version not specified. "
-		       "Assuming latest supported version (%d.%d.%d).",
-		       (unicode_version >> 16) & 0xff,
-		       (unicode_version >> 8) & 0xff,
-		       (unicode_version & 0xff));
-	}
-
-	um = kzalloc(sizeof(struct unicode_map), GFP_KERNEL);
-	if (!um)
-		return ERR_PTR(-ENOMEM);
-
-	um->charset = "UTF-8";
-	um->version = unicode_version;
-
-	return um;
-}
+{
+	int ret = unicode_load_module();
+
+	if (ret)
+		return ERR_PTR(ret);
+
+	spin_lock(&utf8ops_lock);
+	if (!utf8_ops || !try_module_get(utf8_ops->owner)) {
+		spin_unlock(&utf8ops_lock);
+		return ERR_PTR(-ENODEV);
+	} else {
+		spin_unlock(&utf8ops_lock);
+		return static_call(utf8_load)(version);
+	}
+}
 EXPORT_SYMBOL(unicode_load);
 
 void unicode_unload(struct unicode_map *um)
 {
-	kfree(um);
+	if (WARN_ON(!utf8_ops))
+		return;
+
+	module_put(utf8_ops->owner);
+	static_call(utf8_unload)(um);
 }
 EXPORT_SYMBOL(unicode_unload);
 
+void unicode_register(struct unicode_ops *ops)
+{
+	spin_lock(&utf8ops_lock);
+	utf8_ops = ops;
+
+	static_call_update(utf8_validate, utf8_ops->validate);
+	static_call_update(utf8_strncmp, utf8_ops->strncmp);
+	static_call_update(utf8_strncasecmp, utf8_ops->strncasecmp);
+	static_call_update(utf8_strncasecmp_folded, utf8_ops->strncasecmp_folded);
+	static_call_update(utf8_normalize, utf8_ops->normalize);
+	static_call_update(utf8_casefold, utf8_ops->casefold);
+	static_call_update(utf8_casefold_hash, utf8_ops->casefold_hash);
+	static_call_update(utf8_load, utf8_ops->load);
+	static_call_update(utf8_unload, utf8_ops->unload);
+
+	spin_unlock(&utf8ops_lock);
+}
+EXPORT_SYMBOL(unicode_register);
+
+void unicode_unregister(void)
+{
+	spin_lock(&utf8ops_lock);
+	utf8_ops = NULL;
+	spin_unlock(&utf8ops_lock);
+}
+EXPORT_SYMBOL(unicode_unregister);
+
 MODULE_LICENSE("GPL v2");
diff --git a/fs/unicode/unicode-utf8.c b/fs/unicode/unicode-utf8.c
new file mode 100644
index 000000000..770e60696
--- /dev/null
+++ b/fs/unicode/unicode-utf8.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/parser.h>
+#include <linux/errno.h>
+#include <linux/unicode.h>
+#include <linux/stringhash.h>
+
+#include "utf8n.h"
+
+static int utf8_validate(const struct unicode_map *um, const struct qstr *str)
+{
+	const struct utf8data *data = utf8nfdi(um->version);
+
+	if (utf8nlen(data, str->name, str->len) < 0)
+		return -1;
+	return 0;
+}
+
+static int utf8_strncmp(const struct unicode_map *um,
+			const struct qstr *s1, const struct qstr *s2)
+{
+	const struct utf8data *data = utf8nfdi(um->version);
+	struct utf8cursor cur1, cur2;
+	int c1, c2;
+
+	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
+		return -EINVAL;
+
+	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
+		return -EINVAL;
+
+	do {
+		c1 = utf8byte(&cur1);
+		c2 = utf8byte(&cur2);
+
+		if (c1 < 0 || c2 < 0)
+			return -EINVAL;
+		if (c1 != c2)
+			return 1;
+	} while (c1);
+
+	return 0;
+}
+
+static int utf8_strncasecmp(const struct unicode_map *um,
+			    const struct qstr *s1, const struct qstr *s2)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur1, cur2;
+	int c1, c2;
+
+	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
+		return -EINVAL;
+
+	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
+		return -EINVAL;
+
+	do {
+		c1 = utf8byte(&cur1);
+		c2 = utf8byte(&cur2);
+
+		if (c1 < 0 || c2 < 0)
+			return -EINVAL;
+		if (c1 != c2)
+			return 1;
+	} while (c1);
+
+	return 0;
+}
+
+/* String cf is expected to be a valid UTF-8 casefolded
+ * string.
+ */
+static int utf8_strncasecmp_folded(const struct unicode_map *um,
+				   const struct qstr *cf,
+				   const struct qstr *s1)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur1;
+	int c1, c2;
+	int i = 0;
+
+	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
+		return -EINVAL;
+
+	do {
+		c1 = utf8byte(&cur1);
+		c2 = cf->name[i++];
+		if (c1 < 0)
+			return -EINVAL;
+		if (c1 != c2)
+			return 1;
+	} while (c1);
+
+	return 0;
+}
+
+static int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
+			 unsigned char *dest, size_t dlen)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur;
+	size_t nlen = 0;
+
+	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
+		return -EINVAL;
+
+	for (nlen = 0; nlen < dlen; nlen++) {
+		int c = utf8byte(&cur);
+
+		dest[nlen] = c;
+		if (!c)
+			return nlen;
+		if (c == -1)
+			break;
+	}
+	return -EINVAL;
+}
+
+static int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+			      struct qstr *str)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur;
+	int c;
+	unsigned long hash = init_name_hash(salt);
+
+	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
+		return -EINVAL;
+
+	while ((c = utf8byte(&cur))) {
+		if (c < 0)
+			return -EINVAL;
+		hash = partial_name_hash((unsigned char)c, hash);
+	}
+	str->hash = end_name_hash(hash);
+	return 0;
+}
+
+static int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
+			  unsigned char *dest, size_t dlen)
+{
+	const struct utf8data *data = utf8nfdi(um->version);
+	struct utf8cursor cur;
+	ssize_t nlen = 0;
+
+	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
+		return -EINVAL;
+
+	for (nlen = 0; nlen < dlen; nlen++) {
+		int c = utf8byte(&cur);
+
+		dest[nlen] = c;
+		if (!c)
+			return nlen;
+		if (c == -1)
+			break;
+	}
+	return -EINVAL;
+}
+
+static int utf8_parse_version(const char *version, unsigned int *maj,
+			      unsigned int *min, unsigned int *rev)
+{
+	substring_t args[3];
+	char version_string[12];
+	static const struct match_token token[] = {
+		{1, "%d.%d.%d"},
+		{0, NULL}
+	};
+
+	int ret = strscpy(version_string, version, sizeof(version_string));
+
+	if (ret < 0)
+		return ret;
+
+	if (match_token(version_string, token, args) != 1)
+		return -EINVAL;
+
+	if (match_int(&args[0], maj) || match_int(&args[1], min) ||
+	    match_int(&args[2], rev))
+		return -EINVAL;
+
+	return 0;
+}
+
+static struct unicode_map *utf8_load(const char *version)
+{
+	struct unicode_map *um = NULL;
+	int unicode_version;
+
+	if (version) {
+		unsigned int maj, min, rev;
+
+		if (utf8_parse_version(version, &maj, &min, &rev) < 0)
+			return ERR_PTR(-EINVAL);
+
+		if (!utf8version_is_supported(maj, min, rev))
+			return ERR_PTR(-EINVAL);
+
+		unicode_version = UNICODE_AGE(maj, min, rev);
+	} else {
+		unicode_version = utf8version_latest();
+		pr_warn("UTF-8 version not specified. Assuming latest supported version (%d.%d.%d).",
+			(unicode_version >> 16) & 0xff,
+			(unicode_version >> 8) & 0xff,
+			(unicode_version & 0xff));
+	}
+
+	um = kzalloc(sizeof(*um), GFP_KERNEL);
+	if (!um)
+		return ERR_PTR(-ENOMEM);
+
+	um->charset = "UTF-8";
+	um->version = unicode_version;
+
+	return um;
+}
+
+void utf8_unload(struct unicode_map *um)
+{
+	kfree(um);
+}
+
+static struct unicode_ops ops = {
+	.owner = THIS_MODULE,
+	.validate = utf8_validate,
+	.strncmp = utf8_strncmp,
+	.strncasecmp = utf8_strncasecmp,
+	.strncasecmp_folded = utf8_strncasecmp_folded,
+	.casefold = utf8_casefold,
+	.casefold_hash = utf8_casefold_hash,
+	.normalize = utf8_normalize,
+	.load = utf8_load,
+	.unload = utf8_unload,
+};
+
+static int __init utf8_init(void)
+{
+	unicode_register(&ops);
+	return 0;
+}
+
+static void __exit utf8_exit(void)
+{
+	unicode_unregister();
+}
+
+module_init(utf8_init);
+module_exit(utf8_exit);
+
+MODULE_LICENSE("GPL v2");
--- a/include/linux/unicode.h
+++ b/include/linux/unicode.h
@@ -4,33 +4,104 @@
 
 #include <linux/init.h>
 #include <linux/dcache.h>
+#include <linux/static_call.h>
+
 
 struct unicode_map {
 	const char *charset;
 	int version;
 };
 
-int unicode_validate(const struct unicode_map *um, const struct qstr *str);
+struct unicode_ops {
+	struct module *owner;
+	int (*validate)(const struct unicode_map *um, const struct qstr *str);
+	int (*strncmp)(const struct unicode_map *um, const struct qstr *s1,
+		       const struct qstr *s2);
+	int (*strncasecmp)(const struct unicode_map *um, const struct qstr *s1,
+			   const struct qstr *s2);
+	int (*strncasecmp_folded)(const struct unicode_map *um, const struct qstr *cf,
+				  const struct qstr *s1);
+	int (*normalize)(const struct unicode_map *um, const struct qstr *str,
+			 unsigned char *dest, size_t dlen);
+	int (*casefold)(const struct unicode_map *um, const struct qstr *str,
+			unsigned char *dest, size_t dlen);
+	int (*casefold_hash)(const struct unicode_map *um, const void *salt,
+			     struct qstr *str);
+	struct unicode_map* (*load)(const char *version);
+	void (*unload)(struct unicode_map *um);
+};
 
-int unicode_strncmp(const struct unicode_map *um,
-		    const struct qstr *s1, const struct qstr *s2);
+extern struct unicode_ops *utf8_ops;
 
-int unicode_strncasecmp(const struct unicode_map *um,
-			const struct qstr *s1, const struct qstr *s2);
-int unicode_strncasecmp_folded(const struct unicode_map *um,
-			       const struct qstr *cf,
-			       const struct qstr *s1);
+int _utf8_validate(const struct unicode_map *um, const struct qstr *str);
+int _utf8_strncmp(const struct unicode_map *um, const struct qstr *s1,
+		  const struct qstr *s2);
+int _utf8_strncasecmp(const struct unicode_map *um, const struct qstr *s1,
+		      const struct qstr *s2);
+int _utf8_strncasecmp_folded(const struct unicode_map *um,
+			     const struct qstr *cf,
+			     const struct qstr *s1);
+int _utf8_normalize(const struct unicode_map *um, const struct qstr *str,
+		    unsigned char *dest, size_t dlen);
+int _utf8_casefold(const struct unicode_map *um, const struct qstr *str,
+		   unsigned char *dest, size_t dlen);
+int _utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+			struct qstr *str);
 
-int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
-		      unsigned char *dest, size_t dlen);
+DECLARE_STATIC_CALL(utf8_validate, _utf8_validate);
+DECLARE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
+DECLARE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
+DECLARE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
+DECLARE_STATIC_CALL(utf8_normalize, _utf8_normalize);
+DECLARE_STATIC_CALL(utf8_casefold, _utf8_casefold);
+DECLARE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
 
-int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
-		     unsigned char *dest, size_t dlen);
+static inline int unicode_validate(const struct unicode_map *um, const struct qstr *str)
+{
+	return static_call(utf8_validate)(um, str);
+}
 
-int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
-			  struct qstr *str);
+static inline int unicode_strncmp(const struct unicode_map *um,
+				  const struct qstr *s1, const struct qstr *s2)
+{
+	return static_call(utf8_strncmp)(um, s1, s2);
+}
+
+static inline int unicode_strncasecmp(const struct unicode_map *um,
+				      const struct qstr *s1, const struct qstr *s2)
+{
+	return static_call(utf8_strncasecmp)(um, s1, s2);
+}
+
+static inline int unicode_strncasecmp_folded(const struct unicode_map *um,
+					     const struct qstr *cf,
+					     const struct qstr *s1)
+{
+	return static_call(utf8_strncasecmp_folded)(um, cf, s1);
+}
+
+static inline int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
+				    unsigned char *dest, size_t dlen)
+{
+	return static_call(utf8_normalize)(um, str, dest, dlen);
+}
+
+static inline int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
+				   unsigned char *dest, size_t dlen)
+{
+	return static_call(utf8_casefold)(um, str, dest, dlen);
+}
+
+static inline int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
+					struct qstr *str)
+{
+	return static_call(utf8_casefold_hash)(um, salt, str);
+}
 
 struct unicode_map *unicode_load(const char *version);
 void unicode_unload(struct unicode_map *um);
 
+void unicode_register(struct unicode_ops *ops);
+void unicode_unregister(void);
+
 #endif /* _LINUX_UNICODE_H */
-- 
2.24.3 (Apple Git-128)


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [f2fs-dev] [PATCH v3 5/5] fs: unicode: Add utf8 module and a unicode layer
@ 2021-03-23 18:32   ` Shreeya Patel
  0 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 18:32 UTC (permalink / raw)
  To: tytso, adilger.kernel, jaegeuk, chao, krisman, ebiggers, drosen,
	ebiggers, yuchao0
  Cc: kernel, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	andre.almeida, linux-ext4

utf8data.h_shipped contains a large database table, an auto-generated
decoding trie used by the unicode normalization functions.
There is no need to keep this large table in the kernel if no
file system is using it, hence make the UTF-8 encoding loadable by
converting it into a module.
Modify unicode-core.c so that it acts as a layer for the unicode
subsystem. It will load the UTF-8 module and access its functions
whenever any filesystem that needs unicode is mounted.
Also, indirect calls through function pointers are easily exploitable by
speculative execution attacks, hence use static_call() in unicode.h and
unicode-core.c in order to prevent these attacks by making direct
calls, which also avoids the overhead of indirect calls.

Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
---

Changes in v3
  - Correct the conditions to prevent NULL pointer dereference while
    accessing functions via utf8_ops variable.
  - Add spinlock to avoid race conditions that could occur if the module
    is deregistered after checking utf8_ops and before doing the
    try_module_get() in the following if condition
    if (!utf8_ops || !try_module_get(utf8_ops->owner))
  - Use static_call() for preventing speculative execution attacks.
  - WARN_ON in case utf8_ops is NULL in unicode_unload().
  - Rename module file from utf8mod to unicode-utf8.

Changes in v2
  - Remove the duplicate file utf8-core.c
  - Make the wrapper functions inline.
  - Remove msleep and use try_module_get() and module_put()
    for ensuring that module is loaded correctly and also
    doesn't get unloaded while in use.
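
For readers unfamiliar with the pattern, the registration layer described
above can be approximated in plain userspace C. This is only a minimal
sketch under stated assumptions: every name in it (enc_ops, enc_register,
enc_load, ascii_validate, ...) is hypothetical, an ordinary function
pointer stands in for static_call(), an integer counter stands in for
try_module_get()/module_put(), and there is no locking. Unlike
unicode_unregister() in this patch, the sketch also points the call back
at the stub on unregister.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct unicode_ops. */
struct enc_ops {
	int refcount;	/* stands in for the module reference count */
	int (*validate)(const char *s, size_t len);
};

/* Default target while no backend is registered (compare _utf8_validate). */
static int stub_validate(const char *s, size_t len)
{
	(void)s;
	(void)len;
	return 0;
}

static struct enc_ops *cur_ops;	/* compare utf8_ops */
static int (*validate_call)(const char *, size_t) = stub_validate;

/* Backend registers itself, as the module's init function would. */
static void enc_register(struct enc_ops *ops)
{
	cur_ops = ops;
	validate_call = ops->validate;
}

/* Sketch-only choice: fall back to the stub when the backend goes away. */
static void enc_unregister(void)
{
	cur_ops = NULL;
	validate_call = stub_validate;
}

/* Take a reference on the backend, or fail if none is registered. */
static int enc_load(void)
{
	if (!cur_ops)
		return -ENODEV;
	cur_ops->refcount++;
	return 0;
}

/* Drop the reference taken by enc_load(). */
static void enc_unload(void)
{
	if (cur_ops && cur_ops->refcount > 0)
		cur_ops->refcount--;
}

/* Wrapper used by callers; dispatches to whatever is currently installed. */
static int enc_validate(const char *s, size_t len)
{
	return validate_call(s, len);
}

/* Toy backend: reject any byte with the high bit set. */
static int ascii_validate(const char *s, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		if ((unsigned char)s[i] & 0x80)
			return -1;
	return 0;
}

static struct enc_ops ascii_ops = { .refcount = 0, .validate = ascii_validate };
```

Before enc_register(&ascii_ops) is called, enc_load() fails with -ENODEV
and enc_validate() silently accepts everything via the stub; afterwards
the wrapper reaches the backend. The silent stub is a property of the
default targets and is worth keeping in mind when reviewing the wrappers.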

 fs/unicode/Kconfig        |  11 +-
 fs/unicode/Makefile       |   5 +-
 fs/unicode/unicode-core.c | 268 +++++++++++++-------------------------
 fs/unicode/unicode-utf8.c | 255 ++++++++++++++++++++++++++++++++++++
 include/linux/unicode.h   |  99 ++++++++++++--
 5 files changed, 441 insertions(+), 197 deletions(-)
 create mode 100644 fs/unicode/unicode-utf8.c

diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig
index 2c27b9a5c..2961b0206 100644
--- a/fs/unicode/Kconfig
+++ b/fs/unicode/Kconfig
@@ -8,7 +8,16 @@ config UNICODE
 	  Say Y here to enable UTF-8 NFD normalization and NFD+CF casefolding
 	  support.
 
+# UTF-8 encoding can be compiled as a module using the UNICODE_UTF8 option.
+# Building the UTF-8 encoding as a module avoids carrying the large
+# database table from utf8data.h_shipped in the kernel image; the table
+# is loaded only when a filesystem requires it.
+config UNICODE_UTF8
+	tristate "UTF-8 module"
+	depends on UNICODE
+	default m
+
 config UNICODE_NORMALIZATION_SELFTEST
 	tristate "Test UTF-8 normalization support"
-	depends on UNICODE
+	depends on UNICODE_UTF8
 	default n
--- a/fs/unicode/Makefile
+++ b/fs/unicode/Makefile
@@ -1,11 +1,14 @@
 # SPDX-License-Identifier: GPL-2.0
 
 obj-$(CONFIG_UNICODE) += unicode.o
+obj-$(CONFIG_UNICODE_UTF8) += utf8.o
 obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
 
-unicode-y := utf8-norm.o unicode-core.o
+unicode-y := unicode-core.o
+utf8-y := unicode-utf8.o utf8-norm.o
 
 $(obj)/utf8-norm.o: $(obj)/utf8data.h
+$(obj)/unicode-utf8.o: $(obj)/utf8-norm.o
 
 # In the normal build, the checked-in utf8data.h is just shipped.
 #
--- a/fs/unicode/unicode-core.c
+++ b/fs/unicode/unicode-core.c
@@ -1,238 +1,144 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #include <linux/module.h>
 #include <linux/kernel.h>
-#include <linux/string.h>
 #include <linux/slab.h>
-#include <linux/parser.h>
 #include <linux/errno.h>
 #include <linux/unicode.h>
-#include <linux/stringhash.h>
+#include <linux/spinlock.h>
 
-#include "utf8n.h"
+DEFINE_SPINLOCK(utf8ops_lock);
 
-int unicode_validate(const struct unicode_map *um, const struct qstr *str)
-{
-	const struct utf8data *data = utf8nfdi(um->version);
-
-	if (utf8nlen(data, str->name, str->len) < 0)
-		return -1;
-	return 0;
-}
+struct unicode_ops *utf8_ops;
+EXPORT_SYMBOL(utf8_ops);
+
+int _utf8_validate(const struct unicode_map *um, const struct qstr *str)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_validate);
 
-int unicode_strncmp(const struct unicode_map *um,
-		    const struct qstr *s1, const struct qstr *s2)
-{
-	const struct utf8data *data = utf8nfdi(um->version);
-	struct utf8cursor cur1, cur2;
-	int c1, c2;
-
-	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
-		return -EINVAL;
-
-	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
-		return -EINVAL;
-
-	do {
-		c1 = utf8byte(&cur1);
-		c2 = utf8byte(&cur2);
-
-		if (c1 < 0 || c2 < 0)
-			return -EINVAL;
-		if (c1 != c2)
-			return 1;
-	} while (c1);
-
-	return 0;
-}
+int _utf8_strncmp(const struct unicode_map *um, const struct qstr *s1,
+		  const struct qstr *s2)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_strncmp);
 
-int unicode_strncasecmp(const struct unicode_map *um,
-			const struct qstr *s1, const struct qstr *s2)
-{
-	const struct utf8data *data = utf8nfdicf(um->version);
-	struct utf8cursor cur1, cur2;
-	int c1, c2;
-
-	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
-		return -EINVAL;
-
-	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
-		return -EINVAL;
-
-	do {
-		c1 = utf8byte(&cur1);
-		c2 = utf8byte(&cur2);
-
-		if (c1 < 0 || c2 < 0)
-			return -EINVAL;
-		if (c1 != c2)
-			return 1;
-	} while (c1);
-
-	return 0;
-}
+int _utf8_strncasecmp(const struct unicode_map *um, const struct qstr *s1,
+		      const struct qstr *s2)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_strncasecmp);
 
-/* String cf is expected to be a valid UTF-8 casefolded
- * string.
- */
-int unicode_strncasecmp_folded(const struct unicode_map *um,
-			       const struct qstr *cf,
-			       const struct qstr *s1)
-{
-	const struct utf8data *data = utf8nfdicf(um->version);
-	struct utf8cursor cur1;
-	int c1, c2;
-	int i = 0;
-
-	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
-		return -EINVAL;
-
-	do {
-		c1 = utf8byte(&cur1);
-		c2 = cf->name[i++];
-		if (c1 < 0)
-			return -EINVAL;
-		if (c1 != c2)
-			return 1;
-	} while (c1);
-
-	return 0;
-}
+int _utf8_strncasecmp_folded(const struct unicode_map *um,
+			     const struct qstr *cf, const struct qstr *s1)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_strncasecmp_folded);
 
-int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
-		     unsigned char *dest, size_t dlen)
-{
-	const struct utf8data *data = utf8nfdicf(um->version);
-	struct utf8cursor cur;
-	size_t nlen = 0;
-
-	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
-		return -EINVAL;
-
-	for (nlen = 0; nlen < dlen; nlen++) {
-		int c = utf8byte(&cur);
-
-		dest[nlen] = c;
-		if (!c)
-			return nlen;
-		if (c == -1)
-			break;
-	}
-	return -EINVAL;
-}
+int _utf8_normalize(const struct unicode_map *um, const struct qstr *str,
+		    unsigned char *dest, size_t dlen)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_casefold);
 
-int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
-			  struct qstr *str)
-{
-	const struct utf8data *data = utf8nfdicf(um->version);
-	struct utf8cursor cur;
-	int c;
-	unsigned long hash = init_name_hash(salt);
-
-	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
-		return -EINVAL;
-
-	while ((c = utf8byte(&cur))) {
-		if (c < 0)
-			return -EINVAL;
-		hash = partial_name_hash((unsigned char)c, hash);
-	}
-	str->hash = end_name_hash(hash);
-	return 0;
-}
+int _utf8_casefold(const struct unicode_map *um, const struct qstr *str,
+		   unsigned char *dest, size_t dlen)
+{
+	return 0;
+}
-EXPORT_SYMBOL(unicode_casefold_hash);
 
-int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
-		      unsigned char *dest, size_t dlen)
-{
-	const struct utf8data *data = utf8nfdi(um->version);
-	struct utf8cursor cur;
-	ssize_t nlen = 0;
-
-	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
-		return -EINVAL;
-
-	for (nlen = 0; nlen < dlen; nlen++) {
-		int c = utf8byte(&cur);
-
-		dest[nlen] = c;
-		if (!c)
-			return nlen;
-		if (c == -1)
-			break;
-	}
-	return -EINVAL;
-}
+int _utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+			struct qstr *str)
+{
+	return 0;
+}
+
+struct unicode_map *_utf8_load(const char *version)
+{
+	return NULL;
+}
-EXPORT_SYMBOL(unicode_normalize);
 
-static int unicode_parse_version(const char *version, unsigned int *maj,
-				 unsigned int *min, unsigned int *rev)
-{
-	substring_t args[3];
-	char version_string[12];
-	static const struct match_token token[] = {
-		{1, "%d.%d.%d"},
-		{0, NULL}
-	};
-
-	int ret = strscpy(version_string, version, sizeof(version_string));
-
-	if (ret < 0)
-		return ret;
-
-	if (match_token(version_string, token, args) != 1)
-		return -EINVAL;
-
-	if (match_int(&args[0], maj) || match_int(&args[1], min) ||
-	    match_int(&args[2], rev))
-		return -EINVAL;
-
-	return 0;
-}
+void _utf8_unload(struct unicode_map *um)
+{
+	return;
+}
+
+DEFINE_STATIC_CALL(utf8_validate, _utf8_validate);
+DEFINE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
+DEFINE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
+DEFINE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
+DEFINE_STATIC_CALL(utf8_normalize, _utf8_normalize);
+DEFINE_STATIC_CALL(utf8_casefold, _utf8_casefold);
+DEFINE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
+DEFINE_STATIC_CALL(utf8_load, _utf8_load);
+DEFINE_STATIC_CALL_NULL(utf8_unload, _utf8_unload);
+EXPORT_STATIC_CALL(utf8_strncmp);
+EXPORT_STATIC_CALL(utf8_strncasecmp);
+EXPORT_STATIC_CALL(utf8_strncasecmp_folded);
+
+static int unicode_load_module(void)
+{
+	int ret = request_module("utf8");
+
+	if (ret) {
+		pr_err("Failed to load UTF-8 module\n");
+		return ret;
+	}
+	return 0;
+}
 
 struct unicode_map *unicode_load(const char *version)
-{
-	struct unicode_map *um = NULL;
-	int unicode_version;
-
-	if (version) {
-		unsigned int maj, min, rev;
-
-		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
-			return ERR_PTR(-EINVAL);
-
-		if (!utf8version_is_supported(maj, min, rev))
-			return ERR_PTR(-EINVAL);
-
-		unicode_version = UNICODE_AGE(maj, min, rev);
-	} else {
-		unicode_version = utf8version_latest();
-		printk(KERN_WARNING"UTF-8 version not specified. "
-		       "Assuming latest supported version (%d.%d.%d).",
-		       (unicode_version >> 16) & 0xff,
-		       (unicode_version >> 8) & 0xff,
-		       (unicode_version & 0xff));
-	}
-
-	um = kzalloc(sizeof(struct unicode_map), GFP_KERNEL);
-	if (!um)
-		return ERR_PTR(-ENOMEM);
-
-	um->charset = "UTF-8";
-	um->version = unicode_version;
-
-	return um;
-}
+{
+	int ret = unicode_load_module();
+
+	if (ret)
+		return ERR_PTR(ret);
+
+	spin_lock(&utf8ops_lock);
+	if (!utf8_ops || !try_module_get(utf8_ops->owner)) {
+		spin_unlock(&utf8ops_lock);
+		return ERR_PTR(-ENODEV);
+	} else {
+		spin_unlock(&utf8ops_lock);
+		return static_call(utf8_load)(version);
+	}
+}
 EXPORT_SYMBOL(unicode_load);
 
 void unicode_unload(struct unicode_map *um)
 {
-	kfree(um);
+	if (WARN_ON(!utf8_ops))
+		return;
+
+	module_put(utf8_ops->owner);
+	static_call(utf8_unload)(um);
 }
 EXPORT_SYMBOL(unicode_unload);
 
+void unicode_register(struct unicode_ops *ops)
+{
+	spin_lock(&utf8ops_lock);
+	utf8_ops = ops;
+
+	static_call_update(utf8_validate, utf8_ops->validate);
+	static_call_update(utf8_strncmp, utf8_ops->strncmp);
+	static_call_update(utf8_strncasecmp, utf8_ops->strncasecmp);
+	static_call_update(utf8_strncasecmp_folded, utf8_ops->strncasecmp_folded);
+	static_call_update(utf8_normalize, utf8_ops->normalize);
+	static_call_update(utf8_casefold, utf8_ops->casefold);
+	static_call_update(utf8_casefold_hash, utf8_ops->casefold_hash);
+	static_call_update(utf8_load, utf8_ops->load);
+	static_call_update(utf8_unload, utf8_ops->unload);
+
+	spin_unlock(&utf8ops_lock);
+}
+EXPORT_SYMBOL(unicode_register);
+
+void unicode_unregister(void)
+{
+	spin_lock(&utf8ops_lock);
+	utf8_ops = NULL;
+	spin_unlock(&utf8ops_lock);
+}
+EXPORT_SYMBOL(unicode_unregister);
+
 MODULE_LICENSE("GPL v2");
diff --git a/fs/unicode/unicode-utf8.c b/fs/unicode/unicode-utf8.c
new file mode 100644
index 000000000..770e60696
--- /dev/null
+++ b/fs/unicode/unicode-utf8.c
@@ -0,0 +1,255 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/module.h>
+#include <linux/kernel.h>
+#include <linux/string.h>
+#include <linux/slab.h>
+#include <linux/parser.h>
+#include <linux/errno.h>
+#include <linux/unicode.h>
+#include <linux/stringhash.h>
+
+#include "utf8n.h"
+
+static int utf8_validate(const struct unicode_map *um, const struct qstr *str)
+{
+	const struct utf8data *data = utf8nfdi(um->version);
+
+	if (utf8nlen(data, str->name, str->len) < 0)
+		return -1;
+	return 0;
+}
+
+static int utf8_strncmp(const struct unicode_map *um,
+			const struct qstr *s1, const struct qstr *s2)
+{
+	const struct utf8data *data = utf8nfdi(um->version);
+	struct utf8cursor cur1, cur2;
+	int c1, c2;
+
+	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
+		return -EINVAL;
+
+	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
+		return -EINVAL;
+
+	do {
+		c1 = utf8byte(&cur1);
+		c2 = utf8byte(&cur2);
+
+		if (c1 < 0 || c2 < 0)
+			return -EINVAL;
+		if (c1 != c2)
+			return 1;
+	} while (c1);
+
+	return 0;
+}
+
+static int utf8_strncasecmp(const struct unicode_map *um,
+			    const struct qstr *s1, const struct qstr *s2)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur1, cur2;
+	int c1, c2;
+
+	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
+		return -EINVAL;
+
+	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
+		return -EINVAL;
+
+	do {
+		c1 = utf8byte(&cur1);
+		c2 = utf8byte(&cur2);
+
+		if (c1 < 0 || c2 < 0)
+			return -EINVAL;
+		if (c1 != c2)
+			return 1;
+	} while (c1);
+
+	return 0;
+}
+
+/* String cf is expected to be a valid UTF-8 casefolded
+ * string.
+ */
+static int utf8_strncasecmp_folded(const struct unicode_map *um,
+				   const struct qstr *cf,
+				   const struct qstr *s1)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur1;
+	int c1, c2;
+	int i = 0;
+
+	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
+		return -EINVAL;
+
+	do {
+		c1 = utf8byte(&cur1);
+		c2 = cf->name[i++];
+		if (c1 < 0)
+			return -EINVAL;
+		if (c1 != c2)
+			return 1;
+	} while (c1);
+
+	return 0;
+}
+
+static int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
+			 unsigned char *dest, size_t dlen)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur;
+	size_t nlen = 0;
+
+	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
+		return -EINVAL;
+
+	for (nlen = 0; nlen < dlen; nlen++) {
+		int c = utf8byte(&cur);
+
+		dest[nlen] = c;
+		if (!c)
+			return nlen;
+		if (c == -1)
+			break;
+	}
+	return -EINVAL;
+}
+
+static int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+			      struct qstr *str)
+{
+	const struct utf8data *data = utf8nfdicf(um->version);
+	struct utf8cursor cur;
+	int c;
+	unsigned long hash = init_name_hash(salt);
+
+	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
+		return -EINVAL;
+
+	while ((c = utf8byte(&cur))) {
+		if (c < 0)
+			return -EINVAL;
+		hash = partial_name_hash((unsigned char)c, hash);
+	}
+	str->hash = end_name_hash(hash);
+	return 0;
+}
+
+static int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
+			  unsigned char *dest, size_t dlen)
+{
+	const struct utf8data *data = utf8nfdi(um->version);
+	struct utf8cursor cur;
+	ssize_t nlen = 0;
+
+	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
+		return -EINVAL;
+
+	for (nlen = 0; nlen < dlen; nlen++) {
+		int c = utf8byte(&cur);
+
+		dest[nlen] = c;
+		if (!c)
+			return nlen;
+		if (c == -1)
+			break;
+	}
+	return -EINVAL;
+}
+
+static int utf8_parse_version(const char *version, unsigned int *maj,
+			      unsigned int *min, unsigned int *rev)
+{
+	substring_t args[3];
+	char version_string[12];
+	static const struct match_token token[] = {
+		{1, "%d.%d.%d"},
+		{0, NULL}
+	};
+
+	int ret = strscpy(version_string, version, sizeof(version_string));
+
+	if (ret < 0)
+		return ret;
+
+	if (match_token(version_string, token, args) != 1)
+		return -EINVAL;
+
+	if (match_int(&args[0], maj) || match_int(&args[1], min) ||
+	    match_int(&args[2], rev))
+		return -EINVAL;
+
+	return 0;
+}
+
+static struct unicode_map *utf8_load(const char *version)
+{
+	struct unicode_map *um = NULL;
+	int unicode_version;
+
+	if (version) {
+		unsigned int maj, min, rev;
+
+		if (utf8_parse_version(version, &maj, &min, &rev) < 0)
+			return ERR_PTR(-EINVAL);
+
+		if (!utf8version_is_supported(maj, min, rev))
+			return ERR_PTR(-EINVAL);
+
+		unicode_version = UNICODE_AGE(maj, min, rev);
+	} else {
+		unicode_version = utf8version_latest();
+		pr_warn("UTF-8 version not specified. Assuming latest supported version (%d.%d.%d).",
+			(unicode_version >> 16) & 0xff,
+			(unicode_version >> 8) & 0xff,
+			(unicode_version & 0xff));
+	}
+
+	um = kzalloc(sizeof(*um), GFP_KERNEL);
+	if (!um)
+		return ERR_PTR(-ENOMEM);
+
+	um->charset = "UTF-8";
+	um->version = unicode_version;
+
+	return um;
+}
+
+void utf8_unload(struct unicode_map *um)
+{
+	kfree(um);
+}
+
+static struct unicode_ops ops = {
+	.owner = THIS_MODULE,
+	.validate = utf8_validate,
+	.strncmp = utf8_strncmp,
+	.strncasecmp = utf8_strncasecmp,
+	.strncasecmp_folded = utf8_strncasecmp_folded,
+	.casefold = utf8_casefold,
+	.casefold_hash = utf8_casefold_hash,
+	.normalize = utf8_normalize,
+	.load = utf8_load,
+	.unload = utf8_unload,
+};
+
+static int __init utf8_init(void)
+{
+	unicode_register(&ops);
+	return 0;
+}
+
+static void __exit utf8_exit(void)
+{
+	unicode_unregister();
+}
+
+module_init(utf8_init);
+module_exit(utf8_exit);
+
+MODULE_LICENSE("GPL v2");
--- a/include/linux/unicode.h
+++ b/include/linux/unicode.h
@@ -4,33 +4,104 @@
 
 #include <linux/init.h>
 #include <linux/dcache.h>
+#include <linux/static_call.h>
+
 
 struct unicode_map {
 	const char *charset;
 	int version;
 };
 
-int unicode_validate(const struct unicode_map *um, const struct qstr *str);
+struct unicode_ops {
+	struct module *owner;
+	int (*validate)(const struct unicode_map *um, const struct qstr *str);
+	int (*strncmp)(const struct unicode_map *um, const struct qstr *s1,
+		       const struct qstr *s2);
+	int (*strncasecmp)(const struct unicode_map *um, const struct qstr *s1,
+			   const struct qstr *s2);
+	int (*strncasecmp_folded)(const struct unicode_map *um, const struct qstr *cf,
+				  const struct qstr *s1);
+	int (*normalize)(const struct unicode_map *um, const struct qstr *str,
+			 unsigned char *dest, size_t dlen);
+	int (*casefold)(const struct unicode_map *um, const struct qstr *str,
+			unsigned char *dest, size_t dlen);
+	int (*casefold_hash)(const struct unicode_map *um, const void *salt,
+			     struct qstr *str);
+	struct unicode_map* (*load)(const char *version);
+	void (*unload)(struct unicode_map *um);
+};
 
-int unicode_strncmp(const struct unicode_map *um,
-		    const struct qstr *s1, const struct qstr *s2);
+extern struct unicode_ops *utf8_ops;
 
-int unicode_strncasecmp(const struct unicode_map *um,
-			const struct qstr *s1, const struct qstr *s2);
-int unicode_strncasecmp_folded(const struct unicode_map *um,
-			       const struct qstr *cf,
-			       const struct qstr *s1);
+int _utf8_validate(const struct unicode_map *um, const struct qstr *str);
+int _utf8_strncmp(const struct unicode_map *um, const struct qstr *s1,
+		  const struct qstr *s2);
+int _utf8_strncasecmp(const struct unicode_map *um, const struct qstr *s1,
+		      const struct qstr *s2);
+int _utf8_strncasecmp_folded(const struct unicode_map *um,
+			     const struct qstr *cf,
+			     const struct qstr *s1);
+int _utf8_normalize(const struct unicode_map *um, const struct qstr *str,
+		    unsigned char *dest, size_t dlen);
+int _utf8_casefold(const struct unicode_map *um, const struct qstr *str,
+		   unsigned char *dest, size_t dlen);
+int _utf8_casefold_hash(const struct unicode_map *um, const void *salt,
+			struct qstr *str);
 
-int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
-		      unsigned char *dest, size_t dlen);
+DECLARE_STATIC_CALL(utf8_validate, _utf8_validate);
+DECLARE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
+DECLARE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
+DECLARE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
+DECLARE_STATIC_CALL(utf8_normalize, _utf8_normalize);
+DECLARE_STATIC_CALL(utf8_casefold, _utf8_casefold);
+DECLARE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
 
-int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
-		     unsigned char *dest, size_t dlen);
+static inline int unicode_validate(const struct unicode_map *um, const struct qstr *str)
+{
+	return static_call(utf8_validate)(um, str);
+}
 
-int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
-			  struct qstr *str);
+static inline int unicode_strncmp(const struct unicode_map *um,
+				  const struct qstr *s1, const struct qstr *s2)
+{
+	return static_call(utf8_strncmp)(um, s1, s2);
+}
+
+static inline int unicode_strncasecmp(const struct unicode_map *um,
+				      const struct qstr *s1, const struct qstr *s2)
+{
+	return static_call(utf8_strncasecmp)(um, s1, s2);
+}
+
+static inline int unicode_strncasecmp_folded(const struct unicode_map *um,
+					     const struct qstr *cf,
+					     const struct qstr *s1)
+{
+	return static_call(utf8_strncasecmp_folded)(um, cf, s1);
+}
+
+static inline int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
+				    unsigned char *dest, size_t dlen)
+{
+	return static_call(utf8_normalize)(um, str, dest, dlen);
+}
+
+static inline int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
+				   unsigned char *dest, size_t dlen)
+{
+	return static_call(utf8_casefold)(um, str, dest, dlen);
+}
+
+static inline int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
+					struct qstr *str)
+{
+	return static_call(utf8_casefold_hash)(um, salt, str);
+}
 
 struct unicode_map *unicode_load(const char *version);
 void unicode_unload(struct unicode_map *um);
 
+void unicode_register(struct unicode_ops *ops);
+void unicode_unregister(void);
+
 #endif /* _LINUX_UNICODE_H */
-- 
2.24.3 (Apple Git-128)



_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 1/5] fs: unicode: Use strscpy() instead of strncpy()
  2021-03-23 18:31   ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 19:09     ` Gabriel Krisman Bertazi
  -1 siblings, 0 replies; 28+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-03-23 19:09 UTC (permalink / raw)
  To: Shreeya Patel
  Cc: tytso, adilger.kernel, jaegeuk, chao, ebiggers, drosen, ebiggers,
	yuchao0, linux-ext4, linux-kernel, linux-f2fs-devel,
	linux-fsdevel, kernel, andre.almeida, kernel test robot

Shreeya Patel <shreeya.patel@collabora.com> writes:

> The following warning was reported by the kernel test robot.
>
> In function 'utf8_parse_version',
> inlined from 'utf8_load' at fs/unicode/utf8mod.c:195:7:
>>> fs/unicode/utf8mod.c:175:2: warning: 'strncpy' specified bound 12 equals
> destination size [-Wstringop-truncation]
> 175 |  strncpy(version_string, version, sizeof(version_string));
>     |  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>
> The -Wstringop-truncation warning highlights unintended uses of
> strncpy() that can drop the terminating NUL character when the
> source string fills the destination buffer.
> Unlike strncpy(), strscpy() always NUL-terminates the destination string,
> hence use strscpy() instead of strncpy().
>
> Fixes: 9d53690f0d4e5 (unicode: implement higher level API for string handling)
> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
> Reported-by: kernel test robot <lkp@intel.com>
> ---
>
> Changes in v3
>   - Return error if strscpy() returns value < 0
>
> Changes in v2
>   - Resolve warning of -Wstringop-truncation reported by
>     kernel test robot.
>
>  fs/unicode/utf8-core.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>

Hi Shreeya,

Thanks for fixing this.

> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
> index dc25823bf..706f086bb 100644
> --- a/fs/unicode/utf8-core.c
> +++ b/fs/unicode/utf8-core.c
> @@ -180,7 +180,10 @@ static int utf8_parse_version(const char *version, unsigned int *maj,
>  		{0, NULL}
>  	};
>  
> -	strncpy(version_string, version, sizeof(version_string));
> +	int ret = strscpy(version_string, version, sizeof(version_string));

Usually, there is no blank line between variable declarations.

Other than that,

Acked-by: Gabriel Krisman Bertazi <krisman@collabora.com>

> +
> +	if (ret < 0)
> +		return ret;
>  	if (match_token(version_string, token, args) != 1)
>  		return -EINVAL;
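
To illustrate the difference the commit message describes, here is a small
userspace sketch. strscpy() is a kernel function, so sketch_strscpy() below
is a hypothetical stand-in that only mimics its documented behaviour: it
always NUL-terminates and returns -E2BIG on truncation. It is not the
kernel implementation.

```c
#define _POSIX_C_SOURCE 200809L	/* for strnlen() and ssize_t */

#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical userspace stand-in for strscpy(): copy at most size - 1
 * bytes, always NUL-terminate, and return the number of bytes copied
 * or -E2BIG if the source did not fit.
 */
ssize_t sketch_strscpy(char *dst, const char *src, size_t size)
{
	size_t len;

	if (size == 0)
		return -E2BIG;

	len = strnlen(src, size);
	if (len == size) {
		/* Source does not fit: copy what fits and terminate. */
		memcpy(dst, src, size - 1);
		dst[size - 1] = '\0';
		return -E2BIG;
	}

	memcpy(dst, src, len + 1);	/* includes the terminating NUL */
	return (ssize_t)len;
}

/* Self-check covering both outcomes for a 12-byte buffer. */
void sketch_strscpy_demo(void)
{
	char buf[12];

	/* A short version string fits and its length is returned. */
	assert(sketch_strscpy(buf, "12.1.0", sizeof(buf)) == 6);

	/* A 12-character source fills the buffer: reported, still terminated. */
	assert(sketch_strscpy(buf, "123456789012", sizeof(buf)) == -E2BIG);
	assert(buf[11] == '\0');
}
```

With strncpy() and the same 12-character source, buf would be left without
a terminator, which is the case the warning points at. The negative return
is what the new "if (ret < 0)" check in the hunk above relies on.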

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/5] fs: Check if utf8 encoding is loaded before calling utf8_unload()
  2021-03-23 18:31   ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 19:10     ` Gabriel Krisman Bertazi
  -1 siblings, 0 replies; 28+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-03-23 19:10 UTC (permalink / raw)
  To: Shreeya Patel
  Cc: tytso, adilger.kernel, jaegeuk, chao, ebiggers, drosen, ebiggers,
	yuchao0, linux-ext4, linux-kernel, linux-f2fs-devel,
	linux-fsdevel, kernel, andre.almeida

Shreeya Patel <shreeya.patel@collabora.com> writes:

> utf8_unload() is called whenever CONFIG_UNICODE is enabled.
> The ifdef block doesn't check whether the utf8 encoding has been
> loaded before calling utf8_unload().
> This is not the expected behavior, since it can lead to unloading
> utf8 before it was ever loaded.
> Hence, add a condition which checks that sb->s_encoding is not NULL
> before calling utf8_unload().

Just to mention, this used to be safe, since it was only doing a
kfree(NULL), but it won't be anymore after the rest of this series.

Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>
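
A userspace sketch of why the guard starts to matter. It assumes that,
later in the series, unload does more than free the map, for example
dropping a module reference (the cover letter mentions try_module_get()
and module_put()). All sketch_* names and the counter are hypothetical
and only model that assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct unicode_map. */
struct sketch_map {
	int version;
};

/* Models the encoding module's reference count (an assumption). */
static int sketch_module_refs;

/* Loading allocates the map and takes a module reference. */
struct sketch_map *sketch_load(int version)
{
	struct sketch_map *um = malloc(sizeof(*um));

	if (!um)
		return NULL;
	um->version = version;
	sketch_module_refs++;
	return um;
}

/* Unloading frees the map and unconditionally drops a reference. */
void sketch_unload(struct sketch_map *um)
{
	free(um);		/* free(NULL) is a no-op, like kfree(NULL) */
	sketch_module_refs--;	/* not a no-op when um is NULL */
}

/* Teardown in the style of this patch: only unload what was loaded. */
void sketch_put_super(struct sketch_map **encoding)
{
	if (*encoding) {
		sketch_unload(*encoding);
		*encoding = NULL;
	}
}

/* Self-check: the count stays balanced with and without a loaded map. */
void sketch_guard_demo(void)
{
	struct sketch_map *enc = NULL;

	sketch_put_super(&enc);		/* nothing loaded: count untouched */
	assert(sketch_module_refs == 0);

	enc = sketch_load(12);
	if (enc) {
		assert(sketch_module_refs == 1);
		sketch_put_super(&enc);
	}
	assert(sketch_module_refs == 0);
}
```

Without the NULL check, the first sketch_put_super() call would push the
count to -1, which is the userspace equivalent of dropping a reference
that was never taken.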

>
> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>
> ---
>
> Changes in v3
>   - Add this patch to the series which checks if utf8 encoding
>     was loaded before calling utf8_unload().
>  
>  fs/ext4/super.c | 6 ++++--
>  fs/f2fs/super.c | 9 ++++++---
>  2 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index ad34a3727..e438d14f9 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1259,7 +1259,8 @@ static void ext4_put_super(struct super_block *sb)
>  	fs_put_dax(sbi->s_daxdev);
>  	fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
>  #ifdef CONFIG_UNICODE
> -	utf8_unload(sb->s_encoding);
> +	if (sb->s_encoding)
> +		utf8_unload(sb->s_encoding);
>  #endif
>  	kfree(sbi);
>  }
> @@ -5165,7 +5166,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  		crypto_free_shash(sbi->s_chksum_driver);
>  
>  #ifdef CONFIG_UNICODE
> -	utf8_unload(sb->s_encoding);
> +	if (sb->s_encoding)
> +		utf8_unload(sb->s_encoding);
>  #endif
>  
>  #ifdef CONFIG_QUOTA
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 706979375..0a04983c2 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1430,7 +1430,8 @@ static void f2fs_put_super(struct super_block *sb)
>  	for (i = 0; i < NR_PAGE_TYPE; i++)
>  		kvfree(sbi->write_io[i]);
>  #ifdef CONFIG_UNICODE
> -	utf8_unload(sb->s_encoding);
> +	if (sb->s_encoding)
> +		utf8_unload(sb->s_encoding);
>  #endif
>  	kfree(sbi);
>  }
> @@ -4073,8 +4074,10 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  		kvfree(sbi->write_io[i]);
>  
>  #ifdef CONFIG_UNICODE
> -	utf8_unload(sb->s_encoding);
> -	sb->s_encoding = NULL;
> +	if (sb->s_encoding) {
> +		utf8_unload(sb->s_encoding);
> +		sb->s_encoding = NULL;
> +	}
>  #endif
>  free_options:
>  #ifdef CONFIG_QUOTA

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 3/5] fs: unicode: Rename function names from utf8 to unicode
  2021-03-23 18:31   ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 19:14     ` Gabriel Krisman Bertazi
  -1 siblings, 0 replies; 28+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-03-23 19:14 UTC (permalink / raw)
  To: Shreeya Patel
  Cc: tytso, adilger.kernel, jaegeuk, chao, ebiggers, drosen, ebiggers,
	yuchao0, linux-ext4, linux-kernel, linux-f2fs-devel,
	linux-fsdevel, kernel, andre.almeida

Shreeya Patel <shreeya.patel@collabora.com> writes:

> Rename the functions from utf8_*() to unicode_*() as a first step
> towards transforming the utf8-core file into the unicode subsystem
> layer file.
>
> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>

Thanks,

> ---
>  fs/ext4/hash.c             |  2 +-
>  fs/ext4/namei.c            | 12 ++++----
>  fs/ext4/super.c            |  6 ++--
>  fs/f2fs/dir.c              | 12 ++++----
>  fs/f2fs/super.c            |  6 ++--
>  fs/libfs.c                 |  6 ++--
>  fs/unicode/utf8-core.c     | 57 +++++++++++++++++++-------------------
>  fs/unicode/utf8-selftest.c |  8 +++---
>  include/linux/unicode.h    | 32 ++++++++++-----------
>  9 files changed, 70 insertions(+), 71 deletions(-)
>
> diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
> index a92eb79de..8890a76ab 100644
> --- a/fs/ext4/hash.c
> +++ b/fs/ext4/hash.c
> @@ -285,7 +285,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
>  		if (!buff)
>  			return -ENOMEM;
>  
> -		dlen = utf8_casefold(um, &qstr, buff, PATH_MAX);
> +		dlen = unicode_casefold(um, &qstr, buff, PATH_MAX);
>  		if (dlen < 0) {
>  			kfree(buff);
>  			goto opaque_seq;
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 686bf982c..dde5ce795 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1290,9 +1290,9 @@ int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
>  	int ret;
>  
>  	if (quick)
> -		ret = utf8_strncasecmp_folded(um, name, entry);
> +		ret = unicode_strncasecmp_folded(um, name, entry);
>  	else
> -		ret = utf8_strncasecmp(um, name, entry);
> +		ret = unicode_strncasecmp(um, name, entry);
>  
>  	if (ret < 0) {
>  		/* Handle invalid character sequence as either an error
> @@ -1324,9 +1324,9 @@ void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
>  	if (!cf_name->name)
>  		return;
>  
> -	len = utf8_casefold(dir->i_sb->s_encoding,
> -			    iname, cf_name->name,
> -			    EXT4_NAME_LEN);
> +	len = unicode_casefold(dir->i_sb->s_encoding,
> +			       iname, cf_name->name,
> +			       EXT4_NAME_LEN);
>  	if (len <= 0) {
>  		kfree(cf_name->name);
>  		cf_name->name = NULL;
> @@ -2201,7 +2201,7 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
>  
>  #ifdef CONFIG_UNICODE
>  	if (sb_has_strict_encoding(sb) && IS_CASEFOLDED(dir) &&
> -	    sb->s_encoding && utf8_validate(sb->s_encoding, &dentry->d_name))
> +	    sb->s_encoding && unicode_validate(sb->s_encoding, &dentry->d_name))
>  		return -EINVAL;
>  #endif
>  
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index e438d14f9..853aeb294 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1260,7 +1260,7 @@ static void ext4_put_super(struct super_block *sb)
>  	fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
>  #ifdef CONFIG_UNICODE
>  	if (sb->s_encoding)
> -		utf8_unload(sb->s_encoding);
> +		unicode_unload(sb->s_encoding);
>  #endif
>  	kfree(sbi);
>  }
> @@ -4305,7 +4305,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  			goto failed_mount;
>  		}
>  
> -		encoding = utf8_load(encoding_info->version);
> +		encoding = unicode_load(encoding_info->version);
>  		if (IS_ERR(encoding)) {
>  			ext4_msg(sb, KERN_ERR,
>  				 "can't mount with superblock charset: %s-%s "
> @@ -5167,7 +5167,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  
>  #ifdef CONFIG_UNICODE
>  	if (sb->s_encoding)
> -		utf8_unload(sb->s_encoding);
> +		unicode_unload(sb->s_encoding);
>  #endif
>  
>  #ifdef CONFIG_QUOTA
> diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
> index e6270a867..f160f9dd6 100644
> --- a/fs/f2fs/dir.c
> +++ b/fs/f2fs/dir.c
> @@ -84,10 +84,10 @@ int f2fs_init_casefolded_name(const struct inode *dir,
>  						   GFP_NOFS);
>  		if (!fname->cf_name.name)
>  			return -ENOMEM;
> -		fname->cf_name.len = utf8_casefold(sb->s_encoding,
> -						   fname->usr_fname,
> -						   fname->cf_name.name,
> -						   F2FS_NAME_LEN);
> +		fname->cf_name.len = unicode_casefold(sb->s_encoding,
> +						      fname->usr_fname,
> +						      fname->cf_name.name,
> +						      F2FS_NAME_LEN);
>  		if ((int)fname->cf_name.len <= 0) {
>  			kfree(fname->cf_name.name);
>  			fname->cf_name.name = NULL;
> @@ -237,7 +237,7 @@ static int f2fs_match_ci_name(const struct inode *dir, const struct qstr *name,
>  		entry.len = decrypted_name.len;
>  	}
>  
> -	res = utf8_strncasecmp_folded(um, name, &entry);
> +	res = unicode_strncasecmp_folded(um, name, &entry);
>  	/*
>  	 * In strict mode, ignore invalid names.  In non-strict mode,
>  	 * fall back to treating them as opaque byte sequences.
> @@ -246,7 +246,7 @@ static int f2fs_match_ci_name(const struct inode *dir, const struct qstr *name,
>  		res = name->len == entry.len &&
>  				memcmp(name->name, entry.name, name->len) == 0;
>  	} else {
> -		/* utf8_strncasecmp_folded returns 0 on match */
> +		/* unicode_strncasecmp_folded returns 0 on match */
>  		res = (res == 0);
>  	}
>  out:
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 0a04983c2..a0cd9bfa4 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1431,7 +1431,7 @@ static void f2fs_put_super(struct super_block *sb)
>  		kvfree(sbi->write_io[i]);
>  #ifdef CONFIG_UNICODE
>  	if (sb->s_encoding)
> -		utf8_unload(sb->s_encoding);
> +		unicode_unload(sb->s_encoding);
>  #endif
>  	kfree(sbi);
>  }
> @@ -3561,7 +3561,7 @@ static int f2fs_setup_casefold(struct f2fs_sb_info *sbi)
>  			return -EINVAL;
>  		}
>  
> -		encoding = utf8_load(encoding_info->version);
> +		encoding = unicode_load(encoding_info->version);
>  		if (IS_ERR(encoding)) {
>  			f2fs_err(sbi,
>  				 "can't mount with superblock charset: %s-%s "
> @@ -4075,7 +4075,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  
>  #ifdef CONFIG_UNICODE
>  	if (sb->s_encoding) {
> -		utf8_unload(sb->s_encoding);
> +		unicode_unload(sb->s_encoding);
>  		sb->s_encoding = NULL;
>  	}
>  #endif
> diff --git a/fs/libfs.c b/fs/libfs.c
> index e2de5401a..766556165 100644
> --- a/fs/libfs.c
> +++ b/fs/libfs.c
> @@ -1404,7 +1404,7 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
>  	 * If the dentry name is stored in-line, then it may be concurrently
>  	 * modified by a rename.  If this happens, the VFS will eventually retry
>  	 * the lookup, so it doesn't matter what ->d_compare() returns.
> -	 * However, it's unsafe to call utf8_strncasecmp() with an unstable
> +	 * However, it's unsafe to call unicode_strncasecmp() with an unstable
>  	 * string.  Therefore, we have to copy the name into a temporary buffer.
>  	 */
>  	if (len <= DNAME_INLINE_LEN - 1) {
> @@ -1414,7 +1414,7 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
>  		/* prevent compiler from optimizing out the temporary buffer */
>  		barrier();
>  	}
> -	ret = utf8_strncasecmp(um, name, &qstr);
> +	ret = unicode_strncasecmp(um, name, &qstr);
>  	if (ret >= 0)
>  		return ret;
>  
> @@ -1443,7 +1443,7 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
>  	if (!dir || !needs_casefold(dir))
>  		return 0;
>  
> -	ret = utf8_casefold_hash(um, dentry, str);
> +	ret = unicode_casefold_hash(um, dentry, str);
>  	if (ret < 0 && sb_has_strict_encoding(sb))
>  		return -EINVAL;
>  	return 0;
> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
> index 706f086bb..686e95e90 100644
> --- a/fs/unicode/utf8-core.c
> +++ b/fs/unicode/utf8-core.c
> @@ -10,7 +10,7 @@
>  
>  #include "utf8n.h"
>  
> -int utf8_validate(const struct unicode_map *um, const struct qstr *str)
> +int unicode_validate(const struct unicode_map *um, const struct qstr *str)
>  {
>  	const struct utf8data *data = utf8nfdi(um->version);
>  
> @@ -18,10 +18,10 @@ int utf8_validate(const struct unicode_map *um, const struct qstr *str)
>  		return -1;
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_validate);
> +EXPORT_SYMBOL(unicode_validate);
>  
> -int utf8_strncmp(const struct unicode_map *um,
> -		 const struct qstr *s1, const struct qstr *s2)
> +int unicode_strncmp(const struct unicode_map *um,
> +		    const struct qstr *s1, const struct qstr *s2)
>  {
>  	const struct utf8data *data = utf8nfdi(um->version);
>  	struct utf8cursor cur1, cur2;
> @@ -45,10 +45,10 @@ int utf8_strncmp(const struct unicode_map *um,
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_strncmp);
> +EXPORT_SYMBOL(unicode_strncmp);
>  
> -int utf8_strncasecmp(const struct unicode_map *um,
> -		     const struct qstr *s1, const struct qstr *s2)
> +int unicode_strncasecmp(const struct unicode_map *um,
> +			const struct qstr *s1, const struct qstr *s2)
>  {
>  	const struct utf8data *data = utf8nfdicf(um->version);
>  	struct utf8cursor cur1, cur2;
> @@ -72,14 +72,14 @@ int utf8_strncasecmp(const struct unicode_map *um,
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_strncasecmp);
> +EXPORT_SYMBOL(unicode_strncasecmp);
>  
>  /* String cf is expected to be a valid UTF-8 casefolded
>   * string.
>   */
> -int utf8_strncasecmp_folded(const struct unicode_map *um,
> -			    const struct qstr *cf,
> -			    const struct qstr *s1)
> +int unicode_strncasecmp_folded(const struct unicode_map *um,
> +			       const struct qstr *cf,
> +			       const struct qstr *s1)
>  {
>  	const struct utf8data *data = utf8nfdicf(um->version);
>  	struct utf8cursor cur1;
> @@ -100,10 +100,10 @@ int utf8_strncasecmp_folded(const struct unicode_map *um,
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_strncasecmp_folded);
> +EXPORT_SYMBOL(unicode_strncasecmp_folded);
>  
> -int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
> -		  unsigned char *dest, size_t dlen)
> +int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
> +		     unsigned char *dest, size_t dlen)
>  {
>  	const struct utf8data *data = utf8nfdicf(um->version);
>  	struct utf8cursor cur;
> @@ -123,10 +123,10 @@ int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
>  	}
>  	return -EINVAL;
>  }
> -EXPORT_SYMBOL(utf8_casefold);
> +EXPORT_SYMBOL(unicode_casefold);
>  
> -int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> -		       struct qstr *str)
> +int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
> +			  struct qstr *str)
>  {
>  	const struct utf8data *data = utf8nfdicf(um->version);
>  	struct utf8cursor cur;
> @@ -144,10 +144,10 @@ int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
>  	str->hash = end_name_hash(hash);
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_casefold_hash);
> +EXPORT_SYMBOL(unicode_casefold_hash);
>  
> -int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
> -		   unsigned char *dest, size_t dlen)
> +int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
> +		      unsigned char *dest, size_t dlen)
>  {
>  	const struct utf8data *data = utf8nfdi(um->version);
>  	struct utf8cursor cur;
> @@ -167,11 +167,10 @@ int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
>  	}
>  	return -EINVAL;
>  }
> +EXPORT_SYMBOL(unicode_normalize);
>  
> -EXPORT_SYMBOL(utf8_normalize);
> -
> -static int utf8_parse_version(const char *version, unsigned int *maj,
> -			      unsigned int *min, unsigned int *rev)
> +static int unicode_parse_version(const char *version, unsigned int *maj,
> +				 unsigned int *min, unsigned int *rev)
>  {
>  	substring_t args[3];
>  	char version_string[12];
> @@ -195,7 +194,7 @@ static int utf8_parse_version(const char *version, unsigned int *maj,
>  	return 0;
>  }
>  
> -struct unicode_map *utf8_load(const char *version)
> +struct unicode_map *unicode_load(const char *version)
>  {
>  	struct unicode_map *um = NULL;
>  	int unicode_version;
> @@ -203,7 +202,7 @@ struct unicode_map *utf8_load(const char *version)
>  	if (version) {
>  		unsigned int maj, min, rev;
>  
> -		if (utf8_parse_version(version, &maj, &min, &rev) < 0)
> +		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
>  			return ERR_PTR(-EINVAL);
>  
>  		if (!utf8version_is_supported(maj, min, rev))
> @@ -228,12 +227,12 @@ struct unicode_map *utf8_load(const char *version)
>  
>  	return um;
>  }
> -EXPORT_SYMBOL(utf8_load);
> +EXPORT_SYMBOL(unicode_load);
>  
> -void utf8_unload(struct unicode_map *um)
> +void unicode_unload(struct unicode_map *um)
>  {
>  	kfree(um);
>  }
> -EXPORT_SYMBOL(utf8_unload);
> +EXPORT_SYMBOL(unicode_unload);
>  
>  MODULE_LICENSE("GPL v2");
> diff --git a/fs/unicode/utf8-selftest.c b/fs/unicode/utf8-selftest.c
> index 6fe8af7ed..796c1ed92 100644
> --- a/fs/unicode/utf8-selftest.c
> +++ b/fs/unicode/utf8-selftest.c
> @@ -235,7 +235,7 @@ static void check_utf8_nfdicf(void)
>  static void check_utf8_comparisons(void)
>  {
>  	int i;
> -	struct unicode_map *table = utf8_load("12.1.0");
> +	struct unicode_map *table = unicode_load("12.1.0");
>  
>  	if (IS_ERR(table)) {
>  		pr_err("%s: Unable to load utf8 %d.%d.%d. Skipping.\n",
> @@ -249,7 +249,7 @@ static void check_utf8_comparisons(void)
>  		const struct qstr s2 = {.name = nfdi_test_data[i].dec,
>  					.len = sizeof(nfdi_test_data[i].dec)};
>  
> -		test_f(!utf8_strncmp(table, &s1, &s2),
> +		test_f(!unicode_strncmp(table, &s1, &s2),
>  		       "%s %s comparison mismatch\n", s1.name, s2.name);
>  	}
>  
> @@ -259,11 +259,11 @@ static void check_utf8_comparisons(void)
>  		const struct qstr s2 = {.name = nfdicf_test_data[i].ncf,
>  					.len = sizeof(nfdicf_test_data[i].ncf)};
>  
> -		test_f(!utf8_strncasecmp(table, &s1, &s2),
> +		test_f(!unicode_strncasecmp(table, &s1, &s2),
>  		       "%s %s comparison mismatch\n", s1.name, s2.name);
>  	}
>  
> -	utf8_unload(table);
> +	unicode_unload(table);
>  }
>  
>  static void check_supported_versions(void)
> diff --git a/include/linux/unicode.h b/include/linux/unicode.h
> index 74484d44c..de23f9ee7 100644
> --- a/include/linux/unicode.h
> +++ b/include/linux/unicode.h
> @@ -10,27 +10,27 @@ struct unicode_map {
>  	int version;
>  };
>  
> -int utf8_validate(const struct unicode_map *um, const struct qstr *str);
> +int unicode_validate(const struct unicode_map *um, const struct qstr *str);
>  
> -int utf8_strncmp(const struct unicode_map *um,
> -		 const struct qstr *s1, const struct qstr *s2);
> +int unicode_strncmp(const struct unicode_map *um,
> +		    const struct qstr *s1, const struct qstr *s2);
>  
> -int utf8_strncasecmp(const struct unicode_map *um,
> -		 const struct qstr *s1, const struct qstr *s2);
> -int utf8_strncasecmp_folded(const struct unicode_map *um,
> -			    const struct qstr *cf,
> -			    const struct qstr *s1);
> +int unicode_strncasecmp(const struct unicode_map *um,
> +			const struct qstr *s1, const struct qstr *s2);
> +int unicode_strncasecmp_folded(const struct unicode_map *um,
> +			       const struct qstr *cf,
> +			       const struct qstr *s1);
>  
> -int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
> -		   unsigned char *dest, size_t dlen);
> +int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
> +		      unsigned char *dest, size_t dlen);
>  
> -int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
> -		  unsigned char *dest, size_t dlen);
> +int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
> +		     unsigned char *dest, size_t dlen);
>  
> -int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> -		       struct qstr *str);
> +int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
> +			  struct qstr *str);
>  
> -struct unicode_map *utf8_load(const char *version);
> -void utf8_unload(struct unicode_map *um);
> +struct unicode_map *unicode_load(const char *version);
> +void unicode_unload(struct unicode_map *um);
>  
>  #endif /* _LINUX_UNICODE_H */

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [f2fs-dev] [PATCH v3 3/5] fs: unicode: Rename function names from utf8 to unicode
@ 2021-03-23 19:14     ` Gabriel Krisman Bertazi
  0 siblings, 0 replies; 28+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-03-23 19:14 UTC (permalink / raw)
  To: Shreeya Patel
  Cc: tytso, drosen, ebiggers, linux-kernel, linux-f2fs-devel,
	ebiggers, kernel, adilger.kernel, linux-fsdevel, jaegeuk,
	andre.almeida, linux-ext4

Shreeya Patel <shreeya.patel@collabora.com> writes:

> Rename the functions from utf8_*() to unicode_*() as a first step
> towards transforming the utf8-core file into the unicode subsystem
> layer file.
>
> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

Reviewed-by: Gabriel Krisman Bertazi <krisman@collabora.com>

Thanks,

> ---
>  fs/ext4/hash.c             |  2 +-
>  fs/ext4/namei.c            | 12 ++++----
>  fs/ext4/super.c            |  6 ++--
>  fs/f2fs/dir.c              | 12 ++++----
>  fs/f2fs/super.c            |  6 ++--
>  fs/libfs.c                 |  6 ++--
>  fs/unicode/utf8-core.c     | 57 +++++++++++++++++++-------------------
>  fs/unicode/utf8-selftest.c |  8 +++---
>  include/linux/unicode.h    | 32 ++++++++++-----------
>  9 files changed, 70 insertions(+), 71 deletions(-)
>
> diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
> index a92eb79de..8890a76ab 100644
> --- a/fs/ext4/hash.c
> +++ b/fs/ext4/hash.c
> @@ -285,7 +285,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
>  		if (!buff)
>  			return -ENOMEM;
>  
> -		dlen = utf8_casefold(um, &qstr, buff, PATH_MAX);
> +		dlen = unicode_casefold(um, &qstr, buff, PATH_MAX);
>  		if (dlen < 0) {
>  			kfree(buff);
>  			goto opaque_seq;
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 686bf982c..dde5ce795 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1290,9 +1290,9 @@ int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
>  	int ret;
>  
>  	if (quick)
> -		ret = utf8_strncasecmp_folded(um, name, entry);
> +		ret = unicode_strncasecmp_folded(um, name, entry);
>  	else
> -		ret = utf8_strncasecmp(um, name, entry);
> +		ret = unicode_strncasecmp(um, name, entry);
>  
>  	if (ret < 0) {
>  		/* Handle invalid character sequence as either an error
> @@ -1324,9 +1324,9 @@ void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
>  	if (!cf_name->name)
>  		return;
>  
> -	len = utf8_casefold(dir->i_sb->s_encoding,
> -			    iname, cf_name->name,
> -			    EXT4_NAME_LEN);
> +	len = unicode_casefold(dir->i_sb->s_encoding,
> +			       iname, cf_name->name,
> +			       EXT4_NAME_LEN);
>  	if (len <= 0) {
>  		kfree(cf_name->name);
>  		cf_name->name = NULL;
> @@ -2201,7 +2201,7 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
>  
>  #ifdef CONFIG_UNICODE
>  	if (sb_has_strict_encoding(sb) && IS_CASEFOLDED(dir) &&
> -	    sb->s_encoding && utf8_validate(sb->s_encoding, &dentry->d_name))
> +	    sb->s_encoding && unicode_validate(sb->s_encoding, &dentry->d_name))
>  		return -EINVAL;
>  #endif
>  
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index e438d14f9..853aeb294 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -1260,7 +1260,7 @@ static void ext4_put_super(struct super_block *sb)
>  	fscrypt_free_dummy_policy(&sbi->s_dummy_enc_policy);
>  #ifdef CONFIG_UNICODE
>  	if (sb->s_encoding)
> -		utf8_unload(sb->s_encoding);
> +		unicode_unload(sb->s_encoding);
>  #endif
>  	kfree(sbi);
>  }
> @@ -4305,7 +4305,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  			goto failed_mount;
>  		}
>  
> -		encoding = utf8_load(encoding_info->version);
> +		encoding = unicode_load(encoding_info->version);
>  		if (IS_ERR(encoding)) {
>  			ext4_msg(sb, KERN_ERR,
>  				 "can't mount with superblock charset: %s-%s "
> @@ -5167,7 +5167,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  
>  #ifdef CONFIG_UNICODE
>  	if (sb->s_encoding)
> -		utf8_unload(sb->s_encoding);
> +		unicode_unload(sb->s_encoding);
>  #endif
>  
>  #ifdef CONFIG_QUOTA
> diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
> index e6270a867..f160f9dd6 100644
> --- a/fs/f2fs/dir.c
> +++ b/fs/f2fs/dir.c
> @@ -84,10 +84,10 @@ int f2fs_init_casefolded_name(const struct inode *dir,
>  						   GFP_NOFS);
>  		if (!fname->cf_name.name)
>  			return -ENOMEM;
> -		fname->cf_name.len = utf8_casefold(sb->s_encoding,
> -						   fname->usr_fname,
> -						   fname->cf_name.name,
> -						   F2FS_NAME_LEN);
> +		fname->cf_name.len = unicode_casefold(sb->s_encoding,
> +						      fname->usr_fname,
> +						      fname->cf_name.name,
> +						      F2FS_NAME_LEN);
>  		if ((int)fname->cf_name.len <= 0) {
>  			kfree(fname->cf_name.name);
>  			fname->cf_name.name = NULL;
> @@ -237,7 +237,7 @@ static int f2fs_match_ci_name(const struct inode *dir, const struct qstr *name,
>  		entry.len = decrypted_name.len;
>  	}
>  
> -	res = utf8_strncasecmp_folded(um, name, &entry);
> +	res = unicode_strncasecmp_folded(um, name, &entry);
>  	/*
>  	 * In strict mode, ignore invalid names.  In non-strict mode,
>  	 * fall back to treating them as opaque byte sequences.
> @@ -246,7 +246,7 @@ static int f2fs_match_ci_name(const struct inode *dir, const struct qstr *name,
>  		res = name->len == entry.len &&
>  				memcmp(name->name, entry.name, name->len) == 0;
>  	} else {
> -		/* utf8_strncasecmp_folded returns 0 on match */
> +		/* unicode_strncasecmp_folded returns 0 on match */
>  		res = (res == 0);
>  	}
>  out:
> diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
> index 0a04983c2..a0cd9bfa4 100644
> --- a/fs/f2fs/super.c
> +++ b/fs/f2fs/super.c
> @@ -1431,7 +1431,7 @@ static void f2fs_put_super(struct super_block *sb)
>  		kvfree(sbi->write_io[i]);
>  #ifdef CONFIG_UNICODE
>  	if (sb->s_encoding)
> -		utf8_unload(sb->s_encoding);
> +		unicode_unload(sb->s_encoding);
>  #endif
>  	kfree(sbi);
>  }
> @@ -3561,7 +3561,7 @@ static int f2fs_setup_casefold(struct f2fs_sb_info *sbi)
>  			return -EINVAL;
>  		}
>  
> -		encoding = utf8_load(encoding_info->version);
> +		encoding = unicode_load(encoding_info->version);
>  		if (IS_ERR(encoding)) {
>  			f2fs_err(sbi,
>  				 "can't mount with superblock charset: %s-%s "
> @@ -4075,7 +4075,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
>  
>  #ifdef CONFIG_UNICODE
>  	if (sb->s_encoding) {
> -		utf8_unload(sb->s_encoding);
> +		unicode_unload(sb->s_encoding);
>  		sb->s_encoding = NULL;
>  	}
>  #endif
> diff --git a/fs/libfs.c b/fs/libfs.c
> index e2de5401a..766556165 100644
> --- a/fs/libfs.c
> +++ b/fs/libfs.c
> @@ -1404,7 +1404,7 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
>  	 * If the dentry name is stored in-line, then it may be concurrently
>  	 * modified by a rename.  If this happens, the VFS will eventually retry
>  	 * the lookup, so it doesn't matter what ->d_compare() returns.
> -	 * However, it's unsafe to call utf8_strncasecmp() with an unstable
> +	 * However, it's unsafe to call unicode_strncasecmp() with an unstable
>  	 * string.  Therefore, we have to copy the name into a temporary buffer.
>  	 */
>  	if (len <= DNAME_INLINE_LEN - 1) {
> @@ -1414,7 +1414,7 @@ static int generic_ci_d_compare(const struct dentry *dentry, unsigned int len,
>  		/* prevent compiler from optimizing out the temporary buffer */
>  		barrier();
>  	}
> -	ret = utf8_strncasecmp(um, name, &qstr);
> +	ret = unicode_strncasecmp(um, name, &qstr);
>  	if (ret >= 0)
>  		return ret;
>  
> @@ -1443,7 +1443,7 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str)
>  	if (!dir || !needs_casefold(dir))
>  		return 0;
>  
> -	ret = utf8_casefold_hash(um, dentry, str);
> +	ret = unicode_casefold_hash(um, dentry, str);
>  	if (ret < 0 && sb_has_strict_encoding(sb))
>  		return -EINVAL;
>  	return 0;
> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
> index 706f086bb..686e95e90 100644
> --- a/fs/unicode/utf8-core.c
> +++ b/fs/unicode/utf8-core.c
> @@ -10,7 +10,7 @@
>  
>  #include "utf8n.h"
>  
> -int utf8_validate(const struct unicode_map *um, const struct qstr *str)
> +int unicode_validate(const struct unicode_map *um, const struct qstr *str)
>  {
>  	const struct utf8data *data = utf8nfdi(um->version);
>  
> @@ -18,10 +18,10 @@ int utf8_validate(const struct unicode_map *um, const struct qstr *str)
>  		return -1;
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_validate);
> +EXPORT_SYMBOL(unicode_validate);
>  
> -int utf8_strncmp(const struct unicode_map *um,
> -		 const struct qstr *s1, const struct qstr *s2)
> +int unicode_strncmp(const struct unicode_map *um,
> +		    const struct qstr *s1, const struct qstr *s2)
>  {
>  	const struct utf8data *data = utf8nfdi(um->version);
>  	struct utf8cursor cur1, cur2;
> @@ -45,10 +45,10 @@ int utf8_strncmp(const struct unicode_map *um,
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_strncmp);
> +EXPORT_SYMBOL(unicode_strncmp);
>  
> -int utf8_strncasecmp(const struct unicode_map *um,
> -		     const struct qstr *s1, const struct qstr *s2)
> +int unicode_strncasecmp(const struct unicode_map *um,
> +			const struct qstr *s1, const struct qstr *s2)
>  {
>  	const struct utf8data *data = utf8nfdicf(um->version);
>  	struct utf8cursor cur1, cur2;
> @@ -72,14 +72,14 @@ int utf8_strncasecmp(const struct unicode_map *um,
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_strncasecmp);
> +EXPORT_SYMBOL(unicode_strncasecmp);
>  
>  /* String cf is expected to be a valid UTF-8 casefolded
>   * string.
>   */
> -int utf8_strncasecmp_folded(const struct unicode_map *um,
> -			    const struct qstr *cf,
> -			    const struct qstr *s1)
> +int unicode_strncasecmp_folded(const struct unicode_map *um,
> +			       const struct qstr *cf,
> +			       const struct qstr *s1)
>  {
>  	const struct utf8data *data = utf8nfdicf(um->version);
>  	struct utf8cursor cur1;
> @@ -100,10 +100,10 @@ int utf8_strncasecmp_folded(const struct unicode_map *um,
>  
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_strncasecmp_folded);
> +EXPORT_SYMBOL(unicode_strncasecmp_folded);
>  
> -int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
> -		  unsigned char *dest, size_t dlen)
> +int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
> +		     unsigned char *dest, size_t dlen)
>  {
>  	const struct utf8data *data = utf8nfdicf(um->version);
>  	struct utf8cursor cur;
> @@ -123,10 +123,10 @@ int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
>  	}
>  	return -EINVAL;
>  }
> -EXPORT_SYMBOL(utf8_casefold);
> +EXPORT_SYMBOL(unicode_casefold);
>  
> -int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> -		       struct qstr *str)
> +int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
> +			  struct qstr *str)
>  {
>  	const struct utf8data *data = utf8nfdicf(um->version);
>  	struct utf8cursor cur;
> @@ -144,10 +144,10 @@ int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
>  	str->hash = end_name_hash(hash);
>  	return 0;
>  }
> -EXPORT_SYMBOL(utf8_casefold_hash);
> +EXPORT_SYMBOL(unicode_casefold_hash);
>  
> -int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
> -		   unsigned char *dest, size_t dlen)
> +int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
> +		      unsigned char *dest, size_t dlen)
>  {
>  	const struct utf8data *data = utf8nfdi(um->version);
>  	struct utf8cursor cur;
> @@ -167,11 +167,10 @@ int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
>  	}
>  	return -EINVAL;
>  }
> +EXPORT_SYMBOL(unicode_normalize);
>  
> -EXPORT_SYMBOL(utf8_normalize);
> -
> -static int utf8_parse_version(const char *version, unsigned int *maj,
> -			      unsigned int *min, unsigned int *rev)
> +static int unicode_parse_version(const char *version, unsigned int *maj,
> +				 unsigned int *min, unsigned int *rev)
>  {
>  	substring_t args[3];
>  	char version_string[12];
> @@ -195,7 +194,7 @@ static int utf8_parse_version(const char *version, unsigned int *maj,
>  	return 0;
>  }
>  
> -struct unicode_map *utf8_load(const char *version)
> +struct unicode_map *unicode_load(const char *version)
>  {
>  	struct unicode_map *um = NULL;
>  	int unicode_version;
> @@ -203,7 +202,7 @@ struct unicode_map *utf8_load(const char *version)
>  	if (version) {
>  		unsigned int maj, min, rev;
>  
> -		if (utf8_parse_version(version, &maj, &min, &rev) < 0)
> +		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
>  			return ERR_PTR(-EINVAL);
>  
>  		if (!utf8version_is_supported(maj, min, rev))
> @@ -228,12 +227,12 @@ struct unicode_map *utf8_load(const char *version)
>  
>  	return um;
>  }
> -EXPORT_SYMBOL(utf8_load);
> +EXPORT_SYMBOL(unicode_load);
>  
> -void utf8_unload(struct unicode_map *um)
> +void unicode_unload(struct unicode_map *um)
>  {
>  	kfree(um);
>  }
> -EXPORT_SYMBOL(utf8_unload);
> +EXPORT_SYMBOL(unicode_unload);
>  
>  MODULE_LICENSE("GPL v2");
> diff --git a/fs/unicode/utf8-selftest.c b/fs/unicode/utf8-selftest.c
> index 6fe8af7ed..796c1ed92 100644
> --- a/fs/unicode/utf8-selftest.c
> +++ b/fs/unicode/utf8-selftest.c
> @@ -235,7 +235,7 @@ static void check_utf8_nfdicf(void)
>  static void check_utf8_comparisons(void)
>  {
>  	int i;
> -	struct unicode_map *table = utf8_load("12.1.0");
> +	struct unicode_map *table = unicode_load("12.1.0");
>  
>  	if (IS_ERR(table)) {
>  		pr_err("%s: Unable to load utf8 %d.%d.%d. Skipping.\n",
> @@ -249,7 +249,7 @@ static void check_utf8_comparisons(void)
>  		const struct qstr s2 = {.name = nfdi_test_data[i].dec,
>  					.len = sizeof(nfdi_test_data[i].dec)};
>  
> -		test_f(!utf8_strncmp(table, &s1, &s2),
> +		test_f(!unicode_strncmp(table, &s1, &s2),
>  		       "%s %s comparison mismatch\n", s1.name, s2.name);
>  	}
>  
> @@ -259,11 +259,11 @@ static void check_utf8_comparisons(void)
>  		const struct qstr s2 = {.name = nfdicf_test_data[i].ncf,
>  					.len = sizeof(nfdicf_test_data[i].ncf)};
>  
> -		test_f(!utf8_strncasecmp(table, &s1, &s2),
> +		test_f(!unicode_strncasecmp(table, &s1, &s2),
>  		       "%s %s comparison mismatch\n", s1.name, s2.name);
>  	}
>  
> -	utf8_unload(table);
> +	unicode_unload(table);
>  }
>  
>  static void check_supported_versions(void)
> diff --git a/include/linux/unicode.h b/include/linux/unicode.h
> index 74484d44c..de23f9ee7 100644
> --- a/include/linux/unicode.h
> +++ b/include/linux/unicode.h
> @@ -10,27 +10,27 @@ struct unicode_map {
>  	int version;
>  };
>  
> -int utf8_validate(const struct unicode_map *um, const struct qstr *str);
> +int unicode_validate(const struct unicode_map *um, const struct qstr *str);
>  
> -int utf8_strncmp(const struct unicode_map *um,
> -		 const struct qstr *s1, const struct qstr *s2);
> +int unicode_strncmp(const struct unicode_map *um,
> +		    const struct qstr *s1, const struct qstr *s2);
>  
> -int utf8_strncasecmp(const struct unicode_map *um,
> -		 const struct qstr *s1, const struct qstr *s2);
> -int utf8_strncasecmp_folded(const struct unicode_map *um,
> -			    const struct qstr *cf,
> -			    const struct qstr *s1);
> +int unicode_strncasecmp(const struct unicode_map *um,
> +			const struct qstr *s1, const struct qstr *s2);
> +int unicode_strncasecmp_folded(const struct unicode_map *um,
> +			       const struct qstr *cf,
> +			       const struct qstr *s1);
>  
> -int utf8_normalize(const struct unicode_map *um, const struct qstr *str,
> -		   unsigned char *dest, size_t dlen);
> +int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
> +		      unsigned char *dest, size_t dlen);
>  
> -int utf8_casefold(const struct unicode_map *um, const struct qstr *str,
> -		  unsigned char *dest, size_t dlen);
> +int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
> +		     unsigned char *dest, size_t dlen);
>  
> -int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> -		       struct qstr *str);
> +int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
> +			  struct qstr *str);
>  
> -struct unicode_map *utf8_load(const char *version);
> -void utf8_unload(struct unicode_map *um);
> +struct unicode_map *unicode_load(const char *version);
> +void unicode_unload(struct unicode_map *um);
>  
>  #endif /* _LINUX_UNICODE_H */

-- 
Gabriel Krisman Bertazi


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 4/5] fs: unicode: Rename utf8-core file to unicode-core
  2021-03-23 18:32   ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 19:15     ` Gabriel Krisman Bertazi
  -1 siblings, 0 replies; 28+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-03-23 19:15 UTC (permalink / raw)
  To: Shreeya Patel
  Cc: tytso, adilger.kernel, jaegeuk, chao, ebiggers, drosen, ebiggers,
	yuchao0, linux-ext4, linux-kernel, linux-f2fs-devel,
	linux-fsdevel, kernel, andre.almeida

Shreeya Patel <shreeya.patel@collabora.com> writes:

> Rename the file name from utf8-core to unicode-core for transformation of
> utf8-core file into the unicode subsystem layer file and also for better
> understanding.
>
> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

Acked-by: Gabriel Krisman Bertazi <krisman@collabora.com>

Thanks,

> ---
>  fs/unicode/Makefile                        | 2 +-
>  fs/unicode/{utf8-core.c => unicode-core.c} | 0
>  2 files changed, 1 insertion(+), 1 deletion(-)
>  rename fs/unicode/{utf8-core.c => unicode-core.c} (100%)
>
> diff --git a/fs/unicode/Makefile b/fs/unicode/Makefile
> index b88aecc86..fbf9a629e 100644
> --- a/fs/unicode/Makefile
> +++ b/fs/unicode/Makefile
> @@ -3,7 +3,7 @@
>  obj-$(CONFIG_UNICODE) += unicode.o
>  obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
>  
> -unicode-y := utf8-norm.o utf8-core.o
> +unicode-y := utf8-norm.o unicode-core.o
>  
>  $(obj)/utf8-norm.o: $(obj)/utf8data.h
>  
> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/unicode-core.c
> similarity index 100%
> rename from fs/unicode/utf8-core.c
> rename to fs/unicode/unicode-core.c

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [f2fs-dev] [PATCH v3 4/5] fs: unicode: Rename utf8-core file to unicode-core
@ 2021-03-23 19:15     ` Gabriel Krisman Bertazi
  0 siblings, 0 replies; 28+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-03-23 19:15 UTC (permalink / raw)
  To: Shreeya Patel
  Cc: tytso, drosen, ebiggers, linux-kernel, linux-f2fs-devel,
	ebiggers, kernel, adilger.kernel, linux-fsdevel, jaegeuk,
	andre.almeida, linux-ext4

Shreeya Patel <shreeya.patel@collabora.com> writes:

> Rename the file name from utf8-core to unicode-core for transformation of
> utf8-core file into the unicode subsystem layer file and also for better
> understanding.
>
> Signed-off-by: Shreeya Patel <shreeya.patel@collabora.com>

Acked-by: Gabriel Krisman Bertazi <krisman@collabora.com>

Thanks,

> ---
>  fs/unicode/Makefile                        | 2 +-
>  fs/unicode/{utf8-core.c => unicode-core.c} | 0
>  2 files changed, 1 insertion(+), 1 deletion(-)
>  rename fs/unicode/{utf8-core.c => unicode-core.c} (100%)
>
> diff --git a/fs/unicode/Makefile b/fs/unicode/Makefile
> index b88aecc86..fbf9a629e 100644
> --- a/fs/unicode/Makefile
> +++ b/fs/unicode/Makefile
> @@ -3,7 +3,7 @@
>  obj-$(CONFIG_UNICODE) += unicode.o
>  obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
>  
> -unicode-y := utf8-norm.o utf8-core.o
> +unicode-y := utf8-norm.o unicode-core.o
>  
>  $(obj)/utf8-norm.o: $(obj)/utf8data.h
>  
> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/unicode-core.c
> similarity index 100%
> rename from fs/unicode/utf8-core.c
> rename to fs/unicode/unicode-core.c

-- 
Gabriel Krisman Bertazi


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 5/5] fs: unicode: Add utf8 module and a unicode layer
  2021-03-23 18:32   ` [f2fs-dev] " Shreeya Patel
@ 2021-03-23 19:51     ` Gabriel Krisman Bertazi
  -1 siblings, 0 replies; 28+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-03-23 19:51 UTC (permalink / raw)
  To: Shreeya Patel
  Cc: tytso, adilger.kernel, jaegeuk, chao, ebiggers, drosen, ebiggers,
	yuchao0, linux-ext4, linux-kernel, linux-f2fs-devel,
	linux-fsdevel, kernel, andre.almeida

Shreeya Patel <shreeya.patel@collabora.com> writes:

> utf8data.h_shipped has a large database table which is an auto-generated
> decodification trie for the unicode normalization functions.
> It is not necessary to load this large table in the kernel if no
> file system is using it, hence make UTF-8 encoding loadable by converting
> it into a module.
> Modify the file called unicode-core which will act as a layer for
> unicode subsystem. It will load the UTF-8 module and access its functions
> whenever any filesystem that needs unicode is mounted.
> Also, indirect calls using function pointers are easily exploitable by
> speculative execution attacks, hence use static_call() in unicode.h and
> unicode-core.c files in order to prevent these attacks by making direct
> calls and also to improve the performance of function pointers.
>

This static call mechanism is indeed really interesting.  Thanks for
doing it.  A few comments inline.

> ---
>
> Changes in v3
>   - Correct the conditions to prevent NULL pointer dereference while
>     accessing functions via utf8_ops variable.
>   - Add spinlock to avoid race conditions that could occur if the module
>     is deregistered after checking utf8_ops and before doing the
>     try_module_get() in the following if condition
>     if (!utf8_ops || !try_module_get(utf8_ops->owner)
>   - Use static_call() for preventing speculative execution attacks.
>   - WARN_ON in case utf8_ops is NULL in unicode_unload().
>   - Rename module file from utf8mod to unicode-utf8.
>
> Changes in v2
>   - Remove the duplicate file utf8-core.c
>   - Make the wrapper functions inline.
>   - Remove msleep and use try_module_get() and module_put()
>     for ensuring that module is loaded correctly and also
>     doesn't get unloaded while in use.
>
>  fs/unicode/Kconfig        |  11 +-
>  fs/unicode/Makefile       |   5 +-
>  fs/unicode/unicode-core.c | 268 +++++++++++++-------------------------
>  fs/unicode/unicode-utf8.c | 255 ++++++++++++++++++++++++++++++++++++
>  include/linux/unicode.h   |  99 ++++++++++++--
>  5 files changed, 441 insertions(+), 197 deletions(-)
>  create mode 100644 fs/unicode/unicode-utf8.c
>
> diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig
> index 2c27b9a5c..2961b0206 100644
> --- a/fs/unicode/Kconfig
> +++ b/fs/unicode/Kconfig
> @@ -8,7 +8,16 @@ config UNICODE
>  	  Say Y here to enable UTF-8 NFD normalization and NFD+CF casefolding
>  	  support.
>  
> +# UTF-8 encoding can be compiled as a module using UNICODE_UTF8 option.
> +# Having UTF-8 encoding as a module will avoid carrying large
> +# database table present in utf8data.h_shipped into the kernel
> +# by being able to load it only when it is required by the filesystem.
> +config UNICODE_UTF8
> +	tristate "UTF-8 module"
> +	depends on UNICODE
> +	default m
> +
>  config UNICODE_NORMALIZATION_SELFTEST
>  	tristate "Test UTF-8 normalization support"
> -	depends on UNICODE
> +	depends on UNICODE_UTF8
>  	default n
> --- a/fs/unicode/Makefile
> +++ b/fs/unicode/Makefile
> @@ -1,11 +1,14 @@
>  # SPDX-License-Identifier: GPL-2.0
>  
>  obj-$(CONFIG_UNICODE) += unicode.o
> +obj-$(CONFIG_UNICODE_UTF8) += utf8.o
>  obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
>  
> -unicode-y := utf8-norm.o unicode-core.o
> +unicode-y := unicode-core.o
> +utf8-y := unicode-utf8.o utf8-norm.o
>  
>  $(obj)/utf8-norm.o: $(obj)/utf8data.h
> +$(obj)/unicode-utf8.o: $(obj)/utf8-norm.o
>  
>  # In the normal build, the checked-in utf8data.h is just shipped.
>  #
> --- a/fs/unicode/unicode-core.c
> +++ b/fs/unicode/unicode-core.c
> @@ -1,238 +1,144 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>  #include <linux/module.h>
>  #include <linux/kernel.h>
> -#include <linux/string.h>
>  #include <linux/slab.h>
> -#include <linux/parser.h>
>  #include <linux/errno.h>
>  #include <linux/unicode.h>
> -#include <linux/stringhash.h>
> +#include <linux/spinlock.h>
>  
> -#include "utf8n.h"
> +DEFINE_SPINLOCK(utf8ops_lock);
>  
> -int unicode_validate(const struct unicode_map *um, const struct qstr *str)
> -{
> -	const struct utf8data *data = utf8nfdi(um->version);
> -
> -	if (utf8nlen(data, str->name, str->len) < 0)
> -		return -1;
> -	return 0;
> -}
> +struct unicode_ops *utf8_ops;
> +EXPORT_SYMBOL(utf8_ops);
> +
> +int _utf8_validate(const struct unicode_map *um, const struct qstr *str)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_validate);

I think that any calls to the default static calls should return errors
instead of succeeding without doing anything.

In fact, are the default calls really necessary?  If someone gets here,
there is a bug elsewhere, so WARN_ON and maybe return -EIO.

int unicode_validate_default_static_call(...)
{
   WARN_ON(1);
   return -EIO;
}

Or just have a NULL default, as I mentioned below, if that is possible.

Eric?
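
To make the suggestion above concrete, here is a minimal userspace sketch,
not kernel code.  A plain function pointer stands in for the static call
site, and every name in it (qstr_sketch, validate_default, validate_ascii,
register_backend, sketch_validate) is hypothetical and chosen only for
illustration.  It assumes the intent is that calling into the layer before
a backend is registered reports an error rather than silently succeeding.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in for the kernel's struct qstr (hypothetical). */
struct qstr_sketch {
	const char *name;
	size_t len;
};

typedef int (*validate_fn)(const struct qstr_sketch *str);

/*
 * Default target.  Reaching it means no backend was registered, which
 * indicates a bug elsewhere, so complain and fail instead of returning 0.
 * The fprintf() stands in for WARN_ON(1).
 */
static int validate_default(const struct qstr_sketch *str)
{
	(void)str;
	fprintf(stderr, "validate called without a registered backend\n");
	return -EIO;
}

/* Toy backend used only for this sketch: accepts 7-bit ASCII. */
static int validate_ascii(const struct qstr_sketch *str)
{
	for (size_t i = 0; i < str->len; i++)
		if ((unsigned char)str->name[i] > 0x7f)
			return -1;
	return 0;
}

/* Plain function pointer modelling the call site's current target. */
static validate_fn validate_target = validate_default;

/* Register a backend, or pass NULL to fall back to the failing default. */
static void register_backend(validate_fn fn)
{
	validate_target = fn ? fn : validate_default;
}

static int sketch_validate(const struct qstr_sketch *str)
{
	/* Invariant of this sketch: the target is never NULL. */
	assert(validate_target != NULL);
	return validate_target(str);
}
```

With this shape, a caller that reaches the layer too early gets -EIO and a
warning, and the same call succeeds once a backend has been registered.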

> -int unicode_strncmp(const struct unicode_map *um,
> -		    const struct qstr *s1, const struct qstr *s2)
> -{
> -	const struct utf8data *data = utf8nfdi(um->version);
> -	struct utf8cursor cur1, cur2;
> -	int c1, c2;
> -
> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
> -		return -EINVAL;
> -
> -	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
> -		return -EINVAL;
> -
> -	do {
> -		c1 = utf8byte(&cur1);
> -		c2 = utf8byte(&cur2);
> -
> -		if (c1 < 0 || c2 < 0)
> -			return -EINVAL;
> -		if (c1 != c2)
> -			return 1;
> -	} while (c1);
> -
> -	return 0;
> -}
> +int _utf8_strncmp(const struct unicode_map *um, const struct qstr *s1,
> +		  const struct qstr *s2)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_strncmp);
>  
> -int unicode_strncasecmp(const struct unicode_map *um,
> -			const struct qstr *s1, const struct qstr *s2)
> -{
> -	const struct utf8data *data = utf8nfdicf(um->version);
> -	struct utf8cursor cur1, cur2;
> -	int c1, c2;
> -
> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
> -		return -EINVAL;
> -
> -	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
> -		return -EINVAL;
> -
> -	do {
> -		c1 = utf8byte(&cur1);
> -		c2 = utf8byte(&cur2);
> -
> -		if (c1 < 0 || c2 < 0)
> -			return -EINVAL;
> -		if (c1 != c2)
> -			return 1;
> -	} while (c1);
> -
> -	return 0;
> -}
> +int _utf8_strncasecmp(const struct unicode_map *um, const struct qstr *s1,
> +		      const struct qstr *s2)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_strncasecmp);
>  
> -/* String cf is expected to be a valid UTF-8 casefolded
> - * string.
> - */
> -int unicode_strncasecmp_folded(const struct unicode_map *um,
> -			       const struct qstr *cf,
> -			       const struct qstr *s1)
> -{
> -	const struct utf8data *data = utf8nfdicf(um->version);
> -	struct utf8cursor cur1;
> -	int c1, c2;
> -	int i = 0;
> -
> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
> -		return -EINVAL;
> -
> -	do {
> -		c1 = utf8byte(&cur1);
> -		c2 = cf->name[i++];
> -		if (c1 < 0)
> -			return -EINVAL;
> -		if (c1 != c2)
> -			return 1;
> -	} while (c1);
> -
> -	return 0;
> -}
> +int _utf8_strncasecmp_folded(const struct unicode_map *um,
> +			     const struct qstr *cf, const struct qstr *s1)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_strncasecmp_folded);
>  
> -int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
> -		     unsigned char *dest, size_t dlen)
> -{
> -	const struct utf8data *data = utf8nfdicf(um->version);
> -	struct utf8cursor cur;
> -	size_t nlen = 0;
> -
> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
> -		return -EINVAL;
> -
> -	for (nlen = 0; nlen < dlen; nlen++) {
> -		int c = utf8byte(&cur);
> -
> -		dest[nlen] = c;
> -		if (!c)
> -			return nlen;
> -		if (c == -1)
> -			break;
> -	}
> -	return -EINVAL;
> -}
> +int _utf8_normalize(const struct unicode_map *um, const struct qstr *str,
> +		    unsigned char *dest, size_t dlen)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_casefold);
>  
> -int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
> -			  struct qstr *str)
> -{
> -	const struct utf8data *data = utf8nfdicf(um->version);
> -	struct utf8cursor cur;
> -	int c;
> -	unsigned long hash = init_name_hash(salt);
> -
> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
> -		return -EINVAL;
> -
> -	while ((c = utf8byte(&cur))) {
> -		if (c < 0)
> -			return -EINVAL;
> -		hash = partial_name_hash((unsigned char)c, hash);
> -	}
> -	str->hash = end_name_hash(hash);
> -	return 0;
> -}
> +int _utf8_casefold(const struct unicode_map *um, const struct qstr *str,
> +		   unsigned char *dest, size_t dlen)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_casefold_hash);
>  
> -int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
> -		      unsigned char *dest, size_t dlen)
> -{
> -	const struct utf8data *data = utf8nfdi(um->version);
> -	struct utf8cursor cur;
> -	ssize_t nlen = 0;
> -
> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
> -		return -EINVAL;
> -
> -	for (nlen = 0; nlen < dlen; nlen++) {
> -		int c = utf8byte(&cur);
> -
> -		dest[nlen] = c;
> -		if (!c)
> -			return nlen;
> -		if (c == -1)
> -			break;
> -	}
> -	return -EINVAL;
> -}
> +int _utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> +			struct qstr *str)
> +{
> +	return 0;
> +}
> +
> +struct unicode_map *_utf8_load(const char *version)
> +{
> +	return NULL;
> +}
> -EXPORT_SYMBOL(unicode_normalize);
>  
> -static int unicode_parse_version(const char *version, unsigned int *maj,
> -				 unsigned int *min, unsigned int *rev)
> -{
> -	substring_t args[3];
> -	char version_string[12];
> -	static const struct match_token token[] = {
> -		{1, "%d.%d.%d"},
> -		{0, NULL}
> -	};
> -
> -	int ret = strscpy(version_string, version, sizeof(version_string));
> -
> -	if (ret < 0)
> -		return ret;
> -
> -	if (match_token(version_string, token, args) != 1)
> -		return -EINVAL;
> -
> -	if (match_int(&args[0], maj) || match_int(&args[1], min) ||
> -	    match_int(&args[2], rev))
> -		return -EINVAL;
> -
> -	return 0;
> -}
> +void _utf8_unload(struct unicode_map *um)
> +{
> +	return;
> +}
> +
> +DEFINE_STATIC_CALL(utf8_validate, _utf8_validate);
> +DEFINE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
> +DEFINE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
> +DEFINE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
> +DEFINE_STATIC_CALL(utf8_normalize, _utf8_normalize);
> +DEFINE_STATIC_CALL(utf8_casefold, _utf8_casefold);
> +DEFINE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
> +DEFINE_STATIC_CALL(utf8_load, _utf8_load);
> +DEFINE_STATIC_CALL_NULL(utf8_unload, _utf8_unload);
> +EXPORT_STATIC_CALL(utf8_strncmp);
> +EXPORT_STATIC_CALL(utf8_strncasecmp);
> +EXPORT_STATIC_CALL(utf8_strncasecmp_folded);

I'm having a hard time understanding why some use
DEFINE_STATIC_CALL_NULL, while others use DEFINE_STATIC_CALL.  This
static call API is new to me :).  None of this can be called if the
module is not loaded anyway, so perhaps the default function can just be
NULL, per the documentation in include/linux/static_call.h?

Anyway, aren't utf8_{validate,casefold,normalize} missing the
equivalent EXPORT_STATIC_CALL?

> +
> +static int unicode_load_module(void)
> +{
> +	int ret = request_module("utf8");
> +
> +	if (ret) {
> +		pr_err("Failed to load UTF-8 module\n");
> +		return ret;
> +	}
> +	return 0;
> +}
>  
>  struct unicode_map *unicode_load(const char *version)
> -{
> -	struct unicode_map *um = NULL;
> -	int unicode_version;
> -
> -	if (version) {
> -		unsigned int maj, min, rev;
> -
> -		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
> -			return ERR_PTR(-EINVAL);
> -
> -		if (!utf8version_is_supported(maj, min, rev))
> -			return ERR_PTR(-EINVAL);
> -
> -		unicode_version = UNICODE_AGE(maj, min, rev);
> -	} else {
> -		unicode_version = utf8version_latest();
> -		printk(KERN_WARNING"UTF-8 version not specified. "
> -		       "Assuming latest supported version (%d.%d.%d).",
> -		       (unicode_version >> 16) & 0xff,
> -		       (unicode_version >> 8) & 0xff,
> -		       (unicode_version & 0xff));
> -	}
> -
> -	um = kzalloc(sizeof(struct unicode_map), GFP_KERNEL);
> -	if (!um)
> -		return ERR_PTR(-ENOMEM);
> -
> -	um->charset = "UTF-8";
> -	um->version = unicode_version;
> -
> -	return um;
> -}
> +{
> +	int ret = unicode_load_module();
> +
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	spin_lock(&utf8ops_lock);
> +	if (!utf8_ops || !try_module_get(utf8_ops->owner)) {
> +		spin_unlock(&utf8ops_lock);
> +		return ERR_PTR(-ENODEV);
> +	} else {
> +		spin_unlock(&utf8ops_lock);
> +		return static_call(utf8_load)(version);
> +	}
> +}
>  EXPORT_SYMBOL(unicode_load);
>  
>  void unicode_unload(struct unicode_map *um)
>  {
> -	kfree(um);
> +	if (WARN_ON(!utf8_ops))
> +		return;
> +
> +	module_put(utf8_ops->owner);
> +	static_call(utf8_unload)(um);

The module reference drop should happen after utf8_unload.  Otherwise,
if you race with module removal, utf8_unload could be called after the
module is already gone.
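
The ordering concern can be modelled in userspace.  The sketch below is not
the kernel module API: module_sketch, map_sketch and the sketch_* functions
are hypothetical names, and the "text_present" flag is an assumption used to
represent whether the module's code may still be called.  It only shows the
order argued for above: call into the module first, drop the reference last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model of a refcounted module. */
struct module_sketch {
	int refcnt;
	bool text_present;	/* false once removal is allowed to proceed */
};

/* Stand-in for struct unicode_map. */
struct map_sketch {
	int version;
};

/* Models module_put(): once the last reference goes, removal may proceed. */
static void sketch_module_put(struct module_sketch *mod)
{
	assert(mod->refcnt > 0);
	if (--mod->refcnt == 0)
		mod->text_present = false;
}

/*
 * Models utf8_unload(), i.e. code that lives inside the module.
 * Returns false if it was reached after the module could have been
 * removed, which is the situation the review comment warns about.
 */
static bool sketch_backend_unload(struct module_sketch *mod,
				  struct map_sketch *um)
{
	if (!mod->text_present)
		return false;
	free(um);
	return true;
}

/* Suggested order: use the module while it is still pinned, then unpin. */
static bool sketch_unload(struct module_sketch *mod, struct map_sketch *um)
{
	bool ok = sketch_backend_unload(mod, um);

	sketch_module_put(mod);
	return ok;
}
```

Swapping the two statements in sketch_unload() would let the reference
count reach zero before the backend is called, which is the window the
comment above describes.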

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [f2fs-dev] [PATCH v3 5/5] fs: unicode: Add utf8 module and a unicode layer
@ 2021-03-23 19:51     ` Gabriel Krisman Bertazi
  0 siblings, 0 replies; 28+ messages in thread
From: Gabriel Krisman Bertazi @ 2021-03-23 19:51 UTC (permalink / raw)
  To: Shreeya Patel
  Cc: tytso, drosen, ebiggers, linux-kernel, linux-f2fs-devel,
	ebiggers, kernel, adilger.kernel, linux-fsdevel, jaegeuk,
	andre.almeida, linux-ext4

Shreeya Patel <shreeya.patel@collabora.com> writes:

> utf8data.h_shipped has a large database table which is an auto-generated
> decodification trie for the unicode normalization functions.
> It is not necessary to load this large table in the kernel if no
> file system is using it, hence make UTF-8 encoding loadable by converting
> it into a module.
> Modify the file called unicode-core which will act as a layer for
> unicode subsystem. It will load the UTF-8 module and access its functions
> whenever any filesystem that needs unicode is mounted.
> Also, indirect calls using function pointers are easily exploitable by
> speculative execution attacks, hence use static_call() in unicode.h and
> unicode-core.c files in order to prevent these attacks by making direct
> calls and also to improve the performance of function pointers.
>

This static call mechanism is indeed really interesting.  Thanks for
doing it.  A few comments inline.

> ---
>
> Changes in v3
>   - Correct the conditions to prevent NULL pointer dereference while
>     accessing functions via utf8_ops variable.
>   - Add spinlock to avoid race conditions that could occur if the module
>     is deregistered after checking utf8_ops and before doing the
>     try_module_get() in the following if condition
>     if (!utf8_ops || !try_module_get(utf8_ops->owner)
>   - Use static_call() for preventing speculative execution attacks.
>   - WARN_ON in case utf8_ops is NULL in unicode_unload().
>   - Rename module file from utf8mod to unicode-utf8.
>
> Changes in v2
>   - Remove the duplicate file utf8-core.c
>   - Make the wrapper functions inline.
>   - Remove msleep and use try_module_get() and module_put()
>     for ensuring that module is loaded correctly and also
>     doesn't get unloaded while in use.
>
>  fs/unicode/Kconfig        |  11 +-
>  fs/unicode/Makefile       |   5 +-
>  fs/unicode/unicode-core.c | 268 +++++++++++++-------------------------
>  fs/unicode/unicode-utf8.c | 255 ++++++++++++++++++++++++++++++++++++
>  include/linux/unicode.h   |  99 ++++++++++++--
>  5 files changed, 441 insertions(+), 197 deletions(-)
>  create mode 100644 fs/unicode/unicode-utf8.c
>
> diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig
> index 2c27b9a5c..2961b0206 100644
> --- a/fs/unicode/Kconfig
> +++ b/fs/unicode/Kconfig
> @@ -8,7 +8,16 @@ config UNICODE
>  	  Say Y here to enable UTF-8 NFD normalization and NFD+CF casefolding
>  	  support.
>  
> +# UTF-8 encoding can be compiled as a module using UNICODE_UTF8 option.
> +# Having UTF-8 encoding as a module will avoid carrying large
> +# database table present in utf8data.h_shipped into the kernel
> +# by being able to load it only when it is required by the filesystem.
> +config UNICODE_UTF8
> +	tristate "UTF-8 module"
> +	depends on UNICODE
> +	default m
> +
>  config UNICODE_NORMALIZATION_SELFTEST
>  	tristate "Test UTF-8 normalization support"
> -	depends on UNICODE
> +	depends on UNICODE_UTF8
>  	default n
> --- a/fs/unicode/Makefile
> +++ b/fs/unicode/Makefile
> @@ -1,11 +1,14 @@
>  # SPDX-License-Identifier: GPL-2.0
>  
>  obj-$(CONFIG_UNICODE) += unicode.o
> +obj-$(CONFIG_UNICODE_UTF8) += utf8.o
>  obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
>  
> -unicode-y := utf8-norm.o unicode-core.o
> +unicode-y := unicode-core.o
> +utf8-y := unicode-utf8.o utf8-norm.o
>  
>  $(obj)/utf8-norm.o: $(obj)/utf8data.h
> +$(obj)/unicode-utf8.o: $(obj)/utf8-norm.o
>  
>  # In the normal build, the checked-in utf8data.h is just shipped.
>  #
> --- a/fs/unicode/unicode-core.c
> +++ b/fs/unicode/unicode-core.c
> @@ -1,238 +1,144 @@
>  /* SPDX-License-Identifier: GPL-2.0 */
>  #include <linux/module.h>
>  #include <linux/kernel.h>
> -#include <linux/string.h>
>  #include <linux/slab.h>
> -#include <linux/parser.h>
>  #include <linux/errno.h>
>  #include <linux/unicode.h>
> -#include <linux/stringhash.h>
> +#include <linux/spinlock.h>
>  
> -#include "utf8n.h"
> +DEFINE_SPINLOCK(utf8ops_lock);
>  
> -int unicode_validate(const struct unicode_map *um, const struct qstr *str)
> -{
> -	const struct utf8data *data = utf8nfdi(um->version);
> -
> -	if (utf8nlen(data, str->name, str->len) < 0)
> -		return -1;
> -	return 0;
> -}
> +struct unicode_ops *utf8_ops;
> +EXPORT_SYMBOL(utf8_ops);
> +
> +int _utf8_validate(const struct unicode_map *um, const struct qstr *str)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_validate);

I think that any calls to the default static calls should return errors
instead of succeeding without doing anything.

In fact, are the default calls really necessary?  If someone gets here,
there is a bug elsewhere, so WARN_ON and maybe -EIO.  

int unicode_validate_default_static_call(...)
{
   WARN_ON(1);
   return -EIO;
}

Or just have a NULL default, as I mentioned below, if that is possible.

Eric?
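
A minimal userspace sketch of the "default target returns an error" idea
above. It assumes a plain function pointer as a stand-in for the static
call, and fprintf() as a stand-in for WARN_ON(); the names
validate_default, validate_target, set_validate_target and
unicode_validate_model are hypothetical and not part of the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Opaque stand-ins for the kernel types; only pointers are passed around. */
struct unicode_map;
struct qstr;

typedef int (*validate_fn)(const struct unicode_map *um,
			   const struct qstr *str);

/* Counts how often the default target was reached (illustrative only). */
static int default_hits;

/* Default target: warn once, then report an error instead of success. */
static int validate_default(const struct unicode_map *um,
			    const struct qstr *str)
{
	(void)um;
	(void)str;
	if (default_hits++ == 0)
		fprintf(stderr, "WARN: utf8 validate called without module\n");
	return -EIO;
}

/* Plain function pointer standing in for the static call target. */
static validate_fn validate_target = validate_default;

/* Stand-in for retargeting the call when the module registers. */
static void set_validate_target(validate_fn fn)
{
	assert(fn != NULL);
	validate_target = fn;
}

/* Hypothetical module implementation that accepts every string. */
static int validate_accept_all(const struct unicode_map *um,
			       const struct qstr *str)
{
	(void)um;
	(void)str;
	return 0;
}

/* Wrapper a filesystem would call; it never succeeds silently by default. */
static int unicode_validate_model(const struct unicode_map *um,
				  const struct qstr *str)
{
	return validate_target(um, str);
}
```

With this shape, a caller that reaches the wrapper before the module is
loaded sees -EIO rather than a false success.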

> -int unicode_strncmp(const struct unicode_map *um,
> -		    const struct qstr *s1, const struct qstr *s2)
> -{
> -	const struct utf8data *data = utf8nfdi(um->version);
> -	struct utf8cursor cur1, cur2;
> -	int c1, c2;
> -
> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
> -		return -EINVAL;
> -
> -	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
> -		return -EINVAL;
> -
> -	do {
> -		c1 = utf8byte(&cur1);
> -		c2 = utf8byte(&cur2);
> -
> -		if (c1 < 0 || c2 < 0)
> -			return -EINVAL;
> -		if (c1 != c2)
> -			return 1;
> -	} while (c1);
> -
> -	return 0;
> -}
> +int _utf8_strncmp(const struct unicode_map *um, const struct qstr *s1,
> +		  const struct qstr *s2)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_strncmp);
>  
> -int unicode_strncasecmp(const struct unicode_map *um,
> -			const struct qstr *s1, const struct qstr *s2)
> -{
> -	const struct utf8data *data = utf8nfdicf(um->version);
> -	struct utf8cursor cur1, cur2;
> -	int c1, c2;
> -
> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
> -		return -EINVAL;
> -
> -	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
> -		return -EINVAL;
> -
> -	do {
> -		c1 = utf8byte(&cur1);
> -		c2 = utf8byte(&cur2);
> -
> -		if (c1 < 0 || c2 < 0)
> -			return -EINVAL;
> -		if (c1 != c2)
> -			return 1;
> -	} while (c1);
> -
> -	return 0;
> -}
> +int _utf8_strncasecmp(const struct unicode_map *um, const struct qstr *s1,
> +		      const struct qstr *s2)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_strncasecmp);
>  
> -/* String cf is expected to be a valid UTF-8 casefolded
> - * string.
> - */
> -int unicode_strncasecmp_folded(const struct unicode_map *um,
> -			       const struct qstr *cf,
> -			       const struct qstr *s1)
> -{
> -	const struct utf8data *data = utf8nfdicf(um->version);
> -	struct utf8cursor cur1;
> -	int c1, c2;
> -	int i = 0;
> -
> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
> -		return -EINVAL;
> -
> -	do {
> -		c1 = utf8byte(&cur1);
> -		c2 = cf->name[i++];
> -		if (c1 < 0)
> -			return -EINVAL;
> -		if (c1 != c2)
> -			return 1;
> -	} while (c1);
> -
> -	return 0;
> -}
> +int _utf8_strncasecmp_folded(const struct unicode_map *um,
> +			     const struct qstr *cf, const struct qstr *s1)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_strncasecmp_folded);
>  
> -int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
> -		     unsigned char *dest, size_t dlen)
> -{
> -	const struct utf8data *data = utf8nfdicf(um->version);
> -	struct utf8cursor cur;
> -	size_t nlen = 0;
> -
> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
> -		return -EINVAL;
> -
> -	for (nlen = 0; nlen < dlen; nlen++) {
> -		int c = utf8byte(&cur);
> -
> -		dest[nlen] = c;
> -		if (!c)
> -			return nlen;
> -		if (c == -1)
> -			break;
> -	}
> -	return -EINVAL;
> -}
> +int _utf8_normalize(const struct unicode_map *um, const struct qstr *str,
> +		    unsigned char *dest, size_t dlen)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_casefold);
>  
> -int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
> -			  struct qstr *str)
> -{
> -	const struct utf8data *data = utf8nfdicf(um->version);
> -	struct utf8cursor cur;
> -	int c;
> -	unsigned long hash = init_name_hash(salt);
> -
> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
> -		return -EINVAL;
> -
> -	while ((c = utf8byte(&cur))) {
> -		if (c < 0)
> -			return -EINVAL;
> -		hash = partial_name_hash((unsigned char)c, hash);
> -	}
> -	str->hash = end_name_hash(hash);
> -	return 0;
> -}
> +int _utf8_casefold(const struct unicode_map *um, const struct qstr *str,
> +		   unsigned char *dest, size_t dlen)
> +{
> +	return 0;
> +}
> -EXPORT_SYMBOL(unicode_casefold_hash);
>  
> -int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
> -		      unsigned char *dest, size_t dlen)
> -{
> -	const struct utf8data *data = utf8nfdi(um->version);
> -	struct utf8cursor cur;
> -	ssize_t nlen = 0;
> -
> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
> -		return -EINVAL;
> -
> -	for (nlen = 0; nlen < dlen; nlen++) {
> -		int c = utf8byte(&cur);
> -
> -		dest[nlen] = c;
> -		if (!c)
> -			return nlen;
> -		if (c == -1)
> -			break;
> -	}
> -	return -EINVAL;
> -}
> +int _utf8_casefold_hash(const struct unicode_map *um, const void *salt,
> +			struct qstr *str)
> +{
> +	return 0;
> +}
> +
> +struct unicode_map *_utf8_load(const char *version)
> +{
> +	return NULL;
> +}
> -EXPORT_SYMBOL(unicode_normalize);
>  
> -static int unicode_parse_version(const char *version, unsigned int *maj,
> -				 unsigned int *min, unsigned int *rev)
> -{
> -	substring_t args[3];
> -	char version_string[12];
> -	static const struct match_token token[] = {
> -		{1, "%d.%d.%d"},
> -		{0, NULL}
> -	};
> -
> -	int ret = strscpy(version_string, version, sizeof(version_string));
> -
> -	if (ret < 0)
> -		return ret;
> -
> -	if (match_token(version_string, token, args) != 1)
> -		return -EINVAL;
> -
> -	if (match_int(&args[0], maj) || match_int(&args[1], min) ||
> -	    match_int(&args[2], rev))
> -		return -EINVAL;
> -
> -	return 0;
> -}
> +void _utf8_unload(struct unicode_map *um)
> +{
> +	return;
> +}
> +
> +DEFINE_STATIC_CALL(utf8_validate, _utf8_validate);
> +DEFINE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
> +DEFINE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
> +DEFINE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
> +DEFINE_STATIC_CALL(utf8_normalize, _utf8_normalize);
> +DEFINE_STATIC_CALL(utf8_casefold, _utf8_casefold);
> +DEFINE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
> +DEFINE_STATIC_CALL(utf8_load, _utf8_load);
> +DEFINE_STATIC_CALL_NULL(utf8_unload, _utf8_unload);
> +EXPORT_STATIC_CALL(utf8_strncmp);
> +EXPORT_STATIC_CALL(utf8_strncasecmp);
> +EXPORT_STATIC_CALL(utf8_strncasecmp_folded);

I'm having a hard time understanding why some use
DEFINE_STATIC_CALL_NULL, while others use DEFINE_STATIC_CALL.  This
static call API is new to me :).  None of this can be called if the
module is not loaded anyway, so perhaps the default function can just be
NULL, per the documentation of include/linux/static_call.h?

Anyway, aren't utf8_{validate,casefold,normalize} missing the
equivalent EXPORT_STATIC_CALL?

> +
> +static int unicode_load_module(void)
> +{
> +	int ret = request_module("utf8");
> +
> +	if (ret) {
> +		pr_err("Failed to load UTF-8 module\n");
> +		return ret;
> +	}
> +	return 0;
> +}
>  
>  struct unicode_map *unicode_load(const char *version)
> -{
> -	struct unicode_map *um = NULL;
> -	int unicode_version;
> -
> -	if (version) {
> -		unsigned int maj, min, rev;
> -
> -		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
> -			return ERR_PTR(-EINVAL);
> -
> -		if (!utf8version_is_supported(maj, min, rev))
> -			return ERR_PTR(-EINVAL);
> -
> -		unicode_version = UNICODE_AGE(maj, min, rev);
> -	} else {
> -		unicode_version = utf8version_latest();
> -		printk(KERN_WARNING"UTF-8 version not specified. "
> -		       "Assuming latest supported version (%d.%d.%d).",
> -		       (unicode_version >> 16) & 0xff,
> -		       (unicode_version >> 8) & 0xff,
> -		       (unicode_version & 0xff));
> -	}
> -
> -	um = kzalloc(sizeof(struct unicode_map), GFP_KERNEL);
> -	if (!um)
> -		return ERR_PTR(-ENOMEM);
> -
> -	um->charset = "UTF-8";
> -	um->version = unicode_version;
> -
> -	return um;
> -}
> +{
> +	int ret = unicode_load_module();
> +
> +	if (ret)
> +		return ERR_PTR(ret);
> +
> +	spin_lock(&utf8ops_lock);
> +	if (!utf8_ops || !try_module_get(utf8_ops->owner)) {
> +		spin_unlock(&utf8ops_lock);
> +		return ERR_PTR(-ENODEV);
> +	} else {
> +		spin_unlock(&utf8ops_lock);
> +		return static_call(utf8_load)(version);
> +	}
> +}
>  EXPORT_SYMBOL(unicode_load);
>  
>  void unicode_unload(struct unicode_map *um)
>  {
> -	kfree(um);
> +	if (WARN_ON(!utf8_ops))
> +		return;
> +
> +	module_put(utf8_ops->owner);
> +	static_call(utf8_unload)(um);

The module reference drop should happen after utf8_unload to prevent
calling utf8_unload after it is removed if you race with module removal.
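
A small userspace model of the ordering suggested above: call into the
module first, drop the reference last. It assumes a toy "module" that
disappears as soon as its reference count reaches zero; struct
module_model, module_put_model and the other *_model names are
hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

struct unicode_map {
	int loaded;
};

/* Toy module: a reference count plus a flag saying its code still exists. */
struct module_model {
	int refcount;
	int text_present;
};

static struct module_model utf8_mod = { .refcount = 1, .text_present = 1 };

/* Module-provided unload; only safe while the module code is present. */
static void utf8_unload_model(struct unicode_map *um)
{
	assert(utf8_mod.text_present);
	um->loaded = 0;
}

/* Dropping the last reference lets the module go away (modelled as instant). */
static void module_put_model(struct module_model *mod)
{
	if (--mod->refcount == 0)
		mod->text_present = 0;
}

/* Suggested order: use the module, then release it. */
static void unicode_unload_model(struct unicode_map *um)
{
	utf8_unload_model(um);
	module_put_model(&utf8_mod);
}
```

Swapping the two calls in unicode_unload_model() would trip the assert in
this model, which mirrors the race described above.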

-- 
Gabriel Krisman Bertazi


_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 5/5] fs: unicode: Add utf8 module and a unicode layer
  2021-03-23 19:51     ` [f2fs-dev] " Gabriel Krisman Bertazi
@ 2021-03-23 20:29       ` Eric Biggers
  -1 siblings, 0 replies; 28+ messages in thread
From: Eric Biggers @ 2021-03-23 20:29 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: Shreeya Patel, tytso, adilger.kernel, jaegeuk, chao, drosen,
	yuchao0, linux-ext4, linux-kernel, linux-f2fs-devel,
	linux-fsdevel, kernel, andre.almeida

On Tue, Mar 23, 2021 at 03:51:44PM -0400, Gabriel Krisman Bertazi wrote:
> > -int unicode_validate(const struct unicode_map *um, const struct qstr *str)
> > -{
> > -	const struct utf8data *data = utf8nfdi(um->version);
> > -
> > -	if (utf8nlen(data, str->name, str->len) < 0)
> > -		return -1;
> > -	return 0;
> > -}
> > +struct unicode_ops *utf8_ops;
> > +EXPORT_SYMBOL(utf8_ops);
> > +
> > +int _utf8_validate(const struct unicode_map *um, const struct qstr *str)
> > +{
> > +	return 0;
> > +}
> > -EXPORT_SYMBOL(unicode_validate);
> 
> I think that any calls to the default static calls should return errors
> instead of succeeding without doing anything.
> 
> In fact, are the default calls really necessary?  If someone gets here,
> there is a bug elsewhere, so WARN_ON and maybe -EIO.  
> 
> int unicode_validate_default_static_call(...)
> {
>    WARN_ON(1);
>    return -EIO;
> }
> 
> Or just have a NULL default, as I mentioned below, if that is possible.
> 
[...]
> > +DEFINE_STATIC_CALL(utf8_validate, _utf8_validate);
> > +DEFINE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
> > +DEFINE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
> > +DEFINE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
> > +DEFINE_STATIC_CALL(utf8_normalize, _utf8_normalize);
> > +DEFINE_STATIC_CALL(utf8_casefold, _utf8_casefold);
> > +DEFINE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
> > +DEFINE_STATIC_CALL(utf8_load, _utf8_load);
> > +DEFINE_STATIC_CALL_NULL(utf8_unload, _utf8_unload);
> > +EXPORT_STATIC_CALL(utf8_strncmp);
> > +EXPORT_STATIC_CALL(utf8_strncasecmp);
> > +EXPORT_STATIC_CALL(utf8_strncasecmp_folded);
> 
> I'm having a hard time understanding why some use
> DEFINE_STATIC_CALL_NULL, while others use DEFINE_STATIC_CALL.  This
> static call API is new to me :).  None of this can be called if the
> module is not loaded anyway, so perhaps the default function can just be
> NULL, per the documentation of include/linux/static_call.h?
> 
> Anyway, aren't utf8_{validate,casefold,normalize} missing the
> equivalent EXPORT_STATIC_CALL?
> 

The static_call API is fairly new to me too.  But the intent of this patch seems
to be that none of the utf8 functions are called without the utf8 module loaded.
If they are called, it's a kernel bug.  So there are two options for what to do
if it happens anyway:

  1. call a "null" static call, which does nothing

*or*

  2. call a default function which does WARN_ON_ONCE() and returns an error if
     possible.

(or 3. don't use static calls and instead dereference a NULL utf8_ops like
previous versions of this patch did.)

It shouldn't really matter which of these approaches you take, but please be
consistent and use the same one everywhere.

> + void unicode_unregister(void)
> + {
> +         spin_lock(&utf8ops_lock);
> +         utf8_ops = NULL;
> +         spin_unlock(&utf8ops_lock);
> + }
> + EXPORT_SYMBOL(unicode_unregister);

This should restore the static calls to their default values (either NULL or the
default functions, depending on what you decide).

Also, it's weird to still have the utf8_ops structure when using static calls.
It seems it should be one way or the other: static calls *or* utf8_ops.

The static calls could be exported, and the module could be responsible for
updating them.  That would eliminate the need for utf8_ops.
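
A rough userspace sketch of that last idea, assuming a plain function
pointer in place of an exported static call and an update_target() helper
in place of the kernel's update mechanism. All names here are
hypothetical; the point is only that the module installs its function on
init and restores the default on exit, with no ops structure involved.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

typedef int (*strncmp_fn)(const char *s1, const char *s2);

/* Default target used whenever no encoding module is registered. */
static int strncmp_default(const char *s1, const char *s2)
{
	(void)s1;
	(void)s2;
	return -EIO;
}

/* Stand-in for the exported call target owned by the core layer. */
static strncmp_fn utf8_strncmp_target = strncmp_default;

/* Stand-in for retargeting the call. */
static void update_target(strncmp_fn fn)
{
	assert(fn != NULL);
	utf8_strncmp_target = fn;
}

/* Module side: 0 when equal, 1 when different, as in the patch. */
static int utf8_strncmp_impl(const char *s1, const char *s2)
{
	return strcmp(s1, s2) != 0;
}

/* The module installs its function when it is loaded... */
static void utf8_module_init_model(void)
{
	update_target(utf8_strncmp_impl);
}

/* ...and puts the default back when it is removed. */
static void utf8_module_exit_model(void)
{
	update_target(strncmp_default);
}
```

After utf8_module_exit_model() runs, callers get the default behaviour
again instead of jumping into code that no longer exists.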

- Eric

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 5/5] fs: unicode: Add utf8 module and a unicode layer
  2021-03-23 19:51     ` [f2fs-dev] " Gabriel Krisman Bertazi
@ 2021-03-23 22:12       ` Shreeya Patel
  -1 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 22:12 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: tytso, adilger.kernel, jaegeuk, chao, ebiggers, drosen, ebiggers,
	yuchao0, linux-ext4, linux-kernel, linux-f2fs-devel,
	linux-fsdevel, kernel, andre.almeida


On 24/03/21 1:21 am, Gabriel Krisman Bertazi wrote:
> Shreeya Patel <shreeya.patel@collabora.com> writes:
>
>> utf8data.h_shipped has a large database table which is an auto-generated
>> decodification trie for the unicode normalization functions.
>> It is not necessary to load this large table in the kernel if no
>> file system is using it, hence make UTF-8 encoding loadable by converting
>> it into a module.
>> Modify the file called unicode-core which will act as a layer for
>> unicode subsystem. It will load the UTF-8 module and access its functions
>> whenever any filesystem that needs unicode is mounted.
>> Also, indirect calls using function pointers are easily exploitable by
>> speculative execution attacks, hence use static_call() in unicode.h and
>> unicode-core.c files in order to prevent these attacks by making direct
>> calls and also to improve the performance of function pointers.
>>
> This static call mechanism is indeed really interesting.  Thanks for
> doing it.  A few comments inline
>
>> ---
>>
>> Changes in v3
>>    - Correct the conditions to prevent NULL pointer dereference while
>>      accessing functions via utf8_ops variable.
>>    - Add spinlock to avoid race conditions that could occur if the module
>>      is deregistered after checking utf8_ops and before doing the
>>      try_module_get() in the following if condition
>>      if (!utf8_ops || !try_module_get(utf8_ops->owner)
>>    - Use static_call() for preventing speculative execution attacks.
>>    - WARN_ON in case utf8_ops is NULL in unicode_unload().
>>    - Rename module file from utf8mod to unicode-utf8.
>>
>> Changes in v2
>>    - Remove the duplicate file utf8-core.c
>>    - Make the wrapper functions inline.
>>    - Remove msleep and use try_module_get() and module_put()
>>      for ensuring that module is loaded correctly and also
>>      doesn't get unloaded while in use.
>>
>>   fs/unicode/Kconfig        |  11 +-
>>   fs/unicode/Makefile       |   5 +-
>>   fs/unicode/unicode-core.c | 268 +++++++++++++-------------------------
>>   fs/unicode/unicode-utf8.c | 255 ++++++++++++++++++++++++++++++++++++
>>   include/linux/unicode.h   |  99 ++++++++++++--
>>   5 files changed, 441 insertions(+), 197 deletions(-)
>>   create mode 100644 fs/unicode/unicode-utf8.c
>>
>> diff --git a/fs/unicode/Kconfig b/fs/unicode/Kconfig
>> index 2c27b9a5c..2961b0206 100644
>> --- a/fs/unicode/Kconfig
>> +++ b/fs/unicode/Kconfig
>> @@ -8,7 +8,16 @@ config UNICODE
>>   	  Say Y here to enable UTF-8 NFD normalization and NFD+CF casefolding
>>   	  support.
>>   
>> +# UTF-8 encoding can be compiled as a module using UNICODE_UTF8 option.
>> +# Having UTF-8 encoding as a module will avoid carrying large
>> +# database table present in utf8data.h_shipped into the kernel
>> +# by being able to load it only when it is required by the filesystem.
>> +config UNICODE_UTF8
>> +	tristate "UTF-8 module"
>> +	depends on UNICODE
>> +	default m
>> +
>>   config UNICODE_NORMALIZATION_SELFTEST
>>   	tristate "Test UTF-8 normalization support"
>> -	depends on UNICODE
>> +	depends on UNICODE_UTF8
>>   	default n
>> --- a/fs/unicode/Makefile
>> +++ b/fs/unicode/Makefile
>> @@ -1,11 +1,14 @@
>>   # SPDX-License-Identifier: GPL-2.0
>>   
>>   obj-$(CONFIG_UNICODE) += unicode.o
>> +obj-$(CONFIG_UNICODE_UTF8) += utf8.o
>>   obj-$(CONFIG_UNICODE_NORMALIZATION_SELFTEST) += utf8-selftest.o
>>   
>> -unicode-y := utf8-norm.o unicode-core.o
>> +unicode-y := unicode-core.o
>> +utf8-y := unicode-utf8.o utf8-norm.o
>>   
>>   $(obj)/utf8-norm.o: $(obj)/utf8data.h
>> +$(obj)/unicode-utf8.o: $(obj)/utf8-norm.o
>>   
>>   # In the normal build, the checked-in utf8data.h is just shipped.
>>   #
>> --- a/fs/unicode/unicode-core.c
>> +++ b/fs/unicode/unicode-core.c
>> @@ -1,238 +1,144 @@
>>   /* SPDX-License-Identifier: GPL-2.0 */
>>   #include <linux/module.h>
>>   #include <linux/kernel.h>
>> -#include <linux/string.h>
>>   #include <linux/slab.h>
>> -#include <linux/parser.h>
>>   #include <linux/errno.h>
>>   #include <linux/unicode.h>
>> -#include <linux/stringhash.h>
>> +#include <linux/spinlock.h>
>>   
>> -#include "utf8n.h"
>> +DEFINE_SPINLOCK(utf8ops_lock);
>>   
>> -int unicode_validate(const struct unicode_map *um, const struct qstr *str)
>> -{
>> -	const struct utf8data *data = utf8nfdi(um->version);
>> -
>> -	if (utf8nlen(data, str->name, str->len) < 0)
>> -		return -1;
>> -	return 0;
>> -}
>> +struct unicode_ops *utf8_ops;
>> +EXPORT_SYMBOL(utf8_ops);
>> +
>> +int _utf8_validate(const struct unicode_map *um, const struct qstr *str)
>> +{
>> +	return 0;
>> +}
>> -EXPORT_SYMBOL(unicode_validate);
> I think that any calls to the default static calls should return errors
> instead of succeeding without doing anything.
>
> In fact, are the default calls really necessary?


I used DEFINE_STATIC_CALL() for the functions that have a non-void return
type. It isn't possible to return nothing from those, hence I had to use
return 0. But as you and Eric said, succeeding without doing anything
doesn't seem right, so I'll use DEFINE_STATIC_CALL_NULL(), which would
allow me to return nothing.
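
One possible shape for a NULL default on a call that returns a value,
sketched in userspace. This is an assumption about how the caller side
could look, not what the next version of the patch does: the wrapper
checks the target and reports -ENODEV itself, so an unset target never
produces a result by accident. The names utf8_casefold_target,
unicode_casefold_model and ascii_casefold are hypothetical, and the
ASCII-only fold is just a placeholder for the real implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stddef.h>

typedef int (*casefold_fn)(const char *src, char *dest, size_t dlen);

/* NULL default: nothing is installed until a module registers. */
static casefold_fn utf8_casefold_target = NULL;

/* The wrapper supplies the error itself when no target is installed. */
static int unicode_casefold_model(const char *src, char *dest, size_t dlen)
{
	if (!utf8_casefold_target)
		return -ENODEV;
	return utf8_casefold_target(src, dest, dlen);
}

/* Placeholder implementation: lowercases ASCII, returns the folded length. */
static int ascii_casefold(const char *src, char *dest, size_t dlen)
{
	size_t n;

	assert(src != NULL && dest != NULL);
	for (n = 0; n < dlen; n++) {
		unsigned char c = (unsigned char)src[n];

		dest[n] = (char)tolower(c);
		if (!c)
			return (int)n;
	}
	return -EINVAL;	/* destination buffer too small */
}
```

The explicit check costs a branch in the wrapper, which is the trade-off
against a default function that returns the error on its own.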


>    If someone gets here,
> there is a bug elsewhere, so WARN_ON and maybe -EIO.
>
> int unicode_validate_default_static_call(...)
> {
>     WARN_ON(1);
>     return -EIO;
> }
>
> Or just have a NULL default, as I mentioned below, if that is possible.
>
> Eric?
>
>> -int unicode_strncmp(const struct unicode_map *um,
>> -		    const struct qstr *s1, const struct qstr *s2)
>> -{
>> -	const struct utf8data *data = utf8nfdi(um->version);
>> -	struct utf8cursor cur1, cur2;
>> -	int c1, c2;
>> -
>> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
>> -		return -EINVAL;
>> -
>> -	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
>> -		return -EINVAL;
>> -
>> -	do {
>> -		c1 = utf8byte(&cur1);
>> -		c2 = utf8byte(&cur2);
>> -
>> -		if (c1 < 0 || c2 < 0)
>> -			return -EINVAL;
>> -		if (c1 != c2)
>> -			return 1;
>> -	} while (c1);
>> -
>> -	return 0;
>> -}
>> +int _utf8_strncmp(const struct unicode_map *um, const struct qstr *s1,
>> +		  const struct qstr *s2)
>> +{
>> +	return 0;
>> +}
>> -EXPORT_SYMBOL(unicode_strncmp);
>>   
>> -int unicode_strncasecmp(const struct unicode_map *um,
>> -			const struct qstr *s1, const struct qstr *s2)
>> -{
>> -	const struct utf8data *data = utf8nfdicf(um->version);
>> -	struct utf8cursor cur1, cur2;
>> -	int c1, c2;
>> -
>> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
>> -		return -EINVAL;
>> -
>> -	if (utf8ncursor(&cur2, data, s2->name, s2->len) < 0)
>> -		return -EINVAL;
>> -
>> -	do {
>> -		c1 = utf8byte(&cur1);
>> -		c2 = utf8byte(&cur2);
>> -
>> -		if (c1 < 0 || c2 < 0)
>> -			return -EINVAL;
>> -		if (c1 != c2)
>> -			return 1;
>> -	} while (c1);
>> -
>> -	return 0;
>> -}
>> +int _utf8_strncasecmp(const struct unicode_map *um, const struct qstr *s1,
>> +		      const struct qstr *s2)
>> +{
>> +	return 0;
>> +}
>> -EXPORT_SYMBOL(unicode_strncasecmp);
>>   
>> -/* String cf is expected to be a valid UTF-8 casefolded
>> - * string.
>> - */
>> -int unicode_strncasecmp_folded(const struct unicode_map *um,
>> -			       const struct qstr *cf,
>> -			       const struct qstr *s1)
>> -{
>> -	const struct utf8data *data = utf8nfdicf(um->version);
>> -	struct utf8cursor cur1;
>> -	int c1, c2;
>> -	int i = 0;
>> -
>> -	if (utf8ncursor(&cur1, data, s1->name, s1->len) < 0)
>> -		return -EINVAL;
>> -
>> -	do {
>> -		c1 = utf8byte(&cur1);
>> -		c2 = cf->name[i++];
>> -		if (c1 < 0)
>> -			return -EINVAL;
>> -		if (c1 != c2)
>> -			return 1;
>> -	} while (c1);
>> -
>> -	return 0;
>> -}
>> +int _utf8_strncasecmp_folded(const struct unicode_map *um,
>> +			     const struct qstr *cf, const struct qstr *s1)
>> +{
>> +	return 0;
>> +}
>> -EXPORT_SYMBOL(unicode_strncasecmp_folded);
>>   
>> -int unicode_casefold(const struct unicode_map *um, const struct qstr *str,
>> -		     unsigned char *dest, size_t dlen)
>> -{
>> -	const struct utf8data *data = utf8nfdicf(um->version);
>> -	struct utf8cursor cur;
>> -	size_t nlen = 0;
>> -
>> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
>> -		return -EINVAL;
>> -
>> -	for (nlen = 0; nlen < dlen; nlen++) {
>> -		int c = utf8byte(&cur);
>> -
>> -		dest[nlen] = c;
>> -		if (!c)
>> -			return nlen;
>> -		if (c == -1)
>> -			break;
>> -	}
>> -	return -EINVAL;
>> -}
>> +int _utf8_normalize(const struct unicode_map *um, const struct qstr *str,
>> +		    unsigned char *dest, size_t dlen)
>> +{
>> +	return 0;
>> +}
>> -EXPORT_SYMBOL(unicode_casefold);
>>   
>> -int unicode_casefold_hash(const struct unicode_map *um, const void *salt,
>> -			  struct qstr *str)
>> -{
>> -	const struct utf8data *data = utf8nfdicf(um->version);
>> -	struct utf8cursor cur;
>> -	int c;
>> -	unsigned long hash = init_name_hash(salt);
>> -
>> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
>> -		return -EINVAL;
>> -
>> -	while ((c = utf8byte(&cur))) {
>> -		if (c < 0)
>> -			return -EINVAL;
>> -		hash = partial_name_hash((unsigned char)c, hash);
>> -	}
>> -	str->hash = end_name_hash(hash);
>> -	return 0;
>> -}
>> +int _utf8_casefold(const struct unicode_map *um, const struct qstr *str,
>> +		   unsigned char *dest, size_t dlen)
>> +{
>> +	return 0;
>> +}
>> -EXPORT_SYMBOL(unicode_casefold_hash);
>>   
>> -int unicode_normalize(const struct unicode_map *um, const struct qstr *str,
>> -		      unsigned char *dest, size_t dlen)
>> -{
>> -	const struct utf8data *data = utf8nfdi(um->version);
>> -	struct utf8cursor cur;
>> -	ssize_t nlen = 0;
>> -
>> -	if (utf8ncursor(&cur, data, str->name, str->len) < 0)
>> -		return -EINVAL;
>> -
>> -	for (nlen = 0; nlen < dlen; nlen++) {
>> -		int c = utf8byte(&cur);
>> -
>> -		dest[nlen] = c;
>> -		if (!c)
>> -			return nlen;
>> -		if (c == -1)
>> -			break;
>> -	}
>> -	return -EINVAL;
>> -}
>> +int _utf8_casefold_hash(const struct unicode_map *um, const void *salt,
>> +			struct qstr *str)
>> +{
>> +	return 0;
>> +}
>> +
>> +struct unicode_map *_utf8_load(const char *version)
>> +{
>> +	return NULL;
>> +}
>> -EXPORT_SYMBOL(unicode_normalize);
>>   
>> -static int unicode_parse_version(const char *version, unsigned int *maj,
>> -				 unsigned int *min, unsigned int *rev)
>> -{
>> -	substring_t args[3];
>> -	char version_string[12];
>> -	static const struct match_token token[] = {
>> -		{1, "%d.%d.%d"},
>> -		{0, NULL}
>> -	};
>> -
>> -	int ret = strscpy(version_string, version, sizeof(version_string));
>> -
>> -	if (ret < 0)
>> -		return ret;
>> -
>> -	if (match_token(version_string, token, args) != 1)
>> -		return -EINVAL;
>> -
>> -	if (match_int(&args[0], maj) || match_int(&args[1], min) ||
>> -	    match_int(&args[2], rev))
>> -		return -EINVAL;
>> -
>> -	return 0;
>> -}
>> +void _utf8_unload(struct unicode_map *um)
>> +{
>> +	return;
>> +}
>> +
>> +DEFINE_STATIC_CALL(utf8_validate, _utf8_validate);
>> +DEFINE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
>> +DEFINE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
>> +DEFINE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
>> +DEFINE_STATIC_CALL(utf8_normalize, _utf8_normalize);
>> +DEFINE_STATIC_CALL(utf8_casefold, _utf8_casefold);
>> +DEFINE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
>> +DEFINE_STATIC_CALL(utf8_load, _utf8_load);
>> +DEFINE_STATIC_CALL_NULL(utf8_unload, _utf8_unload);
>> +EXPORT_STATIC_CALL(utf8_strncmp);
>> +EXPORT_STATIC_CALL(utf8_strncasecmp);
>> +EXPORT_STATIC_CALL(utf8_strncasecmp_folded);
> I'm having a hard time understanding why some use
> DEFINE_STATIC_CALL_NULL, while others use DEFINE_STATIC_CALL.  This
> static call API is new to me :).  None of this can be called if the
> module is not loaded anyway, so perhaps the default function can just be
> NULL, per the documentation of include/linux/static_call.h?
>
> Anyway, aren't utf8_{validate,casefold,normalize} missing the
> equivalent EXPORT_STATIC_CALL?


These functions aren't used by utf8-selftest.c, hence there is no need
to export them.


>> +
>> +static int unicode_load_module(void)
>> +{
>> +	int ret = request_module("utf8");
>> +
>> +	if (ret) {
>> +		pr_err("Failed to load UTF-8 module\n");
>> +		return ret;
>> +	}
>> +	return 0;
>> +}
>>   
>>   struct unicode_map *unicode_load(const char *version)
>> -{
>> -	struct unicode_map *um = NULL;
>> -	int unicode_version;
>> -
>> -	if (version) {
>> -		unsigned int maj, min, rev;
>> -
>> -		if (unicode_parse_version(version, &maj, &min, &rev) < 0)
>> -			return ERR_PTR(-EINVAL);
>> -
>> -		if (!utf8version_is_supported(maj, min, rev))
>> -			return ERR_PTR(-EINVAL);
>> -
>> -		unicode_version = UNICODE_AGE(maj, min, rev);
>> -	} else {
>> -		unicode_version = utf8version_latest();
>> -		printk(KERN_WARNING"UTF-8 version not specified. "
>> -		       "Assuming latest supported version (%d.%d.%d).",
>> -		       (unicode_version >> 16) & 0xff,
>> -		       (unicode_version >> 8) & 0xff,
>> -		       (unicode_version & 0xff));
>> -	}
>> -
>> -	um = kzalloc(sizeof(struct unicode_map), GFP_KERNEL);
>> -	if (!um)
>> -		return ERR_PTR(-ENOMEM);
>> -
>> -	um->charset = "UTF-8";
>> -	um->version = unicode_version;
>> -
>> -	return um;
>> -}
>> +{
>> +	int ret = unicode_load_module();
>> +
>> +	if (ret)
>> +		return ERR_PTR(ret);
>> +
>> +	spin_lock(&utf8ops_lock);
>> +	if (!utf8_ops || !try_module_get(utf8_ops->owner)) {
>> +		spin_unlock(&utf8ops_lock);
>> +		return ERR_PTR(-ENODEV);
>> +	} else {
>> +		spin_unlock(&utf8ops_lock);
>> +		return static_call(utf8_load)(version);
>> +	}
>> +}
>>   EXPORT_SYMBOL(unicode_load);
>>   
>>   void unicode_unload(struct unicode_map *um)
>>   {
>> -	kfree(um);
>> +	if (WARN_ON(!utf8_ops))
>> +		return;
>> +
>> +	module_put(utf8_ops->owner);
>> +	static_call(utf8_unload)(um);
> The module reference should be dropped after calling utf8_unload().
> Otherwise, if you race with module removal, utf8_unload() can be
> called after the module is gone.
>
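
The ordering point above can be illustrated with a minimal userspace
sketch. This is not kernel code. It assumes the worst case of the race,
where the module is removed the instant its last reference is dropped.
struct fake_module and every function below are hypothetical stand-ins
and are not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a loadable module: its code may only be
 * called while at least one reference is held. */
struct fake_module {
	int refcount;
	bool code_present;	/* false once the module has been removed */
};

/* Models module_put(): assume a racing removal wins as soon as the
 * last reference is dropped. */
static void fake_module_put(struct fake_module *mod)
{
	assert(mod->refcount > 0);
	if (--mod->refcount == 0)
		mod->code_present = false;
}

/* Models the module's unload callback. Returns false if the call would
 * have executed code that is no longer there. */
static bool fake_utf8_unload(const struct fake_module *mod)
{
	return mod->code_present;
}

/* Order used in v3: drop the reference, then call into the module. */
static bool unload_put_first(struct fake_module *mod)
{
	fake_module_put(mod);
	return fake_utf8_unload(mod);
}

/* Order suggested in review: call into the module, then drop. */
static bool unload_put_last(struct fake_module *mod)
{
	bool ok = fake_utf8_unload(mod);

	fake_module_put(mod);
	return ok;
}
```

With a single reference held, unload_put_first() reports that the
callback would have run after removal, while unload_put_last() does not.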


* Re: [PATCH v3 5/5] fs: unicode: Add utf8 module and a unicode layer
  2021-03-23 20:29       ` [f2fs-dev] " Eric Biggers
@ 2021-03-23 22:18         ` Shreeya Patel
  -1 siblings, 0 replies; 28+ messages in thread
From: Shreeya Patel @ 2021-03-23 22:18 UTC (permalink / raw)
  To: Eric Biggers, Gabriel Krisman Bertazi
  Cc: tytso, adilger.kernel, jaegeuk, chao, drosen, yuchao0,
	linux-ext4, linux-kernel, linux-f2fs-devel, linux-fsdevel,
	kernel, andre.almeida


On 24/03/21 1:59 am, Eric Biggers wrote:
> On Tue, Mar 23, 2021 at 03:51:44PM -0400, Gabriel Krisman Bertazi wrote:
>>> -int unicode_validate(const struct unicode_map *um, const struct qstr *str)
>>> -{
>>> -	const struct utf8data *data = utf8nfdi(um->version);
>>> -
>>> -	if (utf8nlen(data, str->name, str->len) < 0)
>>> -		return -1;
>>> -	return 0;
>>> -}
>>> +struct unicode_ops *utf8_ops;
>>> +EXPORT_SYMBOL(utf8_ops);
>>> +
>>> +int _utf8_validate(const struct unicode_map *um, const struct qstr *str)
>>> +{
>>> +	return 0;
>>> +}
>>> -EXPORT_SYMBOL(unicode_validate);
>> I think that any calls to the default static calls should return errors
>> instead of succeeding without doing anything.
>>
>> In fact, are the default calls really necessary?  If someone gets here,
>> there is a bug elsewhere, so WARN_ON and maybe -EIO.
>>
>> int unicode_validate_default_static_call(...)
>> {
>>     WARN_ON(1);
>>     return -EIO;
>> }
>>
>> Or just have a NULL default, as I mentioned below, if that is possible.
>>
> [...]
>>> +DEFINE_STATIC_CALL(utf8_validate, _utf8_validate);
>>> +DEFINE_STATIC_CALL(utf8_strncmp, _utf8_strncmp);
>>> +DEFINE_STATIC_CALL(utf8_strncasecmp, _utf8_strncasecmp);
>>> +DEFINE_STATIC_CALL(utf8_strncasecmp_folded, _utf8_strncasecmp_folded);
>>> +DEFINE_STATIC_CALL(utf8_normalize, _utf8_normalize);
>>> +DEFINE_STATIC_CALL(utf8_casefold, _utf8_casefold);
>>> +DEFINE_STATIC_CALL(utf8_casefold_hash, _utf8_casefold_hash);
>>> +DEFINE_STATIC_CALL(utf8_load, _utf8_load);
>>> +DEFINE_STATIC_CALL_NULL(utf8_unload, _utf8_unload);
>>> +EXPORT_STATIC_CALL(utf8_strncmp);
>>> +EXPORT_STATIC_CALL(utf8_strncasecmp);
>>> +EXPORT_STATIC_CALL(utf8_strncasecmp_folded);
>> I'm having a hard time understanding why some use
>> DEFINE_STATIC_CALL_NULL, while others use DEFINE_STATIC_CALL.  This
>> static call API is new to me :).  None of this can be called if the
>> module is not loaded anyway, so perhaps the default function can just be
>> NULL, per the documentation in include/linux/static_call.h?
>>
>> Anyway, aren't utf8_{validate,casefold,normalize} missing the
>> equivalent EXPORT_STATIC_CALL?
>>
> The static_call API is fairly new to me too.  But the intent of this patch seems
> to be that none of the utf8 functions are called without the utf8 module loaded.
> If they are called, it's a kernel bug.  So there are two options for what to do
> if it happens anyway:
>
>    1. call a "null" static call, which does nothing
>
> *or*
>
>    2. call a default function which does WARN_ON_ONCE() and returns an error if
>       possible.
>
> (or 3. don't use static calls and instead dereference a NULL utf8_ops like
> previous versions of this patch did.)
>
> It shouldn't really matter which of these approaches you take, but please be
> consistent and use the same one everywhere.
>
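
For reference, options 1 and 2 above can be modeled in plain userspace C
with ordinary function pointers. This is only a sketch of the two
behaviours, not the static_call API itself. The hook types are
simplified and all names below are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, simplified hook types; the real functions take a
 * unicode_map and qstr arguments. */
typedef int (*validate_fn)(const char *name, size_t len);
typedef void (*unload_fn)(void *um);

static int default_calls;

/* Option 2: a default target that warns once and returns an error. */
static int validate_default(const char *name, size_t len)
{
	(void)name;
	(void)len;
	if (default_calls++ == 0)
		fprintf(stderr, "utf8 validate called without the utf8 module\n");
	return -EIO;
}

static validate_fn validate_target = validate_default;
static unload_fn unload_target;	/* option 1: NULL by default */

/* A hook with a return value always needs some target to call. */
static int unicode_validate_model(const char *name, size_t len)
{
	assert(validate_target != NULL);
	return validate_target(name, len);
}

/* A void hook can simply be skipped while its target is NULL. */
static void unicode_unload_model(void *um)
{
	if (unload_target)
		unload_target(um);
}
```

The sketch also shows why a "do nothing" default fits void hooks more
naturally: a skipped call has no value to hand back, so a hook that
returns int needs an explicit error-returning default.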
>> + void unicode_unregister(void)
>> + {
>> +         spin_lock(&utf8ops_lock);
>> +         utf8_ops = NULL;
>> +         spin_unlock(&utf8ops_lock);
>> + }
>> + EXPORT_SYMBOL(unicode_unregister);
> This should restore the static calls to their default values (either NULL or the
> default functions, depending on what you decide).
>
> Also, it's weird to still have the utf8_ops structure when using static calls.
> It seems it should be one way or the other: static calls *or* utf8_ops.
>
> The static calls could be exported, and the module could be responsible for
> updating them.  That would eliminate the need for utf8_ops.


Hmm, yes. I think we only use utf8_ops to get the module owner, so we can
remove it and instead pass the owner as an argument when registering the
module. Will make this change in v4. Thanks.
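
A rough userspace sketch of that direction, assuming registration takes
the owner plus the implementation and unregistration restores the
default target. The names, the single validate hook, and the toy ASCII
check are all hypothetical. The actual v4 interface may look different.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct module. */
struct owner_model {
	const char *name;
};

typedef int (*validate_fn)(const char *s, size_t len);

/* Default target: fail if called while nothing is registered. */
static int validate_default(const char *s, size_t len)
{
	(void)s;
	(void)len;
	return -EIO;
}

static validate_fn validate_target = validate_default;
static const struct owner_model *utf8_owner;

/* The owner is passed in directly, so no ops structure is needed just
 * to find it. */
static int unicode_register_model(const struct owner_model *owner,
				  validate_fn validate)
{
	if (!owner || !validate)
		return -EINVAL;
	if (utf8_owner)
		return -EBUSY;

	utf8_owner = owner;
	validate_target = validate;
	return 0;
}

/* Restore the default so later calls fail instead of jumping into
 * code that has gone away. */
static void unicode_unregister_model(void)
{
	assert(utf8_owner != NULL);
	validate_target = validate_default;
	utf8_owner = NULL;
}

/* Toy implementation for the example: accepts only 7-bit ASCII. */
static int ascii_validate(const char *s, size_t len)
{
	for (size_t i = 0; i < len; i++)
		if ((unsigned char)s[i] > 0x7f)
			return -1;
	return 0;
}
```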


>
> - Eric


Thread overview: 28+ messages
2021-03-23 18:31 [PATCH v3 0/4] Make UTF-8 encoding loadable Shreeya Patel
2021-03-23 18:31 ` [f2fs-dev] " Shreeya Patel
2021-03-23 18:31 ` [PATCH v3 1/5] fs: unicode: Use strscpy() instead of strncpy() Shreeya Patel
2021-03-23 18:31   ` [f2fs-dev] " Shreeya Patel
2021-03-23 19:09   ` Gabriel Krisman Bertazi
2021-03-23 19:09     ` [f2fs-dev] " Gabriel Krisman Bertazi
2021-03-23 18:31 ` [PATCH v3 2/5] fs: Check if utf8 encoding is loaded before calling utf8_unload() Shreeya Patel
2021-03-23 18:31   ` [f2fs-dev] " Shreeya Patel
2021-03-23 19:10   ` Gabriel Krisman Bertazi
2021-03-23 19:10     ` [f2fs-dev] " Gabriel Krisman Bertazi
2021-03-23 18:31 ` [PATCH v3 3/5] fs: unicode: Rename function names from utf8 to unicode Shreeya Patel
2021-03-23 18:31   ` [f2fs-dev] " Shreeya Patel
2021-03-23 19:14   ` Gabriel Krisman Bertazi
2021-03-23 19:14     ` [f2fs-dev] " Gabriel Krisman Bertazi
2021-03-23 18:32 ` [PATCH v3 4/5] fs: unicode: Rename utf8-core file to unicode-core Shreeya Patel
2021-03-23 18:32   ` [f2fs-dev] " Shreeya Patel
2021-03-23 19:15   ` Gabriel Krisman Bertazi
2021-03-23 19:15     ` [f2fs-dev] " Gabriel Krisman Bertazi
2021-03-23 18:32 ` [PATCH v3 5/5] fs: unicode: Add utf8 module and a unicode layer Shreeya Patel
2021-03-23 18:32   ` [f2fs-dev] " Shreeya Patel
2021-03-23 19:51   ` Gabriel Krisman Bertazi
2021-03-23 19:51     ` [f2fs-dev] " Gabriel Krisman Bertazi
2021-03-23 20:29     ` Eric Biggers
2021-03-23 20:29       ` [f2fs-dev] " Eric Biggers
2021-03-23 22:18       ` Shreeya Patel
2021-03-23 22:18         ` [f2fs-dev] " Shreeya Patel
2021-03-23 22:12     ` Shreeya Patel
2021-03-23 22:12       ` [f2fs-dev] " Shreeya Patel
