From: Eric Biggers <ebiggers@kernel.org>
To: Gabriel Krisman Bertazi <krisman@collabora.com>
Cc: tytso@mit.edu, adilger.kernel@dilger.ca, jaegeuk@kernel.org,
	linux-ext4@vger.kernel.org,
	linux-f2fs-devel@lists.sourceforge.net, kernel@collabora.com
Subject: Re: [PATCH v4 04/10] ext4: Implement ci comparison using unicode_name
Date: Wed, 11 May 2022 22:35:23 -0700
Message-ID: <Ynycm9QGS7MIU4io@sol.localdomain>
In-Reply-To: <20220511193146.27526-5-krisman@collabora.com>

On Wed, May 11, 2022 at 03:31:40PM -0400, Gabriel Krisman Bertazi wrote:
> By using a new type here, we can hide most of the caching casefold logic
> from ext4.  The condition in ext4_match is now quite redundant, but this
> is addressed in the next patch.
> 
> This doesn't use ext4_filename to keep it generic, since the function
> will be moved to libfs to be shared with f2fs.
> 
> Signed-off-by: Gabriel Krisman Bertazi <krisman@collabora.com>
> 
> --
> Changes since v1:
>   - Instead of (ab)using fscrypt_name, create a new type (ebiggers).
> ---
>  fs/ext4/namei.c    | 32 +++++++++++++++-----------------
>  include/linux/fs.h |  5 +++++
>  2 files changed, 20 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 84fdb23f09b8..5296ced2e43e 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1321,20 +1321,19 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
>  /**
>   * ext4_match_ci() - Match (case-insensitive) a name with a dirent.
>   * @parent: Inode of the parent of the dentry.
> - * @name: name under lookup.
> + * @uname: name under lookup.
>   * @de_name: Dirent name.
>   * @de_name_len: dirent name length.
> - * @quick: whether @name is already casefolded.
>   *
>   * Test whether a case-insensitive directory entry matches the filename
> - * being searched.  If quick is set, the @name being looked up is
> - * already in the casefolded form.
> + * being searched.
>   *
>   * Return: > 0 if the directory entry matches, 0 if it doesn't match, or
>   * < 0 on error.
>   */
> -static int ext4_match_ci(const struct inode *parent, const struct qstr *name,
> -			 u8 *de_name, size_t de_name_len, bool quick)
> +static int ext4_match_ci(const struct inode *parent,
> +			 const struct unicode_name *uname,
> +			 u8 *de_name, size_t de_name_len)
>  {
>  	const struct super_block *sb = parent->i_sb;
>  	const struct unicode_map *um = sb->s_encoding;
> @@ -1357,10 +1356,10 @@ static int ext4_match_ci(const struct inode *parent, const struct qstr *name,
>  		entry.len = decrypted_name.len;
>  	}
>  
> -	if (quick)
> -		ret = utf8_strncasecmp_folded(um, name, &entry);
> +	if (uname->folded_name->name)
> +		ret = utf8_strncasecmp_folded(um, uname->folded_name, &entry);
>  	else
> -		ret = utf8_strncasecmp(um, name, &entry);
> +		ret = utf8_strncasecmp(um, uname->usr_name, &entry);
>  
>  	if (!ret)
>  		match = true;
> @@ -1370,8 +1369,8 @@ static int ext4_match_ci(const struct inode *parent, const struct qstr *name,
>  		 * the names have invalid characters.
>  		 */
>  		ret = 0;
> -		match = ((name->len == entry.len) &&
> -			 !memcmp(name->name, entry.name, entry.len));
> +		match = ((uname->usr_name->len == entry.len) &&
> +			 !memcmp(uname->usr_name->name, entry.name, entry.len));
>  	}
>  
>  out:
> @@ -1441,6 +1440,10 @@ static bool ext4_match(struct inode *parent,
>  #if IS_ENABLED(CONFIG_UNICODE)
>  	if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent) &&
>  	    (!IS_ENCRYPTED(parent) || fscrypt_has_encryption_key(parent))) {
> +		struct unicode_name u = {
> +			.folded_name = &fname->cf_name,
> +			.usr_name = fname->usr_fname
> +		};
>  		int ret;
>  
>  		if (fname->cf_name.name) {
> @@ -1452,14 +1455,9 @@ static bool ext4_match(struct inode *parent,
>  					return false;
>  				}
>  			}
> -
> -			ret = ext4_match_ci(parent, &fname->cf_name, de->name,
> -					    de->name_len, true);
> -		} else {
> -			ret = ext4_match_ci(parent, fname->usr_fname,
> -					    de->name, de->name_len, false);
>  		}
>  
> +		ret = ext4_match_ci(parent, &u, de->name, de->name_len);
>  		if (ret < 0) {
>  			/*
>  			 * Treat comparison errors as not a match.  The
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index e2d892b201b0..3f76a18a5f40 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3358,6 +3358,11 @@ extern int generic_file_fsync(struct file *, loff_t, loff_t, int);
>  
>  extern int generic_check_addressable(unsigned, u64);
>  
> +struct unicode_name {
> +	const struct qstr *folded_name;
> +	const struct qstr *usr_name;
> +};
> +
>  extern void generic_set_encrypted_ci_d_ops(struct dentry *dentry);
>  
>  #ifdef CONFIG_MIGRATION

I don't really see the point of this.  The only times struct unicode_name gets
used are when one is initialized on the stack for a single call to
generic_ci_match().  So the end result is just that the function prototype is:

int generic_ci_match(const struct inode *parent,
		     const struct unicode_name *uname,
		     const u8 *de_name, size_t de_name_len);

... instead of:

int generic_ci_match(const struct inode *parent, const struct qstr *usr_fname,
		     const struct qstr *folded_name,
		     const u8 *de_name, size_t de_name_len);

So the only effect is to consolidate two parameters into one.  I don't think
it's worth it, given that the struct is being created on-demand.
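
To make the comparison concrete, here is a minimal user-space sketch of the two
call shapes.  It is not the kernel code: struct qstr is a simplified stand-in,
ci_equal() is a hypothetical ASCII-only substitute for utf8_strncasecmp(), and
the names ci_match_two_params() and ci_match_struct() are made up for
illustration.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified user-space stand-in for the kernel's struct qstr. */
struct qstr {
	const unsigned char *name;
	size_t len;
};

/* The wrapper type proposed in the patch. */
struct unicode_name {
	const struct qstr *folded_name;
	const struct qstr *usr_name;
};

/* Hypothetical ASCII-only comparison; the kernel uses utf8_strncasecmp(). */
static bool ci_equal(const struct qstr *key,
		     const unsigned char *de_name, size_t de_name_len)
{
	if (key->len != de_name_len)
		return false;
	for (size_t i = 0; i < de_name_len; i++)
		if (tolower(key->name[i]) != tolower(de_name[i]))
			return false;
	return true;
}

/* Shape 1: both names passed as separate parameters. */
static int ci_match_two_params(const struct qstr *usr_fname,
			       const struct qstr *folded_name,
			       const unsigned char *de_name, size_t de_name_len)
{
	/* Prefer the cached casefolded name when one is available. */
	const struct qstr *key = folded_name->name ? folded_name : usr_fname;

	return ci_equal(key, de_name, de_name_len) ? 1 : 0;
}

/* Shape 2: the same two pointers bundled into a struct by the caller. */
static int ci_match_struct(const struct unicode_name *uname,
			   const unsigned char *de_name, size_t de_name_len)
{
	return ci_match_two_params(uname->usr_name, uname->folded_name,
				   de_name, de_name_len);
}
```

In this sketch the struct variant only forwards its two members, and a caller
has to fill in a struct unicode_name on the stack right before each call.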

Also note that filenames are not necessarily valid Unicode, so "unicode_name" is
a bit misleading.
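
As a rough illustration of that point, the user-space sketch below shows a byte
string that is a perfectly legal filename but not valid UTF-8, together with a
byte-exact fallback in the spirit of the memcmp() path in the quoted patch.
Both helper names are hypothetical, and is_valid_utf8() checks only the
lead/continuation byte structure; it does not reject every overlong or
surrogate encoding.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Structural UTF-8 check (simplified, see the note above). */
static bool is_valid_utf8(const unsigned char *s, size_t len)
{
	size_t i = 0;

	while (i < len) {
		unsigned char c = s[i];
		size_t need;	/* number of continuation bytes expected */

		if (c < 0x80)
			need = 0;
		else if (c >= 0xC2 && c <= 0xDF)
			need = 1;
		else if (c >= 0xE0 && c <= 0xEF)
			need = 2;
		else if (c >= 0xF0 && c <= 0xF4)
			need = 3;
		else
			return false;	/* stray continuation or invalid lead */

		if (need > len - i - 1)
			return false;	/* sequence truncated */
		for (size_t k = 1; k <= need; k++)
			if ((s[i + k] & 0xC0) != 0x80)
				return false;
		i += need + 1;
	}
	return true;
}

/* Byte-exact comparison, usable when a name is not valid UTF-8. */
static bool names_match_bytes(const unsigned char *a, size_t a_len,
			      const unsigned char *b, size_t b_len)
{
	return a_len == b_len && memcmp(a, b, a_len) == 0;
}
```

For example, the four bytes "caf\xe9" (a Latin-1 encoded name) fail the check,
while the five bytes "caf\xc3\xa9" pass it; either can be stored as a filename.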

- Eric
