linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/8] Support for Casefolding and Encryption
@ 2019-12-03  5:10 Daniel Rosenberg
  2019-12-03  5:10 ` [PATCH 1/8] fscrypt: Add siphash and hash key for policy v2 Daniel Rosenberg
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

Ext4 and F2FS currently both support casefolding and encryption, but not at the
same time. These patches aim to rectify that.

Since directory names are stored case preserved, we cannot just take the hash
of the ciphertext. Instead we use the siphash of the casefolded name. With this
we no longer have a direct path from an encrypted name to the hash without the
key. To deal with this, fscrypt now always includes the hash in the name it
presents when the key is not present. There is a pre-existing bug where you can
change parts of the hash and still match the name so long as the disruption to
the hash does not happen to affect lookup on that filesystem. I'm not sure how
to fix that without making ext4 lookups slower in the more common case.

I moved the identical dcache operations for ext4 and f2fs into the VFS, as any
filesystem that uses casefolding will need the same code. This will also allow
further optimizations to that path, although my current changes don't take
advantage of that yet.

For Ext4, this also means that we need to store the hash on disk. We only do so
for encrypted and casefolded directories to avoid on disk format changes.
Previously encryption and casefolding could not live on the same filesystem,
and we're relaxing that requirement. F2fs is a bit more straightforward since
it already stores hashes on disk.

I've updated the related tools with just enough to enable the feature. I still
need to adjust their respective fsck's, although without access to the keys,
they won't be able to verify the hashes of casefolded and encrypted names.


Daniel Rosenberg (8):
  fscrypt: Add siphash and hash key for policy v2
  fscrypt: Don't allow v1 policies with casefolding
  fscrypt: Change format of no-key token
  vfs: Fold casefolding into vfs
  f2fs: Handle casefolding with Encryption
  ext4: Use struct super_blocks' casefold data
  ext4: Hande casefolding with encryption
  ext4: Optimize match for casefolded encrypted dirs

 Documentation/filesystems/ext4/directory.rst |  27 ++
 fs/crypto/Kconfig                            |   1 +
 fs/crypto/fname.c                            | 204 +++++++++---
 fs/crypto/fscrypt_private.h                  |   9 +
 fs/crypto/keysetup.c                         |  29 +-
 fs/crypto/policy.c                           |  26 +-
 fs/dcache.c                                  |  35 ++
 fs/ext4/dir.c                                |  72 +----
 fs/ext4/ext4.h                               |  87 +++--
 fs/ext4/hash.c                               |  26 +-
 fs/ext4/ialloc.c                             |   5 +-
 fs/ext4/inline.c                             |  41 +--
 fs/ext4/namei.c                              | 318 ++++++++++++-------
 fs/ext4/super.c                              |  21 +-
 fs/f2fs/dir.c                                | 115 +++----
 fs/f2fs/f2fs.h                               |  14 +-
 fs/f2fs/hash.c                               |  25 +-
 fs/f2fs/inline.c                             |   9 +-
 fs/f2fs/super.c                              |  17 +-
 fs/f2fs/sysfs.c                              |   8 +-
 fs/inode.c                                   |   8 +
 fs/namei.c                                   |  43 ++-
 include/linux/fs.h                           |  12 +
 include/linux/fscrypt.h                      | 107 +++----
 24 files changed, 797 insertions(+), 462 deletions(-)

-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/8] fscrypt: Add siphash and hash key for policy v2
  2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
@ 2019-12-03  5:10 ` Daniel Rosenberg
  2019-12-03 23:25   ` Eric Biggers
  2019-12-03  5:10 ` [PATCH 2/8] fscrypt: Don't allow v1 policies with casefolding Daniel Rosenberg
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

When using casefolding along with encryption, we need to use a
cryptographic hash to allow fast filesystem operations while not knowing
the case of the name stored on disk while not revealing extra
information about the name if the key is not present.

When a v2 policy is used on a directory, we derive a key for use with
siphash.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
---
 fs/crypto/fname.c           | 22 ++++++++++++++++++++++
 fs/crypto/fscrypt_private.h |  9 +++++++++
 fs/crypto/keysetup.c        | 29 ++++++++++++++++++++---------
 include/linux/fscrypt.h     |  8 ++++++++
 4 files changed, 59 insertions(+), 9 deletions(-)

diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index 3da3707c10e3..b33f03b9f892 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -12,6 +12,7 @@
  */
 
 #include <linux/scatterlist.h>
+#include <linux/siphash.h>
 #include <crypto/skcipher.h>
 #include "fscrypt_private.h"
 
@@ -400,3 +401,24 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
 	return ret;
 }
 EXPORT_SYMBOL(fscrypt_setup_filename);
+
+/**
+ * fscrypt_fname_siphash() - Calculate the siphash for a file name
+ * @dir: the parent directory
+ * @name: the name of the file to get the siphash of
+ *
+ * Given a user-provided filename @name, this function calculates the siphash of
+ * that name using the hash key stored with the directory's policy.
+ *
+ *
+ * Return: the siphash of @name using the hash key of @dir
+ */
+u64 fscrypt_fname_siphash(const struct inode *dir, const struct qstr *name)
+{
+	struct fscrypt_info *ci = dir->i_crypt_info;
+
+	WARN_ON(!ci || !ci->ci_hash_key_initialized);
+
+	return siphash(name->name, name->len, &ci->ci_hash_key);
+}
+EXPORT_SYMBOL(fscrypt_fname_siphash);
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index 130b50e5a011..f0dfef9921de 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -12,6 +12,7 @@
 #define _FSCRYPT_PRIVATE_H
 
 #include <linux/fscrypt.h>
+#include <linux/siphash.h>
 #include <crypto/hash.h>
 
 #define CONST_STRLEN(str)	(sizeof(str) - 1)
@@ -194,6 +195,13 @@ struct fscrypt_info {
 	 */
 	struct fscrypt_direct_key *ci_direct_key;
 
+	/*
+	 * With v2 policies, this can be used with siphash
+	 * When the key has been set, ci_hash_key_initialized is set to true
+	 */
+	siphash_key_t ci_hash_key;
+	bool ci_hash_key_initialized;
+
 	/* The encryption policy used by this inode */
 	union fscrypt_policy ci_policy;
 
@@ -286,6 +294,7 @@ extern int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
 #define HKDF_CONTEXT_PER_FILE_KEY	2
 #define HKDF_CONTEXT_DIRECT_KEY		3
 #define HKDF_CONTEXT_IV_INO_LBLK_64_KEY	4
+#define HKDF_CONTEXT_FNAME_HASH_KEY     5
 
 extern int fscrypt_hkdf_expand(struct fscrypt_hkdf *hkdf, u8 context,
 			       const u8 *info, unsigned int infolen,
diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
index f577bb6613f9..e6c7ec04cd25 100644
--- a/fs/crypto/keysetup.c
+++ b/fs/crypto/keysetup.c
@@ -192,7 +192,7 @@ static int fscrypt_setup_v2_file_key(struct fscrypt_info *ci,
 				     ci->ci_mode->friendly_name);
 			return -EINVAL;
 		}
-		return setup_per_mode_key(ci, mk, mk->mk_direct_tfms,
+		err = setup_per_mode_key(ci, mk, mk->mk_direct_tfms,
 					  HKDF_CONTEXT_DIRECT_KEY, false);
 	} else if (ci->ci_policy.v2.flags &
 		   FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64) {
@@ -202,20 +202,31 @@ static int fscrypt_setup_v2_file_key(struct fscrypt_info *ci,
 		 * the IVs.  This format is optimized for use with inline
 		 * encryption hardware compliant with the UFS or eMMC standards.
 		 */
-		return setup_per_mode_key(ci, mk, mk->mk_iv_ino_lblk_64_tfms,
+		err = setup_per_mode_key(ci, mk, mk->mk_iv_ino_lblk_64_tfms,
 					  HKDF_CONTEXT_IV_INO_LBLK_64_KEY,
 					  true);
-	}
-
-	err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
+	} else {
+		err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
 				  HKDF_CONTEXT_PER_FILE_KEY,
 				  ci->ci_nonce, FS_KEY_DERIVATION_NONCE_SIZE,
 				  derived_key, ci->ci_mode->keysize);
-	if (err)
-		return err;
+		if (err)
+			return err;
+
+		err = fscrypt_set_derived_key(ci, derived_key);
+		memzero_explicit(derived_key, ci->ci_mode->keysize);
+		if (err)
+			return err;
+	}
 
-	err = fscrypt_set_derived_key(ci, derived_key);
-	memzero_explicit(derived_key, ci->ci_mode->keysize);
+	if (S_ISDIR(ci->ci_inode->i_mode)) {
+		err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
+			  HKDF_CONTEXT_FNAME_HASH_KEY,
+			  ci->ci_nonce, FS_KEY_DERIVATION_NONCE_SIZE,
+			  (u8 *)&ci->ci_hash_key, sizeof(ci->ci_hash_key));
+		if (!err)
+			ci->ci_hash_key_initialized = true;
+	}
 	return err;
 }
 
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 1a7bffe78ed5..e13ff68a99f0 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -155,6 +155,8 @@ extern int fscrypt_fname_alloc_buffer(const struct inode *, u32,
 extern void fscrypt_fname_free_buffer(struct fscrypt_str *);
 extern int fscrypt_fname_disk_to_usr(struct inode *, u32, u32,
 			const struct fscrypt_str *, struct fscrypt_str *);
+extern u64 fscrypt_fname_siphash(const struct inode *dir,
+					const struct qstr *name);
 
 #define FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE	32
 
@@ -446,6 +448,12 @@ static inline int fscrypt_fname_disk_to_usr(struct inode *inode,
 	return -EOPNOTSUPP;
 }
 
+static inline u64 fscrypt_fname_siphash(const struct inode *inode,
+					const struct qstr *name)
+{
+	return 0;
+}
+
 static inline bool fscrypt_match_name(const struct fscrypt_name *fname,
 				      const u8 *de_name, u32 de_name_len)
 {
-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/8] fscrypt: Don't allow v1 policies with casefolding
  2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
  2019-12-03  5:10 ` [PATCH 1/8] fscrypt: Add siphash and hash key for policy v2 Daniel Rosenberg
@ 2019-12-03  5:10 ` Daniel Rosenberg
  2019-12-03 23:37   ` Eric Biggers
  2019-12-03  5:10 ` [PATCH 3/8] fscrypt: Change format of no-key token Daniel Rosenberg
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

Casefolding requires a derived key for computing the siphash.
This is available for v2 policies, but not v1, so we disallow it for v1.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
---
 fs/crypto/policy.c      | 26 +++++++++++++++++++++++---
 fs/inode.c              |  8 ++++++++
 include/linux/fscrypt.h |  7 +++++++
 3 files changed, 38 insertions(+), 3 deletions(-)

diff --git a/fs/crypto/policy.c b/fs/crypto/policy.c
index 96f528071bed..94d96d3212d6 100644
--- a/fs/crypto/policy.c
+++ b/fs/crypto/policy.c
@@ -67,9 +67,9 @@ static bool supported_iv_ino_lblk_64_policy(
  * fscrypt_supported_policy - check whether an encryption policy is supported
  *
  * Given an encryption policy, check whether all its encryption modes and other
- * settings are supported by this kernel.  (But we don't currently don't check
- * for crypto API support here, so attempting to use an algorithm not configured
- * into the crypto API will still fail later.)
+ * settings are supported by this kernel on the given inode.  (But we don't
+ * currently don't check for crypto API support here, so attempting to use an
+ * algorithm not configured into the crypto API will still fail later.)
  *
  * Return: %true if supported, else %false
  */
@@ -97,6 +97,12 @@ bool fscrypt_supported_policy(const union fscrypt_policy *policy_u,
 			return false;
 		}
 
+		if (IS_CASEFOLDED(inode)) {
+			fscrypt_warn(inode,
+				     "v1 policy does not support casefolded directories");
+			return false;
+		}
+
 		return true;
 	}
 	case FSCRYPT_POLICY_V2: {
@@ -530,3 +536,17 @@ int fscrypt_inherit_context(struct inode *parent, struct inode *child,
 	return preload ? fscrypt_get_encryption_info(child): 0;
 }
 EXPORT_SYMBOL(fscrypt_inherit_context);
+
+int fscrypt_set_casefolding_allowed(struct inode *inode)
+{
+	union fscrypt_policy policy;
+	int ret = fscrypt_get_policy(inode, &policy);
+
+	if (ret < 0)
+		return ret;
+
+	if (policy.version == FSCRYPT_POLICY_V2)
+		return 0;
+	else
+		return -EINVAL;
+}
diff --git a/fs/inode.c b/fs/inode.c
index fef457a42882..b615ec272a1e 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -20,6 +20,7 @@
 #include <linux/ratelimit.h>
 #include <linux/list_lru.h>
 #include <linux/iversion.h>
+#include <linux/fscrypt.h>
 #include <trace/events/writeback.h>
 #include "internal.h"
 
@@ -2245,6 +2246,13 @@ int vfs_ioc_setflags_prepare(struct inode *inode, unsigned int oldflags,
 	    !capable(CAP_LINUX_IMMUTABLE))
 		return -EPERM;
 
+	/*
+	 * When a directory is encrypted, the CASEFOLD flag can only be turned
+	 * on if the fscrypt policy supports it.
+	 */
+	if (IS_ENCRYPTED(inode) && (flags & ~oldflags & FS_CASEFOLD_FL))
+		return fscrypt_set_casefolding_allowed(inode);
+
 	return 0;
 }
 EXPORT_SYMBOL(vfs_ioc_setflags_prepare);
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index e13ff68a99f0..028aed925e51 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -127,6 +127,8 @@ extern int fscrypt_ioctl_get_policy_ex(struct file *, void __user *);
 extern int fscrypt_has_permitted_context(struct inode *, struct inode *);
 extern int fscrypt_inherit_context(struct inode *, struct inode *,
 					void *, bool);
+extern int fscrypt_set_casefolding_allowed(struct inode *inode);
+
 /* keyring.c */
 extern void fscrypt_sb_free(struct super_block *sb);
 extern int fscrypt_ioctl_add_key(struct file *filp, void __user *arg);
@@ -361,6 +363,11 @@ static inline int fscrypt_inherit_context(struct inode *parent,
 	return -EOPNOTSUPP;
 }
 
+static inline int fscrypt_set_casefolding_allowed(struct inode *inode)
+{
+	return 0;
+}
+
 /* keyring.c */
 static inline void fscrypt_sb_free(struct super_block *sb)
 {
-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/8] fscrypt: Change format of no-key token
  2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
  2019-12-03  5:10 ` [PATCH 1/8] fscrypt: Add siphash and hash key for policy v2 Daniel Rosenberg
  2019-12-03  5:10 ` [PATCH 2/8] fscrypt: Don't allow v1 policies with casefolding Daniel Rosenberg
@ 2019-12-03  5:10 ` Daniel Rosenberg
  2019-12-04  0:09   ` Eric Biggers
  2019-12-03  5:10 ` [PATCH 4/8] vfs: Fold casefolding into vfs Daniel Rosenberg
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

Encrypted and casefolded names always require a dirtree hash, since
their hash values cannot be generated without the key.

In the new format, we always base64 encode the same structure. For names
that are less than 149 characters, we concatenate the provided hash and
ciphertext. If the name is longer than 149 characters, we also include
the sha256 of the remaining parts of the name. We then base64 encode the
resulting data to get a representation of the name that is at most 252
characters long, with a very low collision rate. We avoid needing to
compute the sha256 apart from in the case of a very long filename, and
then only need to compute the sha256 of possible matches if their
ciphertext is also longer than 149.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
---
 fs/crypto/Kconfig       |   1 +
 fs/crypto/fname.c       | 182 +++++++++++++++++++++++++++++-----------
 include/linux/fscrypt.h |  92 ++++++--------------
 3 files changed, 160 insertions(+), 115 deletions(-)

diff --git a/fs/crypto/Kconfig b/fs/crypto/Kconfig
index ff5a1746cbae..6e0d56f0b993 100644
--- a/fs/crypto/Kconfig
+++ b/fs/crypto/Kconfig
@@ -9,6 +9,7 @@ config FS_ENCRYPTION
 	select CRYPTO_CTS
 	select CRYPTO_SHA512
 	select CRYPTO_HMAC
+	select CRYPTO_SHA256
 	select KEYS
 	help
 	  Enable encryption of files and directories.  This
diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
index b33f03b9f892..03c837c461f2 100644
--- a/fs/crypto/fname.c
+++ b/fs/crypto/fname.c
@@ -14,8 +14,46 @@
 #include <linux/scatterlist.h>
 #include <linux/siphash.h>
 #include <crypto/skcipher.h>
+#include <crypto/hash.h>
 #include "fscrypt_private.h"
 
+static struct crypto_shash *sha256_hash_tfm;
+
+static int fscrypt_do_sha256(unsigned char *result,
+	     const u8 *data, unsigned int data_len)
+{
+	struct crypto_shash *tfm = READ_ONCE(sha256_hash_tfm);
+
+	if (unlikely(!tfm)) {
+		struct crypto_shash *prev_tfm;
+
+		tfm = crypto_alloc_shash("sha256", 0, 0);
+		if (IS_ERR(tfm)) {
+			if (PTR_ERR(tfm) == -ENOENT) {
+				fscrypt_warn(NULL,
+					     "Missing crypto API support for SHA-256");
+				return -ENOPKG;
+			}
+			fscrypt_err(NULL,
+				    "Error allocating SHA-256 transform: %ld",
+				    PTR_ERR(tfm));
+			return PTR_ERR(tfm);
+		}
+		prev_tfm = cmpxchg(&sha256_hash_tfm, NULL, tfm);
+		if (prev_tfm) {
+			crypto_free_shash(tfm);
+			tfm = prev_tfm;
+		}
+	}
+	{
+		SHASH_DESC_ON_STACK(desc, tfm);
+
+		desc->tfm = tfm;
+
+		return crypto_shash_digest(desc, data, data_len, result);
+	}
+}
+
 static inline bool fscrypt_is_dot_dotdot(const struct qstr *str)
 {
 	if (str->len == 1 && str->name[0] == '.')
@@ -208,8 +246,7 @@ int fscrypt_fname_alloc_buffer(const struct inode *inode,
 			       struct fscrypt_str *crypto_str)
 {
 	const u32 max_encoded_len =
-		max_t(u32, BASE64_CHARS(FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE),
-		      1 + BASE64_CHARS(sizeof(struct fscrypt_digested_name)));
+		      BASE64_CHARS(sizeof(struct fscrypt_nokey_name));
 	u32 max_presented_len;
 
 	max_presented_len = max(max_encoded_len, max_encrypted_len);
@@ -243,8 +280,9 @@ EXPORT_SYMBOL(fscrypt_fname_free_buffer);
  * The caller must have allocated sufficient memory for the @oname string.
  *
  * If the key is available, we'll decrypt the disk name; otherwise, we'll encode
- * it for presentation.  Short names are directly base64-encoded, while long
- * names are encoded in fscrypt_digested_name format.
+ * it for presentation.  The usr name is the base64 encoding of the dirtree hash
+ * value, the first 149 characters of the name, and the sha256 of the rest of
+ * the name, if longer than 149 characters.
  *
  * Return: 0 on success, -errno on failure
  */
@@ -254,7 +292,9 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
 			struct fscrypt_str *oname)
 {
 	const struct qstr qname = FSTR_TO_QSTR(iname);
-	struct fscrypt_digested_name digested_name;
+	struct fscrypt_nokey_name nokey_name;
+	u32 size;
+	int err = 0;
 
 	if (fscrypt_is_dot_dotdot(&qname)) {
 		oname->name[0] = '.';
@@ -269,25 +309,29 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
 	if (fscrypt_has_encryption_key(inode))
 		return fname_decrypt(inode, iname, oname);
 
-	if (iname->len <= FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE) {
-		oname->len = base64_encode(iname->name, iname->len,
-					   oname->name);
-		return 0;
-	}
+	size = min_t(u32, iname->len, FSCRYPT_FNAME_UNDIGESTED_SIZE);
+		memcpy(nokey_name.bytes, iname->name, size);
+
 	if (hash) {
-		digested_name.hash = hash;
-		digested_name.minor_hash = minor_hash;
+		nokey_name.dirtree_hash[0] = hash;
+		nokey_name.dirtree_hash[1] = minor_hash;
 	} else {
-		digested_name.hash = 0;
-		digested_name.minor_hash = 0;
+		nokey_name.dirtree_hash[0] = 0;
+		nokey_name.dirtree_hash[1] = 0;
 	}
-	memcpy(digested_name.digest,
-	       FSCRYPT_FNAME_DIGEST(iname->name, iname->len),
-	       FSCRYPT_FNAME_DIGEST_SIZE);
-	oname->name[0] = '_';
-	oname->len = 1 + base64_encode((const u8 *)&digested_name,
-				       sizeof(digested_name), oname->name + 1);
-	return 0;
+	size += sizeof(nokey_name.dirtree_hash);
+
+	if (iname->len > FSCRYPT_FNAME_UNDIGESTED_SIZE) {
+		/* compute sha256 of remaining name */
+		err = fscrypt_do_sha256(nokey_name.sha256,
+				&iname->name[FSCRYPT_FNAME_UNDIGESTED_SIZE],
+				iname->len - FSCRYPT_FNAME_UNDIGESTED_SIZE);
+		if (err)
+			return err;
+		size += sizeof(nokey_name.sha256);
+	}
+	oname->len = base64_encode((const u8 *)&nokey_name, size, oname->name);
+	return err;
 }
 EXPORT_SYMBOL(fscrypt_fname_disk_to_usr);
 
@@ -319,7 +363,6 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
 			      int lookup, struct fscrypt_name *fname)
 {
 	int ret;
-	int digested;
 
 	memset(fname, 0, sizeof(struct fscrypt_name));
 	fname->usr_fname = iname;
@@ -359,42 +402,32 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
 	 * We don't have the key and we are doing a lookup; decode the
 	 * user-supplied name
 	 */
-	if (iname->name[0] == '_') {
-		if (iname->len !=
-		    1 + BASE64_CHARS(sizeof(struct fscrypt_digested_name)))
-			return -ENOENT;
-		digested = 1;
-	} else {
-		if (iname->len >
-		    BASE64_CHARS(FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE))
-			return -ENOENT;
-		digested = 0;
-	}
 
 	fname->crypto_buf.name =
-		kmalloc(max_t(size_t, FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE,
-			      sizeof(struct fscrypt_digested_name)),
-			GFP_KERNEL);
+			kmalloc(sizeof(struct fscrypt_nokey_name), GFP_KERNEL);
 	if (fname->crypto_buf.name == NULL)
 		return -ENOMEM;
 
-	ret = base64_decode(iname->name + digested, iname->len - digested,
+	ret = base64_decode(iname->name, iname->len,
 			    fname->crypto_buf.name);
 	if (ret < 0) {
 		ret = -ENOENT;
 		goto errout;
 	}
-	fname->crypto_buf.len = ret;
-	if (digested) {
-		const struct fscrypt_digested_name *n =
-			(const void *)fname->crypto_buf.name;
-		fname->hash = n->hash;
-		fname->minor_hash = n->minor_hash;
-	} else {
-		fname->disk_name.name = fname->crypto_buf.name;
-		fname->disk_name.len = fname->crypto_buf.len;
+	if (ret > (int)offsetofend(struct fscrypt_nokey_name, sha256)) {
+		ret = -EINVAL;
+		goto errout;
+	}
+
+	{
+		struct fscrypt_nokey_name *n =
+				(void *)fname->crypto_buf.name;
+		fname->crypto_buf.len = ret;
+
+		fname->hash = n->dirtree_hash[0];
+		fname->minor_hash = n->dirtree_hash[1];
+		return 0;
 	}
-	return 0;
 
 errout:
 	kfree(fname->crypto_buf.name);
@@ -402,6 +435,61 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
 }
 EXPORT_SYMBOL(fscrypt_setup_filename);
 
+/**
+ * fscrypt_match_name() - test whether the given name matches a directory entry
+ * @fname: the name being searched for
+ * @de_name: the name from the directory entry
+ * @de_name_len: the length of @de_name in bytes
+ *
+ * Normally @fname->disk_name will be set, and in that case we simply compare
+ * that to the name stored in the directory entry.  The only exception is that
+ * if we don't have the key for an encrypted directory and a filename in it is
+ * very long, then we won't have the full disk_name and we'll instead need to
+ * match against the fscrypt_digested_name.
+ *
+ * Return: %true if the name matches, otherwise %false.
+ */
+bool fscrypt_match_name(const struct fscrypt_name *fname,
+				      const u8 *de_name, u32 de_name_len)
+{
+	if (unlikely(!fname->disk_name.name)) {
+		const struct fscrypt_nokey_name *n =
+			(const void *)fname->crypto_buf.name;
+		u32 len;
+		bool check_sha256 = false;
+		u8 sha256[SHA256_DIGEST_SIZE];
+
+		if (fname->crypto_buf.len ==
+			    offsetofend(struct fscrypt_nokey_name, sha256)) {
+			len = FSCRYPT_FNAME_UNDIGESTED_SIZE;
+			check_sha256 = true;
+		} else {
+			len = fname->crypto_buf.len -
+				offsetof(struct fscrypt_nokey_name, bytes);
+		}
+		if (!check_sha256 && de_name_len != len)
+			return false;
+		if (check_sha256 && de_name_len <= len)
+			return false;
+		if (memcmp(de_name, n->bytes, len) != 0)
+			return false;
+		if (check_sha256) {
+			fscrypt_do_sha256(sha256,
+				&de_name[FSCRYPT_FNAME_UNDIGESTED_SIZE],
+				de_name_len - FSCRYPT_FNAME_UNDIGESTED_SIZE);
+			if (memcmp(sha256, n->sha256, sizeof(sha256)) != 0)
+				return false;
+		}
+
+		return true;
+	}
+
+	if (de_name_len != fname->disk_name.len)
+		return false;
+	return !memcmp(de_name, fname->disk_name.name, fname->disk_name.len);
+}
+EXPORT_SYMBOL(fscrypt_match_name);
+
 /**
  * fscrypt_fname_siphash() - Calculate the siphash for a file name
  * @dir: the parent directory
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 028aed925e51..ddb7245ba92b 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -16,6 +16,7 @@
 #include <linux/fs.h>
 #include <linux/mm.h>
 #include <linux/slab.h>
+#include <crypto/sha.h>
 #include <uapi/linux/fscrypt.h>
 
 #define FS_CRYPTO_BLOCK_SIZE		16
@@ -160,79 +161,34 @@ extern int fscrypt_fname_disk_to_usr(struct inode *, u32, u32,
 extern u64 fscrypt_fname_siphash(const struct inode *dir,
 					const struct qstr *name);
 
-#define FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE	32
-
-/* Extracts the second-to-last ciphertext block; see explanation below */
-#define FSCRYPT_FNAME_DIGEST(name, len)	\
-	((name) + round_down((len) - FS_CRYPTO_BLOCK_SIZE - 1, \
-			     FS_CRYPTO_BLOCK_SIZE))
-
-#define FSCRYPT_FNAME_DIGEST_SIZE	FS_CRYPTO_BLOCK_SIZE
-
 /**
- * fscrypt_digested_name - alternate identifier for an on-disk filename
+ * fscrypt_nokey_name - identifier for on-disk filenames when key is not present
  *
- * When userspace lists an encrypted directory without access to the key,
- * filenames whose ciphertext is longer than FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE
- * bytes are shown in this abbreviated form (base64-encoded) rather than as the
- * full ciphertext (base64-encoded).  This is necessary to allow supporting
- * filenames up to NAME_MAX bytes, since base64 encoding expands the length.
+ * When userspace lists an encrypted directory without access to the key, we
+ * must present them with a unique identifier for the file. base64 encoding will
+ * expand the space, so we use this format to avoid most collisions.
  *
- * To make it possible for filesystems to still find the correct directory entry
- * despite not knowing the full on-disk name, we encode any filesystem-specific
- * 'hash' and/or 'minor_hash' which the filesystem may need for its lookups,
- * followed by the second-to-last ciphertext block of the filename.  Due to the
- * use of the CBC-CTS encryption mode, the second-to-last ciphertext block
- * depends on the full plaintext.  (Note that ciphertext stealing causes the
- * last two blocks to appear "flipped".)  This makes accidental collisions very
- * unlikely: just a 1 in 2^128 chance for two filenames to collide even if they
- * share the same filesystem-specific hashes.
- *
- * However, this scheme isn't immune to intentional collisions, which can be
- * created by anyone able to create arbitrary plaintext filenames and view them
- * without the key.  Making the "digest" be a real cryptographic hash like
- * SHA-256 over the full ciphertext would prevent this, although it would be
- * less efficient and harder to implement, especially since the filesystem would
- * need to calculate it for each directory entry examined during a search.
+ * Filesystems may rely on the hash being present to look up a file on disk.
+ * For filenames that are both casefolded and encrypted, it is not possible to
+ * calculate the hash without the key. Additionally, if the ciphertext is longer
+ * than what we can base64 encode, we cannot generate the hash from the partial
+ * name. For simplicity, we always store the hash at the front of the name,
+ * followed by the first 149 bytes of the ciphertext, and then the sha256 of the
+ * remainder of the name if the ciphertext was longer than 149 bytes. For the
+ * usual case of relatively short filenames, this allows us to avoid needing to
+ * compute the sha256. This results in an encoded name that is at most 252 bytes
+ * long.
  */
-struct fscrypt_digested_name {
-	u32 hash;
-	u32 minor_hash;
-	u8 digest[FSCRYPT_FNAME_DIGEST_SIZE];
-};
 
-/**
- * fscrypt_match_name() - test whether the given name matches a directory entry
- * @fname: the name being searched for
- * @de_name: the name from the directory entry
- * @de_name_len: the length of @de_name in bytes
- *
- * Normally @fname->disk_name will be set, and in that case we simply compare
- * that to the name stored in the directory entry.  The only exception is that
- * if we don't have the key for an encrypted directory and a filename in it is
- * very long, then we won't have the full disk_name and we'll instead need to
- * match against the fscrypt_digested_name.
- *
- * Return: %true if the name matches, otherwise %false.
- */
-static inline bool fscrypt_match_name(const struct fscrypt_name *fname,
-				      const u8 *de_name, u32 de_name_len)
-{
-	if (unlikely(!fname->disk_name.name)) {
-		const struct fscrypt_digested_name *n =
-			(const void *)fname->crypto_buf.name;
-		if (WARN_ON_ONCE(fname->usr_fname->name[0] != '_'))
-			return false;
-		if (de_name_len <= FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE)
-			return false;
-		return !memcmp(FSCRYPT_FNAME_DIGEST(de_name, de_name_len),
-			       n->digest, FSCRYPT_FNAME_DIGEST_SIZE);
-	}
+#define FSCRYPT_FNAME_UNDIGESTED_SIZE 149
+struct fscrypt_nokey_name {
+	u32 dirtree_hash[2];
+	u8 bytes[FSCRYPT_FNAME_UNDIGESTED_SIZE];
+	u8 sha256[SHA256_DIGEST_SIZE];
+};
 
-	if (de_name_len != fname->disk_name.len)
-		return false;
-	return !memcmp(de_name, fname->disk_name.name, fname->disk_name.len);
-}
+extern bool fscrypt_match_name(const struct fscrypt_name *fname,
+				      const u8 *de_name, u32 de_name_len);
 
 /* bio.c */
 extern void fscrypt_decrypt_bio(struct bio *);
@@ -448,7 +404,7 @@ static inline void fscrypt_fname_free_buffer(struct fscrypt_str *crypto_str)
 }
 
 static inline int fscrypt_fname_disk_to_usr(struct inode *inode,
-					    u32 hash, u32 minor_hash,
+					    u32 dirtree_hash, u32 minor_hash,
 					    const struct fscrypt_str *iname,
 					    struct fscrypt_str *oname)
 {
-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/8] vfs: Fold casefolding into vfs
  2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
                   ` (2 preceding siblings ...)
  2019-12-03  5:10 ` [PATCH 3/8] fscrypt: Change format of no-key token Daniel Rosenberg
@ 2019-12-03  5:10 ` Daniel Rosenberg
  2019-12-03  7:41   ` Gao Xiang
                     ` (2 more replies)
  2019-12-03  5:10 ` [PATCH 5/8] f2fs: Handle casefolding with Encryption Daniel Rosenberg
                   ` (3 subsequent siblings)
  7 siblings, 3 replies; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

Ext4 and F2fs are both using casefolding, and they, along with any other
filesystem that adds the feature, will be using identical dentry_ops.
Additionally, those dentry ops interfere with the dentry_ops required
for fscrypt once we add support for casefolding and encryption.
Moving this into the vfs removes code duplication as well as the
complication with encryption.

Currently this is pretty close to just moving the existing f2fs/ext4
code up a level into the vfs, although there is a lot of room for
improvement now.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
---
 fs/dcache.c        | 35 +++++++++++++++++++++++++++++++++++
 fs/namei.c         | 43 ++++++++++++++++++++++++++++++++++++++++---
 include/linux/fs.h | 12 ++++++++++++
 3 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/fs/dcache.c b/fs/dcache.c
index f7931b682a0d..575f3c2c3f77 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -32,6 +32,7 @@
 #include <linux/bit_spinlock.h>
 #include <linux/rculist_bl.h>
 #include <linux/list_lru.h>
+#include <linux/unicode.h>
 #include "internal.h"
 #include "mount.h"
 
@@ -228,6 +229,13 @@ static inline int dentry_string_cmp(const unsigned char *cs, const unsigned char
 
 #endif
 
+bool needs_casefold(const struct inode *dir)
+{
+	return IS_CASEFOLDED(dir) &&
+			(!IS_ENCRYPTED(dir) || fscrypt_has_encryption_key(dir));
+}
+EXPORT_SYMBOL(needs_casefold);
+
 static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *ct, unsigned tcount)
 {
 	/*
@@ -247,7 +255,19 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c
 	 * be no NUL in the ct/tcount data)
 	 */
 	const unsigned char *cs = READ_ONCE(dentry->d_name.name);
+#ifdef CONFIG_UNICODE
+	struct inode *parent = dentry->d_parent->d_inode;
 
+	if (unlikely(needs_casefold(parent))) {
+		const struct qstr n1 = QSTR_INIT(cs, tcount);
+		const struct qstr n2 = QSTR_INIT(ct, tcount);
+		int result = utf8_strncasecmp(dentry->d_sb->s_encoding,
+						&n1, &n2);
+
+		if (result >= 0 || sb_has_enc_strict_mode(dentry->d_sb))
+			return result;
+	}
+#endif
 	return dentry_string_cmp(cs, ct, tcount);
 }
 
@@ -2404,7 +2424,22 @@ struct dentry *d_hash_and_lookup(struct dentry *dir, struct qstr *name)
 	 * calculate the standard hash first, as the d_op->d_hash()
 	 * routine may choose to leave the hash value unchanged.
 	 */
+#ifdef CONFIG_UNICODE
+	unsigned char *hname = NULL;
+	int hlen = name->len;
+
+	if (IS_CASEFOLDED(dir->d_inode)) {
+		hname = kmalloc(PATH_MAX, GFP_ATOMIC);
+		if (!hname)
+			return ERR_PTR(-ENOMEM);
+		hlen = utf8_casefold(dir->d_sb->s_encoding,
+					name, hname, PATH_MAX);
+	}
+	name->hash = full_name_hash(dir, hname ?: name->name, hlen);
+	kfree(hname);
+#else
 	name->hash = full_name_hash(dir, name->name, name->len);
+#endif
 	if (dir->d_flags & DCACHE_OP_HASH) {
 		int err = dir->d_op->d_hash(dir, name);
 		if (unlikely(err < 0))
diff --git a/fs/namei.c b/fs/namei.c
index 2dda552bcf7a..b8d5cb0994ec 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -39,6 +39,7 @@
 #include <linux/bitops.h>
 #include <linux/init_task.h>
 #include <linux/uaccess.h>
+#include <linux/unicode.h>
 
 #include "internal.h"
 #include "mount.h"
@@ -2062,6 +2063,10 @@ static inline u64 hash_name(const void *salt, const char *name)
 static int link_path_walk(const char *name, struct nameidata *nd)
 {
 	int err;
+#ifdef CONFIG_UNICODE
+	char *hname = NULL;
+	int hlen = 0;
+#endif
 
 	if (IS_ERR(name))
 		return PTR_ERR(name);
@@ -2078,9 +2083,21 @@ static int link_path_walk(const char *name, struct nameidata *nd)
 		err = may_lookup(nd);
 		if (err)
 			return err;
-
+#ifdef CONFIG_UNICODE
+		if (needs_casefold(nd->path.dentry->d_inode)) {
+			struct qstr str = QSTR_INIT(name, PATH_MAX);
+
+			hname = kmalloc(PATH_MAX, GFP_ATOMIC);
+			if (!hname)
+				return -ENOMEM;
+			hlen = utf8_casefold(nd->path.dentry->d_sb->s_encoding,
+						&str, hname, PATH_MAX);
+		}
+		hash_len = hash_name(nd->path.dentry, hname ?: name);
+		kfree(hname);
+#else
 		hash_len = hash_name(nd->path.dentry, name);
-
+#endif
 		type = LAST_NORM;
 		if (name[0] == '.') switch (hashlen_len(hash_len)) {
 			case 2:
@@ -2452,9 +2469,29 @@ EXPORT_SYMBOL(vfs_path_lookup);
 static int lookup_one_len_common(const char *name, struct dentry *base,
 				 int len, struct qstr *this)
 {
+#ifdef CONFIG_UNICODE
+	char *hname = NULL;
+	int hlen = len;
+
+	if (needs_casefold(base->d_inode)) {
+		struct qstr str = QSTR_INIT(name, len);
+
+		hname = kmalloc(PATH_MAX, GFP_ATOMIC);
+		if (!hname)
+			return -ENOMEM;
+		hlen = utf8_casefold(base->d_sb->s_encoding,
+					&str, hname, PATH_MAX);
+	}
+	this->hash = full_name_hash(base, hname ?: name, hlen);
+	kvfree(hname);
+#else
+	this->hash = full_name_hash(base, name, len);
+#endif
 	this->name = name;
 	this->len = len;
-	this->hash = full_name_hash(base, name, len);
+#ifdef CONFIG_UNICODE
+	kfree(hname);
+#endif
 	if (!len)
 		return -EACCES;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c159a8bdee8b..38d1c20f3e6f 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1382,6 +1382,12 @@ extern int send_sigurg(struct fown_struct *fown);
 #define SB_ACTIVE	(1<<30)
 #define SB_NOUSER	(1<<31)
 
+/* These flags relate to encoding and casefolding */
+#define SB_ENC_STRICT_MODE_FL	(1 << 0)
+
+#define sb_has_enc_strict_mode(sb) \
+	(sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL)
+
 /*
  *	Umount options
  */
@@ -1449,6 +1455,10 @@ struct super_block {
 #endif
 #ifdef CONFIG_FS_VERITY
 	const struct fsverity_operations *s_vop;
+#endif
+#ifdef CONFIG_UNICODE
+	struct unicode_map *s_encoding;
+	__u16 s_encoding_flags;
 #endif
 	struct hlist_bl_head	s_roots;	/* alternate root dentries for NFS */
 	struct list_head	s_mounts;	/* list of mounts; _not_ for fs use */
@@ -2044,6 +2054,8 @@ static inline bool sb_rdonly(const struct super_block *sb) { return sb->s_flags
 #define IS_WHITEOUT(inode)	(S_ISCHR(inode->i_mode) && \
 				 (inode)->i_rdev == WHITEOUT_DEV)
 
+extern bool needs_casefold(const struct inode *dir);
+
 static inline bool HAS_UNMAPPED_ID(struct inode *inode)
 {
 	return !uid_valid(inode->i_uid) || !gid_valid(inode->i_gid);
-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 5/8] f2fs: Handle casefolding with Encryption
  2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
                   ` (3 preceding siblings ...)
  2019-12-03  5:10 ` [PATCH 4/8] vfs: Fold casefolding into vfs Daniel Rosenberg
@ 2019-12-03  5:10 ` Daniel Rosenberg
  2019-12-05  1:17   ` kbuild test robot
  2019-12-03  5:10 ` [PATCH 6/8] ext4: Use struct super_block's casefold data Daniel Rosenberg
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

This expands f2fs's casefolding support to include encrypted
directories. For encrypted directories, we use the siphash of the
casefolded name. This ensures there is no direct way to go from an
unencrypted name to the stored hash on disk without knowledge of the
encryption policy keys.

Additionally, we switch to using the vfs layer's casefolding support
instead of storing this information inside of f2fs's private data.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
---
 fs/f2fs/dir.c    | 115 +++++++++++++++++++----------------------------
 fs/f2fs/f2fs.h   |  14 +++---
 fs/f2fs/hash.c   |  25 +++++++----
 fs/f2fs/inline.c |   9 ++--
 fs/f2fs/super.c  |  17 +++----
 fs/f2fs/sysfs.c  |   8 ++--
 6 files changed, 81 insertions(+), 107 deletions(-)

diff --git a/fs/f2fs/dir.c b/fs/f2fs/dir.c
index c967cacf979e..1cc2995fee84 100644
--- a/fs/f2fs/dir.c
+++ b/fs/f2fs/dir.c
@@ -111,31 +111,50 @@ static struct f2fs_dir_entry *find_in_block(struct inode *dir,
  * Returns: 0 if the directory entry matches, more than 0 if it
  * doesn't match or less than zero on error.
  */
-int f2fs_ci_compare(const struct inode *parent, const struct qstr *name,
-				const struct qstr *entry, bool quick)
+int f2fs_ci_compare(struct inode *parent, const struct qstr *name,
+		    unsigned char *name2, size_t len, bool quick)
 {
 	const struct f2fs_sb_info *sbi = F2FS_SB(parent->i_sb);
-	const struct unicode_map *um = sbi->s_encoding;
+	const struct unicode_map *um = sbi->sb->s_encoding;
+	const struct fscrypt_str crypt_entry = FSTR_INIT(name2, len);
+	struct fscrypt_str decrypted_entry;
+	struct qstr decrypted;
+	struct qstr entry = QSTR_INIT(name2, len);
 	int ret;
 
+	decrypted_entry.name = NULL;
+
+	if (IS_ENCRYPTED(parent) && fscrypt_has_encryption_key(parent)) {
+		decrypted_entry.name = kmalloc(len, GFP_ATOMIC);
+		decrypted.name = decrypted_entry.name;
+		decrypted_entry.len = len;
+		decrypted.len = len;
+		if (!decrypted.name)
+			return -ENOMEM;
+		fscrypt_fname_disk_to_usr(parent, 0, 0, &crypt_entry,
+							&decrypted_entry);
+	}
+
 	if (quick)
-		ret = utf8_strncasecmp_folded(um, name, entry);
+		ret = utf8_strncasecmp_folded(um, name, decrypted_entry.name ?
+							&decrypted : &entry);
 	else
-		ret = utf8_strncasecmp(um, name, entry);
-
+		ret = utf8_strncasecmp(um, name, decrypted_entry.name ?
+							&decrypted : &entry);
 	if (ret < 0) {
 		/* Handle invalid character sequence as either an error
 		 * or as an opaque byte sequence.
 		 */
-		if (f2fs_has_strict_mode(sbi))
+		if (sb_has_enc_strict_mode(sbi->sb))
 			return -EINVAL;
 
-		if (name->len != entry->len)
+		if (name->len != len)
 			return 1;
 
-		return !!memcmp(name->name, entry->name, name->len);
+		ret = !!memcmp(name->name,
+				decrypted_entry.name ?: name2, name->len);
 	}
-
+	kfree(decrypted_entry.name);
 	return ret;
 }
 
@@ -154,7 +173,7 @@ static void f2fs_fname_setup_ci_filename(struct inode *dir,
 	if (!cf_name->name)
 		return;
 
-	cf_name->len = utf8_casefold(sbi->s_encoding,
+	cf_name->len = utf8_casefold(dir->i_sb->s_encoding,
 					iname, cf_name->name,
 					F2FS_NAME_LEN);
 	if ((int)cf_name->len <= 0) {
@@ -173,24 +192,25 @@ static inline bool f2fs_match_name(struct f2fs_dentry_ptr *d,
 {
 #ifdef CONFIG_UNICODE
 	struct inode *parent = d->inode;
-	struct f2fs_sb_info *sbi = F2FS_I_SB(parent);
-	struct qstr entry;
+	struct super_block *sb = parent->i_sb;
+	unsigned char *name;
+	int len;
 #endif
 
 	if (de->hash_code != namehash)
 		return false;
 
 #ifdef CONFIG_UNICODE
-	entry.name = d->filename[bit_pos];
-	entry.len = de->name_len;
+	name = d->filename[bit_pos];
+	len = de->name_len;
 
-	if (sbi->s_encoding && IS_CASEFOLDED(parent)) {
+	if (sb->s_encoding && needs_casefold(parent)) {
 		if (cf_str->name) {
 			struct qstr cf = {.name = cf_str->name,
 					  .len = cf_str->len};
-			return !f2fs_ci_compare(parent, &cf, &entry, true);
+			return !f2fs_ci_compare(parent, &cf, name, len, true);
 		}
-		return !f2fs_ci_compare(parent, fname->usr_fname, &entry,
+		return !f2fs_ci_compare(parent, fname->usr_fname, name, len,
 					false);
 	}
 #endif
@@ -357,8 +377,8 @@ struct f2fs_dir_entry *f2fs_find_entry(struct inode *dir,
 	int err;
 
 #ifdef CONFIG_UNICODE
-	if (f2fs_has_strict_mode(F2FS_I_SB(dir)) && IS_CASEFOLDED(dir) &&
-			utf8_validate(F2FS_I_SB(dir)->s_encoding, child)) {
+	if (sb_has_enc_strict_mode(dir->i_sb) && IS_CASEFOLDED(dir) &&
+			utf8_validate(dir->i_sb->s_encoding, child)) {
 		*res_page = ERR_PTR(-EINVAL);
 		return NULL;
 	}
@@ -602,13 +622,13 @@ void f2fs_update_dentry(nid_t ino, umode_t mode, struct f2fs_dentry_ptr *d,
 
 int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,
 				const struct qstr *orig_name,
+				f2fs_hash_t dentry_hash,
 				struct inode *inode, nid_t ino, umode_t mode)
 {
 	unsigned int bit_pos;
 	unsigned int level;
 	unsigned int current_depth;
 	unsigned long bidx, block;
-	f2fs_hash_t dentry_hash;
 	unsigned int nbucket, nblock;
 	struct page *dentry_page = NULL;
 	struct f2fs_dentry_block *dentry_blk = NULL;
@@ -618,7 +638,6 @@ int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,
 
 	level = 0;
 	slots = GET_DENTRY_SLOTS(new_name->len);
-	dentry_hash = f2fs_dentry_hash(dir, new_name, NULL);
 
 	current_depth = F2FS_I(dir)->i_current_depth;
 	if (F2FS_I(dir)->chash == dentry_hash) {
@@ -704,17 +723,19 @@ int f2fs_add_dentry(struct inode *dir, struct fscrypt_name *fname,
 				struct inode *inode, nid_t ino, umode_t mode)
 {
 	struct qstr new_name;
+	f2fs_hash_t dentry_hash;
 	int err = -EAGAIN;
 
 	new_name.name = fname_name(fname);
 	new_name.len = fname_len(fname);
 
 	if (f2fs_has_inline_dentry(dir))
-		err = f2fs_add_inline_entry(dir, &new_name, fname->usr_fname,
+		err = f2fs_add_inline_entry(dir, &new_name, fname,
 							inode, ino, mode);
+	dentry_hash = f2fs_dentry_hash(dir, &new_name, fname);
 	if (err == -EAGAIN)
 		err = f2fs_add_regular_entry(dir, &new_name, fname->usr_fname,
-							inode, ino, mode);
+						dentry_hash, inode, ino, mode);
 
 	f2fs_update_time(F2FS_I_SB(dir), REQ_TIME);
 	return err;
@@ -1064,49 +1085,3 @@ const struct file_operations f2fs_dir_operations = {
 #endif
 };
 
-#ifdef CONFIG_UNICODE
-static int f2fs_d_compare(const struct dentry *dentry, unsigned int len,
-			  const char *str, const struct qstr *name)
-{
-	struct qstr qstr = {.name = str, .len = len };
-
-	if (!IS_CASEFOLDED(dentry->d_parent->d_inode)) {
-		if (len != name->len)
-			return -1;
-		return memcmp(str, name, len);
-	}
-
-	return f2fs_ci_compare(dentry->d_parent->d_inode, name, &qstr, false);
-}
-
-static int f2fs_d_hash(const struct dentry *dentry, struct qstr *str)
-{
-	struct f2fs_sb_info *sbi = F2FS_SB(dentry->d_sb);
-	const struct unicode_map *um = sbi->s_encoding;
-	unsigned char *norm;
-	int len, ret = 0;
-
-	if (!IS_CASEFOLDED(dentry->d_inode))
-		return 0;
-
-	norm = f2fs_kmalloc(sbi, PATH_MAX, GFP_ATOMIC);
-	if (!norm)
-		return -ENOMEM;
-
-	len = utf8_casefold(um, str, norm, PATH_MAX);
-	if (len < 0) {
-		if (f2fs_has_strict_mode(sbi))
-			ret = -EINVAL;
-		goto out;
-	}
-	str->hash = full_name_hash(dentry, norm, len);
-out:
-	kvfree(norm);
-	return ret;
-}
-
-const struct dentry_operations f2fs_dentry_ops = {
-	.d_hash = f2fs_d_hash,
-	.d_compare = f2fs_d_compare,
-};
-#endif
diff --git a/fs/f2fs/f2fs.h b/fs/f2fs/f2fs.h
index 5a888a063c7f..a4864ed76dd8 100644
--- a/fs/f2fs/f2fs.h
+++ b/fs/f2fs/f2fs.h
@@ -1184,10 +1184,6 @@ struct f2fs_sb_info {
 	int valid_super_block;			/* valid super block no */
 	unsigned long s_flag;				/* flags for sbi */
 	struct mutex writepages;		/* mutex for writepages() */
-#ifdef CONFIG_UNICODE
-	struct unicode_map *s_encoding;
-	__u16 s_encoding_flags;
-#endif
 
 #ifdef CONFIG_BLK_DEV_ZONED
 	unsigned int blocks_per_blkz;		/* F2FS blocks per zone */
@@ -2969,9 +2965,9 @@ int f2fs_update_extension_list(struct f2fs_sb_info *sbi, const char *name,
 							bool hot, bool set);
 struct dentry *f2fs_get_parent(struct dentry *child);
 
-extern int f2fs_ci_compare(const struct inode *parent,
+extern int f2fs_ci_compare(struct inode *parent,
 			   const struct qstr *name,
-			   const struct qstr *entry,
+			   unsigned char *name2, size_t len,
 			   bool quick);
 
 /*
@@ -3005,7 +3001,7 @@ void f2fs_update_dentry(nid_t ino, umode_t mode, struct f2fs_dentry_ptr *d,
 			const struct qstr *name, f2fs_hash_t name_hash,
 			unsigned int bit_pos);
 int f2fs_add_regular_entry(struct inode *dir, const struct qstr *new_name,
-			const struct qstr *orig_name,
+			const struct qstr *orig_name, f2fs_hash_t dentry_hash,
 			struct inode *inode, nid_t ino, umode_t mode);
 int f2fs_add_dentry(struct inode *dir, struct fscrypt_name *fname,
 			struct inode *inode, nid_t ino, umode_t mode);
@@ -3038,7 +3034,7 @@ int f2fs_sanity_check_ckpt(struct f2fs_sb_info *sbi);
  * hash.c
  */
 f2fs_hash_t f2fs_dentry_hash(const struct inode *dir,
-		const struct qstr *name_info, struct fscrypt_name *fname);
+		const struct qstr *name_info, const struct fscrypt_name *fname);
 
 /*
  * node.c
@@ -3517,7 +3513,7 @@ struct f2fs_dir_entry *f2fs_find_in_inline_dir(struct inode *dir,
 int f2fs_make_empty_inline_dir(struct inode *inode, struct inode *parent,
 			struct page *ipage);
 int f2fs_add_inline_entry(struct inode *dir, const struct qstr *new_name,
-			const struct qstr *orig_name,
+			const struct fscrypt_name *fname,
 			struct inode *inode, nid_t ino, umode_t mode);
 void f2fs_delete_inline_entry(struct f2fs_dir_entry *dentry,
 				struct page *page, struct inode *dir,
diff --git a/fs/f2fs/hash.c b/fs/f2fs/hash.c
index 5bc4dcd8fc03..954d03dee450 100644
--- a/fs/f2fs/hash.c
+++ b/fs/f2fs/hash.c
@@ -68,8 +68,9 @@ static void str2hashbuf(const unsigned char *msg, size_t len,
 		*buf++ = pad;
 }
 
-static f2fs_hash_t __f2fs_dentry_hash(const struct qstr *name_info,
-				struct fscrypt_name *fname)
+static f2fs_hash_t __f2fs_dentry_hash(const struct inode *dir,
+				const struct qstr *name_info,
+				const struct fscrypt_name *fname)
 {
 	__u32 hash;
 	f2fs_hash_t f2fs_hash;
@@ -85,6 +86,11 @@ static f2fs_hash_t __f2fs_dentry_hash(const struct qstr *name_info,
 	if (is_dot_dotdot(name_info))
 		return 0;
 
+	if (IS_CASEFOLDED(dir) && IS_ENCRYPTED(dir)) {
+		f2fs_hash = fscrypt_fname_siphash(dir, name_info);
+		return f2fs_hash;
+	}
+
 	/* Initialize the default seed for the hash checksum functions */
 	buf[0] = 0x67452301;
 	buf[1] = 0xefcdab89;
@@ -106,35 +112,38 @@ static f2fs_hash_t __f2fs_dentry_hash(const struct qstr *name_info,
 }
 
 f2fs_hash_t f2fs_dentry_hash(const struct inode *dir,
-		const struct qstr *name_info, struct fscrypt_name *fname)
+		const struct qstr *name_info, const struct fscrypt_name *fname)
 {
 #ifdef CONFIG_UNICODE
 	struct f2fs_sb_info *sbi = F2FS_SB(dir->i_sb);
-	const struct unicode_map *um = sbi->s_encoding;
+	const struct unicode_map *um = sbi->sb->s_encoding;
 	int r, dlen;
 	unsigned char *buff;
 	struct qstr folded;
+	const struct qstr *name = fname ? fname->usr_fname : name_info;
 
 	if (!name_info->len || !IS_CASEFOLDED(dir))
 		goto opaque_seq;
 
+	if (IS_ENCRYPTED(dir) && !fscrypt_has_encryption_key(dir))
+		goto opaque_seq;
+
 	buff = f2fs_kzalloc(sbi, sizeof(char) * PATH_MAX, GFP_KERNEL);
 	if (!buff)
 		return -ENOMEM;
-
-	dlen = utf8_casefold(um, name_info, buff, PATH_MAX);
+	dlen = utf8_casefold(um, name, buff, PATH_MAX);
 	if (dlen < 0) {
 		kvfree(buff);
 		goto opaque_seq;
 	}
 	folded.name = buff;
 	folded.len = dlen;
-	r = __f2fs_dentry_hash(&folded, fname);
+	r = __f2fs_dentry_hash(dir, &folded, fname);
 
 	kvfree(buff);
 	return r;
 
 opaque_seq:
 #endif
-	return __f2fs_dentry_hash(name_info, fname);
+	return __f2fs_dentry_hash(dir, name_info, fname);
 }
diff --git a/fs/f2fs/inline.c b/fs/f2fs/inline.c
index 896db0416f0e..3c3772094153 100644
--- a/fs/f2fs/inline.c
+++ b/fs/f2fs/inline.c
@@ -465,8 +465,8 @@ static int f2fs_add_inline_entries(struct inode *dir, void *inline_dentry)
 		ino = le32_to_cpu(de->ino);
 		fake_mode = f2fs_get_de_type(de) << S_SHIFT;
 
-		err = f2fs_add_regular_entry(dir, &new_name, NULL, NULL,
-							ino, fake_mode);
+		err = f2fs_add_regular_entry(dir, &new_name, NULL,
+					de->hash_code, NULL, ino, fake_mode);
 		if (err)
 			goto punch_dentry_pages;
 
@@ -540,7 +540,7 @@ static int f2fs_convert_inline_dir(struct inode *dir, struct page *ipage,
 }
 
 int f2fs_add_inline_entry(struct inode *dir, const struct qstr *new_name,
-				const struct qstr *orig_name,
+				const struct fscrypt_name *fname,
 				struct inode *inode, nid_t ino, umode_t mode)
 {
 	struct f2fs_sb_info *sbi = F2FS_I_SB(dir);
@@ -551,6 +551,7 @@ int f2fs_add_inline_entry(struct inode *dir, const struct qstr *new_name,
 	struct f2fs_dentry_ptr d;
 	int slots = GET_DENTRY_SLOTS(new_name->len);
 	struct page *page = NULL;
+	const struct qstr *orig_name = fname->usr_fname;
 	int err = 0;
 
 	ipage = f2fs_get_node_page(sbi, dir->i_ino);
@@ -581,7 +582,7 @@ int f2fs_add_inline_entry(struct inode *dir, const struct qstr *new_name,
 
 	f2fs_wait_on_page_writeback(ipage, NODE, true, true);
 
-	name_hash = f2fs_dentry_hash(dir, new_name, NULL);
+	name_hash = f2fs_dentry_hash(dir, new_name, fname);
 	f2fs_update_dentry(ino, mode, &d, new_name, name_hash, bit_pos);
 
 	set_page_dirty(ipage);
diff --git a/fs/f2fs/super.c b/fs/f2fs/super.c
index 5111e1ffe58a..5e4e76332c4c 100644
--- a/fs/f2fs/super.c
+++ b/fs/f2fs/super.c
@@ -1144,7 +1144,7 @@ static void f2fs_put_super(struct super_block *sb)
 	for (i = 0; i < NR_PAGE_TYPE; i++)
 		kvfree(sbi->write_io[i]);
 #ifdef CONFIG_UNICODE
-	utf8_unload(sbi->s_encoding);
+	utf8_unload(sbi->sb->s_encoding);
 #endif
 	kvfree(sbi);
 }
@@ -3136,17 +3136,11 @@ static int f2fs_scan_devices(struct f2fs_sb_info *sbi)
 static int f2fs_setup_casefold(struct f2fs_sb_info *sbi)
 {
 #ifdef CONFIG_UNICODE
-	if (f2fs_sb_has_casefold(sbi) && !sbi->s_encoding) {
+	if (f2fs_sb_has_casefold(sbi) && !sbi->sb->s_encoding) {
 		const struct f2fs_sb_encodings *encoding_info;
 		struct unicode_map *encoding;
 		__u16 encoding_flags;
 
-		if (f2fs_sb_has_encrypt(sbi)) {
-			f2fs_err(sbi,
-				"Can't mount with encoding and encryption");
-			return -EINVAL;
-		}
-
 		if (f2fs_sb_read_encoding(sbi->raw_super, &encoding_info,
 					  &encoding_flags)) {
 			f2fs_err(sbi,
@@ -3167,9 +3161,8 @@ static int f2fs_setup_casefold(struct f2fs_sb_info *sbi)
 			 "%s-%s with flags 0x%hx", encoding_info->name,
 			 encoding_info->version?:"\b", encoding_flags);
 
-		sbi->s_encoding = encoding;
-		sbi->s_encoding_flags = encoding_flags;
-		sbi->sb->s_d_op = &f2fs_dentry_ops;
+		sbi->sb->s_encoding = encoding;
+		sbi->sb->s_encoding_flags = encoding_flags;
 	}
 #else
 	if (f2fs_sb_has_casefold(sbi)) {
@@ -3637,7 +3630,7 @@ static int f2fs_fill_super(struct super_block *sb, void *data, int silent)
 		kvfree(sbi->write_io[i]);
 
 #ifdef CONFIG_UNICODE
-	utf8_unload(sbi->s_encoding);
+	utf8_unload(sbi->sb->s_encoding);
 #endif
 free_options:
 #ifdef CONFIG_QUOTA
diff --git a/fs/f2fs/sysfs.c b/fs/f2fs/sysfs.c
index 70945ceb9c0c..7fd37c8c9733 100644
--- a/fs/f2fs/sysfs.c
+++ b/fs/f2fs/sysfs.c
@@ -88,10 +88,10 @@ static ssize_t encoding_show(struct f2fs_attr *a,
 #ifdef CONFIG_UNICODE
 	if (f2fs_sb_has_casefold(sbi))
 		return snprintf(buf, PAGE_SIZE, "%s (%d.%d.%d)\n",
-			sbi->s_encoding->charset,
-			(sbi->s_encoding->version >> 16) & 0xff,
-			(sbi->s_encoding->version >> 8) & 0xff,
-			sbi->s_encoding->version & 0xff);
+			sbi->sb->s_encoding->charset,
+			(sbi->sb->s_encoding->version >> 16) & 0xff,
+			(sbi->sb->s_encoding->version >> 8) & 0xff,
+			sbi->sb->s_encoding->version & 0xff);
 #endif
 	return snprintf(buf, PAGE_SIZE, "(none)");
 }
-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 6/8] ext4: Use struct super_block's casefold data
  2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
                   ` (4 preceding siblings ...)
  2019-12-03  5:10 ` [PATCH 5/8] f2fs: Handle casefolding with Encryption Daniel Rosenberg
@ 2019-12-03  5:10 ` Daniel Rosenberg
  2019-12-03 19:44   ` Gabriel Krisman Bertazi
  2019-12-03  5:10 ` [PATCH 7/8] ext4: Hande casefolding with encryption Daniel Rosenberg
  2019-12-03  5:10 ` [PATCH 8/8] ext4: Optimize match for casefolded encrypted dirs Daniel Rosenberg
  7 siblings, 1 reply; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

Switch over to using the struct entries added to the VFS, and
remove the redundant dentry operations.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
---
 fs/ext4/dir.c   | 47 -----------------------------------------------
 fs/ext4/ext4.h  |  4 ----
 fs/ext4/hash.c  |  2 +-
 fs/ext4/namei.c | 16 ++++++++--------
 fs/ext4/super.c | 15 +++++----------
 5 files changed, 14 insertions(+), 70 deletions(-)

diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 9fdd2b269d61..c9c8370e5b4b 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -663,50 +663,3 @@ const struct file_operations ext4_dir_operations = {
 	.release	= ext4_release_dir,
 };
 
-#ifdef CONFIG_UNICODE
-static int ext4_d_compare(const struct dentry *dentry, unsigned int len,
-			  const char *str, const struct qstr *name)
-{
-	struct qstr qstr = {.name = str, .len = len };
-	struct inode *inode = dentry->d_parent->d_inode;
-
-	if (!IS_CASEFOLDED(inode) || !EXT4_SB(inode->i_sb)->s_encoding) {
-		if (len != name->len)
-			return -1;
-		return memcmp(str, name->name, len);
-	}
-
-	return ext4_ci_compare(inode, name, &qstr, false);
-}
-
-static int ext4_d_hash(const struct dentry *dentry, struct qstr *str)
-{
-	const struct ext4_sb_info *sbi = EXT4_SB(dentry->d_sb);
-	const struct unicode_map *um = sbi->s_encoding;
-	unsigned char *norm;
-	int len, ret = 0;
-
-	if (!IS_CASEFOLDED(dentry->d_inode) || !um)
-		return 0;
-
-	norm = kmalloc(PATH_MAX, GFP_ATOMIC);
-	if (!norm)
-		return -ENOMEM;
-
-	len = utf8_casefold(um, str, norm, PATH_MAX);
-	if (len < 0) {
-		if (ext4_has_strict_mode(sbi))
-			ret = -EINVAL;
-		goto out;
-	}
-	str->hash = full_name_hash(dentry, norm, len);
-out:
-	kfree(norm);
-	return ret;
-}
-
-const struct dentry_operations ext4_dentry_ops = {
-	.d_hash = ext4_d_hash,
-	.d_compare = ext4_d_compare,
-};
-#endif
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f8578caba40d..3162ef2e53d4 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1430,10 +1430,6 @@ struct ext4_sb_info {
 	struct kobject s_kobj;
 	struct completion s_kobj_unregister;
 	struct super_block *s_sb;
-#ifdef CONFIG_UNICODE
-	struct unicode_map *s_encoding;
-	__u16 s_encoding_flags;
-#endif
 
 	/* Journaling */
 	struct journal_s *s_journal;
diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index 3e133793a5a3..143b0073b3f4 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -275,7 +275,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		   struct dx_hash_info *hinfo)
 {
 #ifdef CONFIG_UNICODE
-	const struct unicode_map *um = EXT4_SB(dir->i_sb)->s_encoding;
+	const struct unicode_map *um = dir->i_sb->s_encoding;
 	int r, dlen;
 	unsigned char *buff;
 	struct qstr qstr = {.name = name, .len = len };
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index a856997d87b5..4ee5cf007de7 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1282,8 +1282,8 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
 int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
 		    const struct qstr *entry, bool quick)
 {
-	const struct ext4_sb_info *sbi = EXT4_SB(parent->i_sb);
-	const struct unicode_map *um = sbi->s_encoding;
+	const struct super_block *sb = parent->i_sb;
+	const struct unicode_map *um = sb->s_encoding;
 	int ret;
 
 	if (quick)
@@ -1295,7 +1295,7 @@ int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
 		/* Handle invalid character sequence as either an error
 		 * or as an opaque byte sequence.
 		 */
-		if (ext4_has_strict_mode(sbi))
+		if (sb_has_enc_strict_mode(sb))
 			return -EINVAL;
 
 		if (name->len != entry->len)
@@ -1312,7 +1312,7 @@ void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
 {
 	int len;
 
-	if (!IS_CASEFOLDED(dir) || !EXT4_SB(dir->i_sb)->s_encoding) {
+	if (!IS_CASEFOLDED(dir) || !dir->i_sb->s_encoding) {
 		cf_name->name = NULL;
 		return;
 	}
@@ -1321,7 +1321,7 @@ void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
 	if (!cf_name->name)
 		return;
 
-	len = utf8_casefold(EXT4_SB(dir->i_sb)->s_encoding,
+	len = utf8_casefold(dir->i_sb->s_encoding,
 			    iname, cf_name->name,
 			    EXT4_NAME_LEN);
 	if (len <= 0) {
@@ -1358,7 +1358,7 @@ static inline bool ext4_match(const struct inode *parent,
 #endif
 
 #ifdef CONFIG_UNICODE
-	if (EXT4_SB(parent->i_sb)->s_encoding && IS_CASEFOLDED(parent)) {
+	if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent)) {
 		if (fname->cf_name.name) {
 			struct qstr cf = {.name = fname->cf_name.name,
 					  .len = fname->cf_name.len};
@@ -2182,8 +2182,8 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
 		return -EINVAL;
 
 #ifdef CONFIG_UNICODE
-	if (ext4_has_strict_mode(sbi) && IS_CASEFOLDED(dir) &&
-	    sbi->s_encoding && utf8_validate(sbi->s_encoding, &dentry->d_name))
+	if (sb_has_enc_strict_mode(sb) && IS_CASEFOLDED(dir) &&
+	    sb->s_encoding && utf8_validate(sb->s_encoding, &dentry->d_name))
 		return -EINVAL;
 #endif
 
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 1d82b56d9b11..074e61b15181 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1056,7 +1056,7 @@ static void ext4_put_super(struct super_block *sb)
 	kfree(sbi->s_blockgroup_lock);
 	fs_put_dax(sbi->s_daxdev);
 #ifdef CONFIG_UNICODE
-	utf8_unload(sbi->s_encoding);
+	utf8_unload(sb->s_encoding);
 #endif
 	kfree(sbi);
 }
@@ -3815,7 +3815,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		goto failed_mount;
 
 #ifdef CONFIG_UNICODE
-	if (ext4_has_feature_casefold(sb) && !sbi->s_encoding) {
+	if (ext4_has_feature_casefold(sb) && !sb->s_encoding) {
 		const struct ext4_sb_encodings *encoding_info;
 		struct unicode_map *encoding;
 		__u16 encoding_flags;
@@ -3846,8 +3846,8 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 			 "%s-%s with flags 0x%hx", encoding_info->name,
 			 encoding_info->version?:"\b", encoding_flags);
 
-		sbi->s_encoding = encoding;
-		sbi->s_encoding_flags = encoding_flags;
+		sb->s_encoding = encoding;
+		sb->s_encoding_flags = encoding_flags;
 	}
 #endif
 
@@ -4498,11 +4498,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		goto failed_mount4;
 	}
 
-#ifdef CONFIG_UNICODE
-	if (sbi->s_encoding)
-		sb->s_d_op = &ext4_dentry_ops;
-#endif
-
 	sb->s_root = d_make_root(root);
 	if (!sb->s_root) {
 		ext4_msg(sb, KERN_ERR, "get root dentry failed");
@@ -4687,7 +4682,7 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		crypto_free_shash(sbi->s_chksum_driver);
 
 #ifdef CONFIG_UNICODE
-	utf8_unload(sbi->s_encoding);
+	utf8_unload(sb->s_encoding);
 #endif
 
 #ifdef CONFIG_QUOTA
-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 7/8] ext4: Hande casefolding with encryption
  2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
                   ` (5 preceding siblings ...)
  2019-12-03  5:10 ` [PATCH 6/8] ext4: Use struct super_block's casefold data Daniel Rosenberg
@ 2019-12-03  5:10 ` Daniel Rosenberg
  2019-12-03  5:10 ` [PATCH 8/8] ext4: Optimize match for casefolded encrypted dirs Daniel Rosenberg
  7 siblings, 0 replies; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

This adds support for encryption with casefolding.

Since the name on disk is case preserving, and also encrypted, we can no
longer just recompute the hash on the fly. Additionally, to avoid
leaking extra information from the hash of the unencrypted name, we use
siphash via an fscrypt v2 policy.

The hash is stored at the end of the directory entry for all entries
inside of an encrypted and casefolded directory apart from those that
deal with '.' and '..'. This way, the change is backwards compatible
with existing ext4 filesystems.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
---
 Documentation/filesystems/ext4/directory.rst |  27 ++
 fs/ext4/dir.c                                |  25 +-
 fs/ext4/ext4.h                               |  66 ++++-
 fs/ext4/hash.c                               |  24 +-
 fs/ext4/ialloc.c                             |   5 +-
 fs/ext4/inline.c                             |  41 +--
 fs/ext4/namei.c                              | 291 +++++++++++++------
 fs/ext4/super.c                              |   6 -
 8 files changed, 342 insertions(+), 143 deletions(-)

diff --git a/Documentation/filesystems/ext4/directory.rst b/Documentation/filesystems/ext4/directory.rst
index 073940cc64ed..55f618b37144 100644
--- a/Documentation/filesystems/ext4/directory.rst
+++ b/Documentation/filesystems/ext4/directory.rst
@@ -121,6 +121,31 @@ The directory file type is one of the following values:
    * - 0x7
      - Symbolic link.
 
+To support directories that are both encrypted and casefolded directories, we
+must also include hash information in the directory entry. We append
+``ext4_extended_dir_entry_2`` to ``ext4_dir_entry_2`` except for the entries
+for dot and dotdot, which are kept the same. The structure follows immediately
+after ``name`` and is included in the size listed by ``rec_len`` If a directory
+entry uses this extension, it may be up to 271 bytes.
+
+.. list-table::
+   :widths: 8 8 24 40
+   :header-rows: 1
+
+   * - Offset
+     - Size
+     - Name
+     - Description
+   * - 0x0
+     - \_\_le32
+     - hash
+     - The hash of the directory name
+   * - 0x4
+     - \_\_le32
+     - minor\_hash
+     - The minor hash of the directory name
+
+
 In order to add checksums to these classic directory blocks, a phony
 ``struct ext4_dir_entry`` is placed at the end of each leaf block to
 hold the checksum. The directory entry is 12 bytes long. The inode
@@ -322,6 +347,8 @@ The directory hash is one of the following values:
      - Half MD4, unsigned.
    * - 0x5
      - Tea, unsigned.
+   * - 0x6
+     - Siphash.
 
 Interior nodes of an htree are recorded as ``struct dx_node``, which is
 also the full length of a data block:
diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index c9c8370e5b4b..4f0e03dc594e 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -26,10 +26,11 @@
 #include <linux/buffer_head.h>
 #include <linux/slab.h>
 #include <linux/iversion.h>
-#include <linux/unicode.h>
 #include "ext4.h"
 #include "xattr.h"
 
+#define DOTDOT_OFFSET 12
+
 static int ext4_dx_readdir(struct file *, struct dir_context *);
 
 /**
@@ -67,17 +68,20 @@ int __ext4_check_dir_entry(const char *function, unsigned int line,
 			   struct inode *dir, struct file *filp,
 			   struct ext4_dir_entry_2 *de,
 			   struct buffer_head *bh, char *buf, int size,
+			   ext4_lblk_t lblk,
 			   unsigned int offset)
 {
 	const char *error_msg = NULL;
 	const int rlen = ext4_rec_len_from_disk(de->rec_len,
 						dir->i_sb->s_blocksize);
+	bool fake = (lblk == 0) && (offset <= DOTDOT_OFFSET);
 
-	if (unlikely(rlen < EXT4_DIR_REC_LEN(1)))
+	if (unlikely(rlen < ext4_dir_rec_len(1, fake ? NULL : dir)))
 		error_msg = "rec_len is smaller than minimal";
 	else if (unlikely(rlen % 4 != 0))
 		error_msg = "rec_len % 4 != 0";
-	else if (unlikely(rlen < EXT4_DIR_REC_LEN(de->name_len)))
+	else if (unlikely(rlen < ext4_dir_rec_len(de->name_len,
+							fake ? NULL : dir)))
 		error_msg = "rec_len is too small for name_len";
 	else if (unlikely(((char *) de - buf) + rlen > size))
 		error_msg = "directory entry overrun";
@@ -90,15 +94,15 @@ int __ext4_check_dir_entry(const char *function, unsigned int line,
 	if (filp)
 		ext4_error_file(filp, function, line, bh->b_blocknr,
 				"bad entry in directory: %s - offset=%u, "
-				"inode=%u, rec_len=%d, name_len=%d, size=%d",
+				"inode=%u, rec_len=%d, lblk=%d, size=%d",
 				error_msg, offset, le32_to_cpu(de->inode),
-				rlen, de->name_len, size);
+				rlen, lblk, size);
 	else
 		ext4_error_inode(dir, function, line, bh->b_blocknr,
 				"bad entry in directory: %s - offset=%u, "
-				"inode=%u, rec_len=%d, name_len=%d, size=%d",
+				"inode=%u, rec_len=%d, lblk=%d, size=%d",
 				 error_msg, offset, le32_to_cpu(de->inode),
-				 rlen, de->name_len, size);
+				 rlen, lblk, size);
 
 	return 1;
 }
@@ -220,7 +224,8 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
 				 * failure will be detected in the
 				 * dirent test below. */
 				if (ext4_rec_len_from_disk(de->rec_len,
-					sb->s_blocksize) < EXT4_DIR_REC_LEN(1))
+					sb->s_blocksize) < ext4_dir_rec_len(1,
+									inode))
 					break;
 				i += ext4_rec_len_from_disk(de->rec_len,
 							    sb->s_blocksize);
@@ -236,7 +241,7 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
 			de = (struct ext4_dir_entry_2 *) (bh->b_data + offset);
 			if (ext4_check_dir_entry(inode, file, de, bh,
 						 bh->b_data, bh->b_size,
-						 offset)) {
+						 map.m_lblk, offset)) {
 				/*
 				 * On error, skip to the next block
 				 */
@@ -638,7 +643,7 @@ int ext4_check_all_de(struct inode *dir, struct buffer_head *bh, void *buf,
 	top = buf + buf_size;
 	while ((char *) de < top) {
 		if (ext4_check_dir_entry(dir, NULL, de, bh,
-					 buf, buf_size, offset))
+					 buf, buf_size, 0, offset))
 			return -EFSCORRUPTED;
 		rlen = ext4_rec_len_from_disk(de->rec_len, buf_size);
 		de = (struct ext4_dir_entry_2 *)((char *)de + rlen);
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 3162ef2e53d4..f06bab489d37 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1955,6 +1955,19 @@ struct ext4_dir_entry {
 	char	name[EXT4_NAME_LEN];	/* File name */
 };
 
+
+/*
+ * Extended entry for ext4_dir_entry_2, since we can't easily store values after
+ * an arbitrary sized field, and would prefer not to break the format. For
+ * entries that are both encrypted and casefolded, we need to include the hash
+ * in the entry.
+ */
+
+struct ext4_extended_dir_entry_2 {
+	__le32 hash;
+	__le32 minor_hash;
+};
+
 /*
  * The new version of the directory entry.  Since EXT4 structures are
  * stored in intel byte order, and the name_len field could never be
@@ -1967,8 +1980,24 @@ struct ext4_dir_entry_2 {
 	__u8	name_len;		/* Name length */
 	__u8	file_type;
 	char	name[EXT4_NAME_LEN];	/* File name */
+	char	padding[sizeof(struct ext4_extended_dir_entry_2)];
 };
 
+/*
+ * Access the extended section of ext4_dir_entry_2
+ */
+#define EXT4_EXTENDED_DIRENT(entry) \
+	((struct ext4_extended_dir_entry_2 *) \
+		(((void *)(entry)) + 8 + (entry)->name_len))
+#define EXT4_DIRENT_HASH(entry) le32_to_cpu(EXT4_EXTENDED_DIRENT(de)->hash)
+#define EXT4_DIRENT_MINOR_HASH(entry) \
+		le32_to_cpu(EXT4_EXTENDED_DIRENT(de)->minor_hash)
+
+static inline bool ext4_hash_in_dirent(const struct inode *inode)
+{
+	return IS_CASEFOLDED(inode) && IS_ENCRYPTED(inode);
+}
+
 /*
  * This is a bogus directory entry at the end of each leaf block that
  * records checksums.
@@ -2010,10 +2039,25 @@ struct ext4_dir_entry_tail {
  */
 #define EXT4_DIR_PAD			4
 #define EXT4_DIR_ROUND			(EXT4_DIR_PAD - 1)
-#define EXT4_DIR_REC_LEN(name_len)	(((name_len) + 8 + EXT4_DIR_ROUND) & \
-					 ~EXT4_DIR_ROUND)
 #define EXT4_MAX_REC_LEN		((1<<16)-1)
 
+/*
+ * The rec_len is dependent on the type of directory. Directories that are
+ * casefolded and encrypted need to store the hash as well, so we add room for
+ * ext4_extended_dir_entry_2. For all entries related to '.' or '..' you should
+ * pass NULL for dir, as those entries do not use the extra fields.
+ */
+
+static inline unsigned int ext4_dir_rec_len(__u8 name_len,
+						const struct inode *dir)
+{
+	int rec_len = (name_len + 8 + EXT4_DIR_ROUND);
+
+	if (dir && ext4_hash_in_dirent(dir))
+		rec_len += sizeof(struct ext4_extended_dir_entry_2);
+	return (rec_len & ~EXT4_DIR_ROUND);
+}
+
 /*
  * If we ever get support for fs block sizes > page_size, we'll need
  * to remove the #if statements in the next two functions...
@@ -2070,6 +2114,7 @@ static inline __le16 ext4_rec_len_to_disk(unsigned len, unsigned blocksize)
 #define DX_HASH_LEGACY_UNSIGNED		3
 #define DX_HASH_HALF_MD4_UNSIGNED	4
 #define DX_HASH_TEA_UNSIGNED		5
+#define DX_HASH_SIPHASH			6
 
 static inline u32 ext4_chksum(struct ext4_sb_info *sbi, u32 crc,
 			      const void *address, unsigned int length)
@@ -2124,6 +2169,7 @@ struct ext4_filename {
 };
 
 #define fname_name(p) ((p)->disk_name.name)
+#define fname_usr_name(p) ((p)->usr_fname->name)
 #define fname_len(p)  ((p)->disk_name.len)
 
 /*
@@ -2458,21 +2504,22 @@ extern int __ext4_check_dir_entry(const char *, unsigned int, struct inode *,
 				  struct file *,
 				  struct ext4_dir_entry_2 *,
 				  struct buffer_head *, char *, int,
-				  unsigned int);
-#define ext4_check_dir_entry(dir, filp, de, bh, buf, size, offset)	\
+				  ext4_lblk_t, unsigned int);
+#define ext4_check_dir_entry(dir, filp, de, bh, buf, size, lblk, offset) \
 	unlikely(__ext4_check_dir_entry(__func__, __LINE__, (dir), (filp), \
-					(de), (bh), (buf), (size), (offset)))
+				(de), (bh), (buf), (size), (lblk), (offset)))
 extern int ext4_htree_store_dirent(struct file *dir_file, __u32 hash,
 				__u32 minor_hash,
 				struct ext4_dir_entry_2 *dirent,
 				struct fscrypt_str *ent_name);
 extern void ext4_htree_free_dir_info(struct dir_private_info *p);
 extern int ext4_find_dest_de(struct inode *dir, struct inode *inode,
+			     ext4_lblk_t lblk,
 			     struct buffer_head *bh,
 			     void *buf, int buf_size,
 			     struct ext4_filename *fname,
 			     struct ext4_dir_entry_2 **dest_de);
-void ext4_insert_dentry(struct inode *inode,
+void ext4_insert_dentry(struct inode *dir, struct inode *inode,
 			struct ext4_dir_entry_2 *de,
 			int buf_size,
 			struct ext4_filename *fname);
@@ -2650,11 +2697,12 @@ extern int ext4_search_dir(struct buffer_head *bh,
 			   int buf_size,
 			   struct inode *dir,
 			   struct ext4_filename *fname,
-			   unsigned int offset,
+			   ext4_lblk_t lblk, unsigned int offset,
 			   struct ext4_dir_entry_2 **res_dir);
 extern int ext4_generic_delete_entry(handle_t *handle,
 				     struct inode *dir,
 				     struct ext4_dir_entry_2 *de_del,
+				     ext4_lblk_t lblk,
 				     struct buffer_head *bh,
 				     void *entry_buf,
 				     int buf_size,
@@ -3188,9 +3236,9 @@ extern void ext4_initialize_dirent_tail(struct buffer_head *bh,
 					unsigned int blocksize);
 extern int ext4_handle_dirty_dirblock(handle_t *handle, struct inode *inode,
 				      struct buffer_head *bh);
-extern int ext4_ci_compare(const struct inode *parent,
+extern int ext4_ci_compare(struct inode *parent,
 			   const struct qstr *fname,
-			   const struct qstr *entry, bool quick);
+			   unsigned char *name2, size_t len, bool quick);
 
 #define S_SHIFT 12
 static const unsigned char ext4_type_by_mode[(S_IFMT >> S_SHIFT) + 1] = {
diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index 143b0073b3f4..035b57b93673 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -197,7 +197,7 @@ static void str2hashbuf_unsigned(const char *msg, int len, __u32 *buf, int num)
  * represented, and whether or not the returned hash is 32 bits or 64
  * bits.  32 bit hashes will return 0 for the minor hash.
  */
-static int __ext4fs_dirhash(const char *name, int len,
+static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 			    struct dx_hash_info *hinfo)
 {
 	__u32	hash;
@@ -259,6 +259,22 @@ static int __ext4fs_dirhash(const char *name, int len,
 		hash = buf[0];
 		minor_hash = buf[1];
 		break;
+	case DX_HASH_SIPHASH:
+	{
+		struct qstr qname = QSTR_INIT(name, len);
+		__u64	combined_hash;
+
+		if (fscrypt_has_encryption_key(dir)) {
+			combined_hash = fscrypt_fname_siphash(dir, &qname);
+		} else {
+			ext4_warning_inode(dir, "Siphash requires key");
+			return -1;
+		}
+
+		hash = (__u32)(combined_hash >> 32);
+		minor_hash = (__u32)combined_hash;
+		break;
+	}
 	default:
 		hinfo->hash = 0;
 		return -1;
@@ -280,7 +296,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 	unsigned char *buff;
 	struct qstr qstr = {.name = name, .len = len };
 
-	if (len && IS_CASEFOLDED(dir) && um) {
+	if (len && needs_casefold(dir) && um) {
 		buff = kzalloc(sizeof(char) * PATH_MAX, GFP_KERNEL);
 		if (!buff)
 			return -ENOMEM;
@@ -291,12 +307,12 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 			goto opaque_seq;
 		}
 
-		r = __ext4fs_dirhash(buff, dlen, hinfo);
+		r = __ext4fs_dirhash(dir, buff, dlen, hinfo);
 
 		kfree(buff);
 		return r;
 	}
 opaque_seq:
 #endif
-	return __ext4fs_dirhash(name, len, hinfo);
+	return __ext4fs_dirhash(dir, name, len, hinfo);
 }
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index dc333e8e51e8..4142282a049a 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -448,7 +448,10 @@ static int find_group_orlov(struct super_block *sb, struct inode *parent,
 		int ret = -1;
 
 		if (qstr) {
-			hinfo.hash_version = DX_HASH_HALF_MD4;
+			if (ext4_hash_in_dirent(parent))
+				hinfo.hash_version = DX_HASH_SIPHASH;
+			else
+				hinfo.hash_version = DX_HASH_HALF_MD4;
 			hinfo.seed = sbi->s_hash_seed;
 			ext4fs_dirhash(parent, qstr->name, qstr->len, &hinfo);
 			grp = hinfo.hash;
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 2fec62d764fa..4dd04c06bdab 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -996,7 +996,7 @@ void ext4_show_inline_dir(struct inode *dir, struct buffer_head *bh,
 			     offset, de_len, de->name_len, de->name,
 			     de->name_len, le32_to_cpu(de->inode));
 		if (ext4_check_dir_entry(dir, NULL, de, bh,
-					 inline_start, inline_size, offset))
+					 inline_start, inline_size, 0, offset))
 			BUG();
 
 		offset += de_len;
@@ -1022,7 +1022,7 @@ static int ext4_add_dirent_to_inline(handle_t *handle,
 	int		err;
 	struct ext4_dir_entry_2 *de;
 
-	err = ext4_find_dest_de(dir, inode, iloc->bh, inline_start,
+	err = ext4_find_dest_de(dir, inode, 0, iloc->bh, inline_start,
 				inline_size, fname, &de);
 	if (err)
 		return err;
@@ -1031,7 +1031,7 @@ static int ext4_add_dirent_to_inline(handle_t *handle,
 	err = ext4_journal_get_write_access(handle, iloc->bh);
 	if (err)
 		return err;
-	ext4_insert_dentry(inode, de, inline_size, fname);
+	ext4_insert_dentry(dir, inode, de, inline_size, fname);
 
 	ext4_show_inline_dir(dir, iloc->bh, inline_start, inline_size);
 
@@ -1100,7 +1100,7 @@ static int ext4_update_inline_dir(handle_t *handle, struct inode *dir,
 	int old_size = EXT4_I(dir)->i_inline_size - EXT4_MIN_INLINE_DATA_SIZE;
 	int new_size = get_max_inline_xattr_value_size(dir, iloc);
 
-	if (new_size - old_size <= EXT4_DIR_REC_LEN(1))
+	if (new_size - old_size <= ext4_dir_rec_len(1, NULL))
 		return -ENOSPC;
 
 	ret = ext4_update_inline_data(handle, dir,
@@ -1378,8 +1378,8 @@ int ext4_inlinedir_to_tree(struct file *dir_file,
 			fake.name_len = 1;
 			strcpy(fake.name, ".");
 			fake.rec_len = ext4_rec_len_to_disk(
-						EXT4_DIR_REC_LEN(fake.name_len),
-						inline_size);
+					  ext4_dir_rec_len(fake.name_len, NULL),
+					  inline_size);
 			ext4_set_de_type(inode->i_sb, &fake, S_IFDIR);
 			de = &fake;
 			pos = EXT4_INLINE_DOTDOT_OFFSET;
@@ -1388,8 +1388,8 @@ int ext4_inlinedir_to_tree(struct file *dir_file,
 			fake.name_len = 2;
 			strcpy(fake.name, "..");
 			fake.rec_len = ext4_rec_len_to_disk(
-						EXT4_DIR_REC_LEN(fake.name_len),
-						inline_size);
+					  ext4_dir_rec_len(fake.name_len, NULL),
+					  inline_size);
 			ext4_set_de_type(inode->i_sb, &fake, S_IFDIR);
 			de = &fake;
 			pos = EXT4_INLINE_DOTDOT_SIZE;
@@ -1398,13 +1398,18 @@ int ext4_inlinedir_to_tree(struct file *dir_file,
 			pos += ext4_rec_len_from_disk(de->rec_len, inline_size);
 			if (ext4_check_dir_entry(inode, dir_file, de,
 					 iloc.bh, dir_buf,
-					 inline_size, pos)) {
+					 inline_size, block, pos)) {
 				ret = count;
 				goto out;
 			}
 		}
 
-		ext4fs_dirhash(dir, de->name, de->name_len, hinfo);
+		if (ext4_hash_in_dirent(dir)) {
+			hinfo->hash = EXT4_DIRENT_HASH(de);
+			hinfo->minor_hash = EXT4_DIRENT_MINOR_HASH(de);
+		} else {
+			ext4fs_dirhash(dir, de->name, de->name_len, hinfo);
+		}
 		if ((hinfo->hash < start_hash) ||
 		    ((hinfo->hash == start_hash) &&
 		     (hinfo->minor_hash < start_minor_hash)))
@@ -1486,8 +1491,8 @@ int ext4_read_inline_dir(struct file *file,
 	 * So we will use extra_offset and extra_size to indicate them
 	 * during the inline dir iteration.
 	 */
-	dotdot_offset = EXT4_DIR_REC_LEN(1);
-	dotdot_size = dotdot_offset + EXT4_DIR_REC_LEN(2);
+	dotdot_offset = ext4_dir_rec_len(1, NULL);
+	dotdot_size = dotdot_offset + ext4_dir_rec_len(2, NULL);
 	extra_offset = dotdot_size - EXT4_INLINE_DOTDOT_SIZE;
 	extra_size = extra_offset + inline_size;
 
@@ -1522,7 +1527,7 @@ int ext4_read_inline_dir(struct file *file,
 			 * failure will be detected in the
 			 * dirent test below. */
 			if (ext4_rec_len_from_disk(de->rec_len, extra_size)
-				< EXT4_DIR_REC_LEN(1))
+				< ext4_dir_rec_len(1, NULL))
 				break;
 			i += ext4_rec_len_from_disk(de->rec_len,
 						    extra_size);
@@ -1550,7 +1555,7 @@ int ext4_read_inline_dir(struct file *file,
 		de = (struct ext4_dir_entry_2 *)
 			(dir_buf + ctx->pos - extra_offset);
 		if (ext4_check_dir_entry(inode, file, de, iloc.bh, dir_buf,
-					 extra_size, ctx->pos))
+					 extra_size, 0, ctx->pos))
 			goto out;
 		if (le32_to_cpu(de->inode)) {
 			if (!dir_emit(ctx, de->name, de->name_len,
@@ -1642,7 +1647,7 @@ struct buffer_head *ext4_find_inline_entry(struct inode *dir,
 						EXT4_INLINE_DOTDOT_SIZE;
 	inline_size = EXT4_MIN_INLINE_DATA_SIZE - EXT4_INLINE_DOTDOT_SIZE;
 	ret = ext4_search_dir(iloc.bh, inline_start, inline_size,
-			      dir, fname, 0, res_dir);
+			      dir, fname, 0, 0, res_dir);
 	if (ret == 1)
 		goto out_find;
 	if (ret < 0)
@@ -1655,7 +1660,7 @@ struct buffer_head *ext4_find_inline_entry(struct inode *dir,
 	inline_size = ext4_get_inline_size(dir) - EXT4_MIN_INLINE_DATA_SIZE;
 
 	ret = ext4_search_dir(iloc.bh, inline_start, inline_size,
-			      dir, fname, 0, res_dir);
+			      dir, fname, 0, 0, res_dir);
 	if (ret == 1)
 		goto out_find;
 
@@ -1704,7 +1709,7 @@ int ext4_delete_inline_entry(handle_t *handle,
 	if (err)
 		goto out;
 
-	err = ext4_generic_delete_entry(handle, dir, de_del, bh,
+	err = ext4_generic_delete_entry(handle, dir, de_del, 0, bh,
 					inline_start, inline_size, 0);
 	if (err)
 		goto out;
@@ -1788,7 +1793,7 @@ bool empty_inline_dir(struct inode *dir, int *has_inline_data)
 					   &inline_pos, &inline_size);
 		if (ext4_check_dir_entry(dir, NULL, de,
 					 iloc.bh, inline_pos,
-					 inline_size, offset)) {
+					 inline_size, 0, offset)) {
 			ext4_warning(dir->i_sb,
 				     "bad inline directory (dir #%lu) - "
 				     "inode %u, rec_len %u, name_len %d"
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 4ee5cf007de7..f536cfc626bd 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -277,9 +277,11 @@ static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de,
 		       unsigned blocksize, struct dx_hash_info *hinfo,
 		       struct dx_map_entry map[]);
 static void dx_sort_map(struct dx_map_entry *map, unsigned count);
-static struct ext4_dir_entry_2 *dx_move_dirents(char *from, char *to,
-		struct dx_map_entry *offsets, int count, unsigned blocksize);
-static struct ext4_dir_entry_2* dx_pack_dirents(char *base, unsigned blocksize);
+static struct ext4_dir_entry_2 *dx_move_dirents(struct inode *dir, char *from,
+					char *to, struct dx_map_entry *offsets,
+					int count, unsigned int blocksize);
+static struct ext4_dir_entry_2 *dx_pack_dirents(struct inode *dir, char *base,
+						unsigned int blocksize);
 static void dx_insert_block(struct dx_frame *frame,
 					u32 hash, ext4_lblk_t block);
 static int ext4_htree_next_block(struct inode *dir, __u32 hash,
@@ -288,7 +290,7 @@ static int ext4_htree_next_block(struct inode *dir, __u32 hash,
 				 __u32 *start_hash);
 static struct buffer_head * ext4_dx_find_entry(struct inode *dir,
 		struct ext4_filename *fname,
-		struct ext4_dir_entry_2 **res_dir);
+		struct ext4_dir_entry_2 **res_dir, ext4_lblk_t *lblk);
 static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 			     struct inode *dir, struct inode *inode);
 
@@ -571,8 +573,9 @@ static inline void dx_set_limit(struct dx_entry *entries, unsigned value)
 
 static inline unsigned dx_root_limit(struct inode *dir, unsigned infosize)
 {
-	unsigned entry_space = dir->i_sb->s_blocksize - EXT4_DIR_REC_LEN(1) -
-		EXT4_DIR_REC_LEN(2) - infosize;
+	unsigned int entry_space = dir->i_sb->s_blocksize -
+			ext4_dir_rec_len(1, NULL) -
+			ext4_dir_rec_len(2, NULL) - infosize;
 
 	if (ext4_has_metadata_csum(dir->i_sb))
 		entry_space -= sizeof(struct dx_tail);
@@ -581,7 +584,8 @@ static inline unsigned dx_root_limit(struct inode *dir, unsigned infosize)
 
 static inline unsigned dx_node_limit(struct inode *dir)
 {
-	unsigned entry_space = dir->i_sb->s_blocksize - EXT4_DIR_REC_LEN(0);
+	unsigned int entry_space = dir->i_sb->s_blocksize -
+			ext4_dir_rec_len(0, dir);
 
 	if (ext4_has_metadata_csum(dir->i_sb))
 		entry_space -= sizeof(struct dx_tail);
@@ -677,7 +681,10 @@ static struct stats dx_show_leaf(struct inode *dir,
 						name = fname_crypto_str.name;
 						len = fname_crypto_str.len;
 					}
-					ext4fs_dirhash(dir, de->name,
+					if (IS_CASEFOLDED(dir))
+						h.hash = EXT4_DIRENT_HASH(de);
+					else
+						ext4fs_dirhash(dir, de->name,
 						       de->name_len, &h);
 					printk("%*.s:(E)%x.%u ", len, name,
 					       h.hash, (unsigned) ((char *) de
@@ -693,7 +700,7 @@ static struct stats dx_show_leaf(struct inode *dir,
 				       (unsigned) ((char *) de - base));
 #endif
 			}
-			space += EXT4_DIR_REC_LEN(de->name_len);
+			space += ext4_dir_rec_len(de->name_len, dir);
 			names++;
 		}
 		de = ext4_next_entry(de, size);
@@ -765,7 +772,8 @@ dx_probe(struct ext4_filename *fname, struct inode *dir,
 	root = (struct dx_root *) frame->bh->b_data;
 	if (root->info.hash_version != DX_HASH_TEA &&
 	    root->info.hash_version != DX_HASH_HALF_MD4 &&
-	    root->info.hash_version != DX_HASH_LEGACY) {
+	    root->info.hash_version != DX_HASH_LEGACY &&
+	    root->info.hash_version != DX_HASH_SIPHASH) {
 		ext4_warning_inode(dir, "Unrecognised inode hash code %u",
 				   root->info.hash_version);
 		goto fail;
@@ -1001,7 +1009,7 @@ static int htree_dirblock_to_tree(struct file *dir_file,
 	de = (struct ext4_dir_entry_2 *) bh->b_data;
 	top = (struct ext4_dir_entry_2 *) ((char *) de +
 					   dir->i_sb->s_blocksize -
-					   EXT4_DIR_REC_LEN(0));
+					   ext4_dir_rec_len(0, dir));
 #ifdef CONFIG_FS_ENCRYPTION
 	/* Check if the directory is encrypted */
 	if (IS_ENCRYPTED(dir)) {
@@ -1020,13 +1028,18 @@ static int htree_dirblock_to_tree(struct file *dir_file,
 #endif
 	for (; de < top; de = ext4_next_entry(de, dir->i_sb->s_blocksize)) {
 		if (ext4_check_dir_entry(dir, NULL, de, bh,
-				bh->b_data, bh->b_size,
+				bh->b_data, bh->b_size, block,
 				(block<<EXT4_BLOCK_SIZE_BITS(dir->i_sb))
 					 + ((char *)de - bh->b_data))) {
 			/* silently ignore the rest of the block */
 			break;
 		}
-		ext4fs_dirhash(dir, de->name, de->name_len, hinfo);
+		if (ext4_hash_in_dirent(dir)) {
+			hinfo->hash = EXT4_DIRENT_HASH(de);
+			hinfo->minor_hash = EXT4_DIRENT_MINOR_HASH(de);
+		} else {
+			ext4fs_dirhash(dir, de->name, de->name_len, hinfo);
+		}
 		if ((hinfo->hash < start_hash) ||
 		    ((hinfo->hash == start_hash) &&
 		     (hinfo->minor_hash < start_minor_hash)))
@@ -1097,7 +1110,11 @@ int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash,
 		       start_hash, start_minor_hash));
 	dir = file_inode(dir_file);
 	if (!(ext4_test_inode_flag(dir, EXT4_INODE_INDEX))) {
-		hinfo.hash_version = EXT4_SB(dir->i_sb)->s_def_hash_version;
+		if (ext4_hash_in_dirent(dir))
+			hinfo.hash_version = DX_HASH_SIPHASH;
+		else
+			hinfo.hash_version =
+					EXT4_SB(dir->i_sb)->s_def_hash_version;
 		if (hinfo.hash_version <= DX_HASH_TEA)
 			hinfo.hash_version +=
 				EXT4_SB(dir->i_sb)->s_hash_unsigned;
@@ -1190,11 +1207,12 @@ int ext4_htree_fill_tree(struct file *dir_file, __u32 start_hash,
 static inline int search_dirblock(struct buffer_head *bh,
 				  struct inode *dir,
 				  struct ext4_filename *fname,
+				  ext4_lblk_t lblk,
 				  unsigned int offset,
 				  struct ext4_dir_entry_2 **res_dir)
 {
 	return ext4_search_dir(bh, bh->b_data, dir->i_sb->s_blocksize, dir,
-			       fname, offset, res_dir);
+			       fname, lblk, offset, res_dir);
 }
 
 /*
@@ -1215,7 +1233,10 @@ static int dx_make_map(struct inode *dir, struct ext4_dir_entry_2 *de,
 
 	while ((char *) de < base + blocksize) {
 		if (de->name_len && de->inode) {
-			ext4fs_dirhash(dir, de->name, de->name_len, &h);
+			if (ext4_hash_in_dirent(dir))
+				h.hash = EXT4_DIRENT_HASH(de);
+			else
+				ext4fs_dirhash(dir, de->name, de->name_len, &h);
 			map_tail--;
 			map_tail->hash = h.hash;
 			map_tail->offs = ((char *) de - base)>>2;
@@ -1279,31 +1300,55 @@ static void dx_insert_block(struct dx_frame *frame, u32 hash, ext4_lblk_t block)
  * Returns: 0 if the directory entry matches, more than 0 if it
  * doesn't match or less than zero on error.
  */
-int ext4_ci_compare(const struct inode *parent, const struct qstr *name,
-		    const struct qstr *entry, bool quick)
+int ext4_ci_compare(struct inode *parent, const struct qstr *name,
+		unsigned char *name2, size_t len, bool quick)
 {
 	const struct super_block *sb = parent->i_sb;
 	const struct unicode_map *um = sb->s_encoding;
+	const struct fscrypt_str crypt_entry = FSTR_INIT(name2, len);
+	struct fscrypt_str decrypted_entry;
+	struct qstr entry = QSTR_INIT(name2, len);
 	int ret;
 
+	decrypted_entry.name = NULL;
+	decrypted_entry.len = 0;
+	if (IS_ENCRYPTED(parent) && fscrypt_has_encryption_key(parent)) {
+		decrypted_entry.name = kmalloc(len, GFP_ATOMIC);
+		decrypted_entry.len = len;
+		if (!decrypted_entry.name)
+			return -ENOMEM;
+		ret = fscrypt_fname_disk_to_usr(parent, 0, 0, &crypt_entry,
+							&decrypted_entry);
+		if (ret < 0)
+			goto err;
+	}
+
+	{
+	struct qstr decrypted = FSTR_TO_QSTR(&decrypted_entry);
 	if (quick)
-		ret = utf8_strncasecmp_folded(um, name, entry);
+		ret = utf8_strncasecmp_folded(um, name,
+				decrypted_entry.name ? &decrypted : &entry);
 	else
-		ret = utf8_strncasecmp(um, name, entry);
+		ret = utf8_strncasecmp(um, name,
+				decrypted_entry.name ? &decrypted : &entry);
+	}
 
 	if (ret < 0) {
 		/* Handle invalid character sequence as either an error
 		 * or as an opaque byte sequence.
 		 */
-		if (sb_has_enc_strict_mode(sb))
-			return -EINVAL;
-
-		if (name->len != entry->len)
-			return 1;
+		if (sb_has_enc_strict_mode(sb)) {
+			ret = -EINVAL;
+			goto err;
+		}
 
-		return !!memcmp(name->name, entry->name, name->len);
+		if (name->len != entry.len)
+			ret = 1;
+		else
+			ret = !!memcmp(name->name, entry.name, name->len);
 	}
-
+err:
+	kfree(decrypted_entry.name);
 	return ret;
 }
 
@@ -1339,14 +1384,11 @@ void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
  *
  * Return: %true if the directory entry matches, otherwise %false.
  */
-static inline bool ext4_match(const struct inode *parent,
+static bool ext4_match(struct inode *parent,
 			      const struct ext4_filename *fname,
-			      const struct ext4_dir_entry_2 *de)
+			      struct ext4_dir_entry_2 *de)
 {
 	struct fscrypt_name f;
-#ifdef CONFIG_UNICODE
-	const struct qstr entry = {.name = de->name, .len = de->name_len};
-#endif
 
 	if (!de->inode)
 		return false;
@@ -1358,14 +1400,27 @@ static inline bool ext4_match(const struct inode *parent,
 #endif
 
 #ifdef CONFIG_UNICODE
-	if (parent->i_sb->s_encoding && IS_CASEFOLDED(parent)) {
+	if (parent->i_sb->s_encoding && needs_casefold(parent)) {
 		if (fname->cf_name.name) {
 			struct qstr cf = {.name = fname->cf_name.name,
 					  .len = fname->cf_name.len};
-			return !ext4_ci_compare(parent, &cf, &entry, true);
+			if (IS_ENCRYPTED(parent)) {
+				struct dx_hash_info hinfo;
+
+				hinfo.hash_version = DX_HASH_SIPHASH;
+				hinfo.seed = NULL;
+				ext4fs_dirhash(parent, fname->cf_name.name,
+						fname_len(fname), &hinfo);
+				if (hinfo.hash != EXT4_DIRENT_HASH(de) ||
+						hinfo.minor_hash !=
+						    EXT4_DIRENT_MINOR_HASH(de))
+					return 0;
+			}
+			return !ext4_ci_compare(parent, &cf, de->name,
+							de->name_len, true);
 		}
-		return !ext4_ci_compare(parent, fname->usr_fname, &entry,
-					false);
+		return !ext4_ci_compare(parent, fname->usr_fname, de->name,
+						de->name_len, false);
 	}
 #endif
 
@@ -1377,7 +1432,8 @@ static inline bool ext4_match(const struct inode *parent,
  */
 int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size,
 		    struct inode *dir, struct ext4_filename *fname,
-		    unsigned int offset, struct ext4_dir_entry_2 **res_dir)
+		    ext4_lblk_t lblk, unsigned int offset,
+		    struct ext4_dir_entry_2 **res_dir)
 {
 	struct ext4_dir_entry_2 * de;
 	char * dlimit;
@@ -1393,7 +1449,7 @@ int ext4_search_dir(struct buffer_head *bh, char *search_buf, int buf_size,
 			/* found a match - just to be sure, do
 			 * a full check */
 			if (ext4_check_dir_entry(dir, NULL, de, bh, bh->b_data,
-						 bh->b_size, offset))
+						 bh->b_size, lblk, offset))
 				return -1;
 			*res_dir = de;
 			return 1;
@@ -1439,7 +1495,7 @@ static int is_dx_internal_node(struct inode *dir, ext4_lblk_t block,
 static struct buffer_head *__ext4_find_entry(struct inode *dir,
 					     struct ext4_filename *fname,
 					     struct ext4_dir_entry_2 **res_dir,
-					     int *inlined)
+					     int *inlined, ext4_lblk_t *lblk)
 {
 	struct super_block *sb;
 	struct buffer_head *bh_use[NAMEI_RA_SIZE];
@@ -1463,6 +1519,8 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
 		int has_inline_data = 1;
 		ret = ext4_find_inline_entry(dir, fname, res_dir,
 					     &has_inline_data);
+		if (lblk)
+			*lblk = 0;
 		if (has_inline_data) {
 			if (inlined)
 				*inlined = 1;
@@ -1481,7 +1539,7 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
 		goto restart;
 	}
 	if (is_dx(dir)) {
-		ret = ext4_dx_find_entry(dir, fname, res_dir);
+		ret = ext4_dx_find_entry(dir, fname, res_dir, lblk);
 		/*
 		 * On success, or if the error was file not found,
 		 * return.  Otherwise, fall back to doing a search the
@@ -1544,9 +1602,11 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
 			goto cleanup_and_exit;
 		}
 		set_buffer_verified(bh);
-		i = search_dirblock(bh, dir, fname,
+		i = search_dirblock(bh, dir, fname, block,
 			    block << EXT4_BLOCK_SIZE_BITS(sb), res_dir);
 		if (i == 1) {
+			if (lblk)
+				*lblk = block;
 			EXT4_I(dir)->i_dir_start_lookup = block;
 			ret = bh;
 			goto cleanup_and_exit;
@@ -1581,7 +1641,7 @@ static struct buffer_head *__ext4_find_entry(struct inode *dir,
 static struct buffer_head *ext4_find_entry(struct inode *dir,
 					   const struct qstr *d_name,
 					   struct ext4_dir_entry_2 **res_dir,
-					   int *inlined)
+					   int *inlined, ext4_lblk_t *lblk)
 {
 	int err;
 	struct ext4_filename fname;
@@ -1593,7 +1653,7 @@ static struct buffer_head *ext4_find_entry(struct inode *dir,
 	if (err)
 		return ERR_PTR(err);
 
-	bh = __ext4_find_entry(dir, &fname, res_dir, inlined);
+	bh = __ext4_find_entry(dir, &fname, res_dir, inlined, lblk);
 
 	ext4_fname_free_filename(&fname);
 	return bh;
@@ -1613,7 +1673,7 @@ static struct buffer_head *ext4_lookup_entry(struct inode *dir,
 	if (err)
 		return ERR_PTR(err);
 
-	bh = __ext4_find_entry(dir, &fname, res_dir, NULL);
+	bh = __ext4_find_entry(dir, &fname, res_dir, NULL, NULL);
 
 	ext4_fname_free_filename(&fname);
 	return bh;
@@ -1621,7 +1681,7 @@ static struct buffer_head *ext4_lookup_entry(struct inode *dir,
 
 static struct buffer_head * ext4_dx_find_entry(struct inode *dir,
 			struct ext4_filename *fname,
-			struct ext4_dir_entry_2 **res_dir)
+			struct ext4_dir_entry_2 **res_dir, ext4_lblk_t *lblk)
 {
 	struct super_block * sb = dir->i_sb;
 	struct dx_frame frames[EXT4_HTREE_LEVEL], *frame;
@@ -1637,11 +1697,13 @@ static struct buffer_head * ext4_dx_find_entry(struct inode *dir,
 		return (struct buffer_head *) frame;
 	do {
 		block = dx_get_block(frame->at);
+		if (lblk)
+			*lblk = block;
 		bh = ext4_read_dirblock(dir, block, DIRENT_HTREE);
 		if (IS_ERR(bh))
 			goto errout;
 
-		retval = search_dirblock(bh, dir, fname,
+		retval = search_dirblock(bh, dir, fname, block,
 					 block << EXT4_BLOCK_SIZE_BITS(sb),
 					 res_dir);
 		if (retval == 1)
@@ -1736,7 +1798,7 @@ struct dentry *ext4_get_parent(struct dentry *child)
 	struct ext4_dir_entry_2 * de;
 	struct buffer_head *bh;
 
-	bh = ext4_find_entry(d_inode(child), &dotdot, &de, NULL);
+	bh = ext4_find_entry(d_inode(child), &dotdot, &de, NULL, NULL);
 	if (IS_ERR(bh))
 		return ERR_CAST(bh);
 	if (!bh)
@@ -1758,7 +1820,8 @@ struct dentry *ext4_get_parent(struct dentry *child)
  * Returns pointer to last entry moved.
  */
 static struct ext4_dir_entry_2 *
-dx_move_dirents(char *from, char *to, struct dx_map_entry *map, int count,
+dx_move_dirents(struct inode *dir, char *from, char *to,
+		struct dx_map_entry *map, int count,
 		unsigned blocksize)
 {
 	unsigned rec_len = 0;
@@ -1766,7 +1829,8 @@ dx_move_dirents(char *from, char *to, struct dx_map_entry *map, int count,
 	while (count--) {
 		struct ext4_dir_entry_2 *de = (struct ext4_dir_entry_2 *)
 						(from + (map->offs<<2));
-		rec_len = EXT4_DIR_REC_LEN(de->name_len);
+		rec_len = ext4_dir_rec_len(de->name_len, dir);
+
 		memcpy (to, de, rec_len);
 		((struct ext4_dir_entry_2 *) to)->rec_len =
 				ext4_rec_len_to_disk(rec_len, blocksize);
@@ -1781,7 +1845,8 @@ dx_move_dirents(char *from, char *to, struct dx_map_entry *map, int count,
  * Compact each dir entry in the range to the minimal rec_len.
  * Returns pointer to last entry in range.
  */
-static struct ext4_dir_entry_2* dx_pack_dirents(char *base, unsigned blocksize)
+static struct ext4_dir_entry_2 *dx_pack_dirents(struct inode *dir, char *base,
+							unsigned int blocksize)
 {
 	struct ext4_dir_entry_2 *next, *to, *prev, *de = (struct ext4_dir_entry_2 *) base;
 	unsigned rec_len = 0;
@@ -1790,7 +1855,7 @@ static struct ext4_dir_entry_2* dx_pack_dirents(char *base, unsigned blocksize)
 	while ((char*)de < base + blocksize) {
 		next = ext4_next_entry(de, blocksize);
 		if (de->inode && de->name_len) {
-			rec_len = EXT4_DIR_REC_LEN(de->name_len);
+			rec_len = ext4_dir_rec_len(de->name_len, dir);
 			if (de > to)
 				memmove(to, de, rec_len);
 			to->rec_len = ext4_rec_len_to_disk(rec_len, blocksize);
@@ -1808,13 +1873,12 @@ static struct ext4_dir_entry_2* dx_pack_dirents(char *base, unsigned blocksize)
  * Returns pointer to de in block into which the new entry will be inserted.
  */
 static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
-			struct buffer_head **bh,struct dx_frame *frame,
-			struct dx_hash_info *hinfo)
+			struct buffer_head **bh, struct dx_frame *frame,
+			struct dx_hash_info *hinfo, ext4_lblk_t *newblock)
 {
 	unsigned blocksize = dir->i_sb->s_blocksize;
 	unsigned count, continued;
 	struct buffer_head *bh2;
-	ext4_lblk_t newblock;
 	u32 hash2;
 	struct dx_map_entry *map;
 	char *data1 = (*bh)->b_data, *data2;
@@ -1826,7 +1890,7 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
 	if (ext4_has_metadata_csum(dir->i_sb))
 		csum_size = sizeof(struct ext4_dir_entry_tail);
 
-	bh2 = ext4_append(handle, dir, &newblock);
+	bh2 = ext4_append(handle, dir, newblock);
 	if (IS_ERR(bh2)) {
 		brelse(*bh);
 		*bh = NULL;
@@ -1870,9 +1934,9 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
 					hash2, split, count-split));
 
 	/* Fancy dance to stay within two buffers */
-	de2 = dx_move_dirents(data1, data2, map + split, count - split,
+	de2 = dx_move_dirents(dir, data1, data2, map + split, count - split,
 			      blocksize);
-	de = dx_pack_dirents(data1, blocksize);
+	de = dx_pack_dirents(dir, data1, blocksize);
 	de->rec_len = ext4_rec_len_to_disk(data1 + (blocksize - csum_size) -
 					   (char *) de,
 					   blocksize);
@@ -1894,7 +1958,7 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
 		swap(*bh, bh2);
 		de = de2;
 	}
-	dx_insert_block(frame, hash2 + continued, newblock);
+	dx_insert_block(frame, hash2 + continued, *newblock);
 	err = ext4_handle_dirty_dirblock(handle, dir, bh2);
 	if (err)
 		goto journal_error;
@@ -1914,13 +1978,14 @@ static struct ext4_dir_entry_2 *do_split(handle_t *handle, struct inode *dir,
 }
 
 int ext4_find_dest_de(struct inode *dir, struct inode *inode,
+		      ext4_lblk_t lblk,
 		      struct buffer_head *bh,
 		      void *buf, int buf_size,
 		      struct ext4_filename *fname,
 		      struct ext4_dir_entry_2 **dest_de)
 {
 	struct ext4_dir_entry_2 *de;
-	unsigned short reclen = EXT4_DIR_REC_LEN(fname_len(fname));
+	unsigned short reclen = ext4_dir_rec_len(fname_len(fname), dir);
 	int nlen, rlen;
 	unsigned int offset = 0;
 	char *top;
@@ -1929,11 +1994,11 @@ int ext4_find_dest_de(struct inode *dir, struct inode *inode,
 	top = buf + buf_size - reclen;
 	while ((char *) de <= top) {
 		if (ext4_check_dir_entry(dir, NULL, de, bh,
-					 buf, buf_size, offset))
+					 buf, buf_size, lblk, offset))
 			return -EFSCORRUPTED;
 		if (ext4_match(dir, fname, de))
 			return -EEXIST;
-		nlen = EXT4_DIR_REC_LEN(de->name_len);
+		nlen = ext4_dir_rec_len(de->name_len, dir);
 		rlen = ext4_rec_len_from_disk(de->rec_len, buf_size);
 		if ((de->inode ? rlen - nlen : rlen) >= reclen)
 			break;
@@ -1947,7 +2012,8 @@ int ext4_find_dest_de(struct inode *dir, struct inode *inode,
 	return 0;
 }
 
-void ext4_insert_dentry(struct inode *inode,
+void ext4_insert_dentry(struct inode *dir,
+			struct inode *inode,
 			struct ext4_dir_entry_2 *de,
 			int buf_size,
 			struct ext4_filename *fname)
@@ -1955,7 +2021,7 @@ void ext4_insert_dentry(struct inode *inode,
 
 	int nlen, rlen;
 
-	nlen = EXT4_DIR_REC_LEN(de->name_len);
+	nlen = ext4_dir_rec_len(de->name_len, dir);
 	rlen = ext4_rec_len_from_disk(de->rec_len, buf_size);
 	if (de->inode) {
 		struct ext4_dir_entry_2 *de1 =
@@ -1969,6 +2035,17 @@ void ext4_insert_dentry(struct inode *inode,
 	ext4_set_de_type(inode->i_sb, de, inode->i_mode);
 	de->name_len = fname_len(fname);
 	memcpy(de->name, fname_name(fname), fname_len(fname));
+	if (ext4_hash_in_dirent(dir)) {
+		struct dx_hash_info hinfo;
+
+		hinfo.hash_version = DX_HASH_SIPHASH;
+		hinfo.seed = NULL;
+		ext4fs_dirhash(dir, fname_usr_name(fname),
+				fname_len(fname), &hinfo);
+		EXT4_EXTENDED_DIRENT(de)->hash = cpu_to_le32(hinfo.hash);
+		EXT4_EXTENDED_DIRENT(de)->minor_hash =
+				cpu_to_le32(hinfo.minor_hash);
+	}
 }
 
 /*
@@ -1982,6 +2059,7 @@ void ext4_insert_dentry(struct inode *inode,
 static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname,
 			     struct inode *dir,
 			     struct inode *inode, struct ext4_dir_entry_2 *de,
+			     ext4_lblk_t blk,
 			     struct buffer_head *bh)
 {
 	unsigned int	blocksize = dir->i_sb->s_blocksize;
@@ -1992,7 +2070,7 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname,
 		csum_size = sizeof(struct ext4_dir_entry_tail);
 
 	if (!de) {
-		err = ext4_find_dest_de(dir, inode, bh, bh->b_data,
+		err = ext4_find_dest_de(dir, inode, blk, bh, bh->b_data,
 					blocksize - csum_size, fname, &de);
 		if (err)
 			return err;
@@ -2005,7 +2083,7 @@ static int add_dirent_to_buf(handle_t *handle, struct ext4_filename *fname,
 	}
 
 	/* By now the buffer is marked for journaling */
-	ext4_insert_dentry(inode, de, blocksize, fname);
+	ext4_insert_dentry(dir, inode, de, blocksize, fname);
 
 	/*
 	 * XXX shouldn't update any times until successful
@@ -2097,11 +2175,16 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname,
 
 	/* Initialize the root; the dot dirents already exist */
 	de = (struct ext4_dir_entry_2 *) (&root->dotdot);
-	de->rec_len = ext4_rec_len_to_disk(blocksize - EXT4_DIR_REC_LEN(2),
-					   blocksize);
+	de->rec_len = ext4_rec_len_to_disk(
+			blocksize - ext4_dir_rec_len(2, NULL), blocksize);
 	memset (&root->info, 0, sizeof(root->info));
 	root->info.info_length = sizeof(root->info);
-	root->info.hash_version = EXT4_SB(dir->i_sb)->s_def_hash_version;
+	if (ext4_hash_in_dirent(dir))
+		root->info.hash_version = DX_HASH_SIPHASH;
+	else
+		root->info.hash_version =
+				EXT4_SB(dir->i_sb)->s_def_hash_version;
+
 	entries = root->entries;
 	dx_set_block(entries, 1);
 	dx_set_count(entries, 1);
@@ -2112,7 +2195,12 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname,
 	if (fname->hinfo.hash_version <= DX_HASH_TEA)
 		fname->hinfo.hash_version += EXT4_SB(dir->i_sb)->s_hash_unsigned;
 	fname->hinfo.seed = EXT4_SB(dir->i_sb)->s_hash_seed;
-	ext4fs_dirhash(dir, fname_name(fname), fname_len(fname), &fname->hinfo);
+	if (ext4_hash_in_dirent(dir))
+		ext4fs_dirhash(dir, fname_usr_name(fname),
+				fname_len(fname), &fname->hinfo);
+	else
+		ext4fs_dirhash(dir, fname_name(fname),
+				fname_len(fname), &fname->hinfo);
 
 	memset(frames, 0, sizeof(frames));
 	frame = frames;
@@ -2127,13 +2215,13 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname,
 	if (retval)
 		goto out_frames;	
 
-	de = do_split(handle,dir, &bh2, frame, &fname->hinfo);
+	de = do_split(handle, dir, &bh2, frame, &fname->hinfo, &block);
 	if (IS_ERR(de)) {
 		retval = PTR_ERR(de);
 		goto out_frames;
 	}
 
-	retval = add_dirent_to_buf(handle, fname, dir, inode, de, bh2);
+	retval = add_dirent_to_buf(handle, fname, dir, inode, de, block, bh2);
 out_frames:
 	/*
 	 * Even if the block split failed, we have to properly write
@@ -2223,7 +2311,7 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
 			goto out;
 		}
 		retval = add_dirent_to_buf(handle, &fname, dir, inode,
-					   NULL, bh);
+					   NULL, block, bh);
 		if (retval != -ENOSPC)
 			goto out;
 
@@ -2250,7 +2338,7 @@ static int ext4_add_entry(handle_t *handle, struct dentry *dentry,
 	if (csum_size)
 		ext4_initialize_dirent_tail(bh, blocksize);
 
-	retval = add_dirent_to_buf(handle, &fname, dir, inode, de, bh);
+	retval = add_dirent_to_buf(handle, &fname, dir, inode, de, block, bh);
 out:
 	ext4_fname_free_filename(&fname);
 	brelse(bh);
@@ -2272,6 +2360,7 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 	struct ext4_dir_entry_2 *de;
 	int restart;
 	int err;
+	ext4_lblk_t lblk;
 
 again:
 	restart = 0;
@@ -2280,7 +2369,8 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 		return PTR_ERR(frame);
 	entries = frame->entries;
 	at = frame->at;
-	bh = ext4_read_dirblock(dir, dx_get_block(frame->at), DIRENT_HTREE);
+	lblk = dx_get_block(frame->at);
+	bh = ext4_read_dirblock(dir, lblk, DIRENT_HTREE);
 	if (IS_ERR(bh)) {
 		err = PTR_ERR(bh);
 		bh = NULL;
@@ -2292,7 +2382,7 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 	if (err)
 		goto journal_error;
 
-	err = add_dirent_to_buf(handle, fname, dir, inode, NULL, bh);
+	err = add_dirent_to_buf(handle, fname, dir, inode, NULL, lblk, bh);
 	if (err != -ENOSPC)
 		goto cleanup;
 
@@ -2412,12 +2502,12 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 			goto journal_error;
 		}
 	}
-	de = do_split(handle, dir, &bh, frame, &fname->hinfo);
+	de = do_split(handle, dir, &bh, frame, &fname->hinfo, &lblk);
 	if (IS_ERR(de)) {
 		err = PTR_ERR(de);
 		goto cleanup;
 	}
-	err = add_dirent_to_buf(handle, fname, dir, inode, de, bh);
+	err = add_dirent_to_buf(handle, fname, dir, inode, de, lblk, bh);
 	goto cleanup;
 
 journal_error:
@@ -2440,6 +2530,7 @@ static int ext4_dx_add_entry(handle_t *handle, struct ext4_filename *fname,
 int ext4_generic_delete_entry(handle_t *handle,
 			      struct inode *dir,
 			      struct ext4_dir_entry_2 *de_del,
+			      ext4_lblk_t lblk,
 			      struct buffer_head *bh,
 			      void *entry_buf,
 			      int buf_size,
@@ -2454,7 +2545,7 @@ int ext4_generic_delete_entry(handle_t *handle,
 	de = (struct ext4_dir_entry_2 *)entry_buf;
 	while (i < buf_size - csum_size) {
 		if (ext4_check_dir_entry(dir, NULL, de, bh,
-					 bh->b_data, bh->b_size, i))
+					 bh->b_data, bh->b_size, lblk, i))
 			return -EFSCORRUPTED;
 		if (de == de_del)  {
 			if (pde)
@@ -2479,6 +2570,7 @@ int ext4_generic_delete_entry(handle_t *handle,
 static int ext4_delete_entry(handle_t *handle,
 			     struct inode *dir,
 			     struct ext4_dir_entry_2 *de_del,
+			     ext4_lblk_t lblk,
 			     struct buffer_head *bh)
 {
 	int err, csum_size = 0;
@@ -2499,7 +2591,7 @@ static int ext4_delete_entry(handle_t *handle,
 	if (unlikely(err))
 		goto out;
 
-	err = ext4_generic_delete_entry(handle, dir, de_del,
+	err = ext4_generic_delete_entry(handle, dir, de_del, lblk,
 					bh, bh->b_data,
 					dir->i_sb->s_blocksize, csum_size);
 	if (err)
@@ -2693,7 +2785,7 @@ struct ext4_dir_entry_2 *ext4_init_dot_dotdot(struct inode *inode,
 {
 	de->inode = cpu_to_le32(inode->i_ino);
 	de->name_len = 1;
-	de->rec_len = ext4_rec_len_to_disk(EXT4_DIR_REC_LEN(de->name_len),
+	de->rec_len = ext4_rec_len_to_disk(ext4_dir_rec_len(de->name_len, NULL),
 					   blocksize);
 	strcpy(de->name, ".");
 	ext4_set_de_type(inode->i_sb, de, S_IFDIR);
@@ -2703,11 +2795,12 @@ struct ext4_dir_entry_2 *ext4_init_dot_dotdot(struct inode *inode,
 	de->name_len = 2;
 	if (!dotdot_real_len)
 		de->rec_len = ext4_rec_len_to_disk(blocksize -
-					(csum_size + EXT4_DIR_REC_LEN(1)),
+					(csum_size + ext4_dir_rec_len(1, NULL)),
 					blocksize);
 	else
 		de->rec_len = ext4_rec_len_to_disk(
-				EXT4_DIR_REC_LEN(de->name_len), blocksize);
+					ext4_dir_rec_len(de->name_len, NULL),
+					blocksize);
 	strcpy(de->name, "..");
 	ext4_set_de_type(inode->i_sb, de, S_IFDIR);
 
@@ -2835,7 +2928,8 @@ bool ext4_empty_dir(struct inode *inode)
 	}
 
 	sb = inode->i_sb;
-	if (inode->i_size < EXT4_DIR_REC_LEN(1) + EXT4_DIR_REC_LEN(2)) {
+	if (inode->i_size < ext4_dir_rec_len(1, NULL) +
+					ext4_dir_rec_len(2, NULL)) {
 		EXT4_ERROR_INODE(inode, "invalid size");
 		return true;
 	}
@@ -2873,7 +2967,7 @@ bool ext4_empty_dir(struct inode *inode)
 			de = (struct ext4_dir_entry_2 *) bh->b_data;
 		}
 		if (ext4_check_dir_entry(inode, NULL, de, bh,
-					 bh->b_data, bh->b_size, offset)) {
+					 bh->b_data, bh->b_size, 0, offset)) {
 			de = (struct ext4_dir_entry_2 *)(bh->b_data +
 							 sb->s_blocksize);
 			offset = (offset | (sb->s_blocksize - 1)) + 1;
@@ -3071,6 +3165,8 @@ static int ext4_rmdir(struct inode *dir, struct dentry *dentry)
 	struct buffer_head *bh;
 	struct ext4_dir_entry_2 *de;
 	handle_t *handle = NULL;
+	ext4_lblk_t lblk;
+
 
 	if (unlikely(ext4_forced_shutdown(EXT4_SB(dir->i_sb))))
 		return -EIO;
@@ -3085,7 +3181,7 @@ static int ext4_rmdir(struct inode *dir, struct dentry *dentry)
 		return retval;
 
 	retval = -ENOENT;
-	bh = ext4_find_entry(dir, &dentry->d_name, &de, NULL);
+	bh = ext4_find_entry(dir, &dentry->d_name, &de, NULL, &lblk);
 	if (IS_ERR(bh))
 		return PTR_ERR(bh);
 	if (!bh)
@@ -3112,7 +3208,7 @@ static int ext4_rmdir(struct inode *dir, struct dentry *dentry)
 	if (IS_DIRSYNC(dir))
 		ext4_handle_sync(handle);
 
-	retval = ext4_delete_entry(handle, dir, de, bh);
+	retval = ext4_delete_entry(handle, dir, de, lblk, bh);
 	if (retval)
 		goto end_rmdir;
 	if (!EXT4_DIR_LINK_EMPTY(inode))
@@ -3158,6 +3254,7 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry)
 	struct buffer_head *bh;
 	struct ext4_dir_entry_2 *de;
 	handle_t *handle = NULL;
+	ext4_lblk_t lblk;
 
 	if (unlikely(ext4_forced_shutdown(EXT4_SB(dir->i_sb))))
 		return -EIO;
@@ -3173,7 +3270,7 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry)
 		return retval;
 
 	retval = -ENOENT;
-	bh = ext4_find_entry(dir, &dentry->d_name, &de, NULL);
+	bh = ext4_find_entry(dir, &dentry->d_name, &de, NULL, &lblk);
 	if (IS_ERR(bh))
 		return PTR_ERR(bh);
 	if (!bh)
@@ -3196,7 +3293,7 @@ static int ext4_unlink(struct inode *dir, struct dentry *dentry)
 	if (IS_DIRSYNC(dir))
 		ext4_handle_sync(handle);
 
-	retval = ext4_delete_entry(handle, dir, de, bh);
+	retval = ext4_delete_entry(handle, dir, de, lblk, bh);
 	if (retval)
 		goto end_unlink;
 	dir->i_ctime = dir->i_mtime = current_time(dir);
@@ -3457,6 +3554,7 @@ struct ext4_renament {
 	int dir_nlink_delta;
 
 	/* entry for "dentry" */
+	ext4_lblk_t lblk;
 	struct buffer_head *bh;
 	struct ext4_dir_entry_2 *de;
 	int inlined;
@@ -3544,12 +3642,13 @@ static int ext4_find_delete_entry(handle_t *handle, struct inode *dir,
 	int retval = -ENOENT;
 	struct buffer_head *bh;
 	struct ext4_dir_entry_2 *de;
+	ext4_lblk_t lblk;
 
-	bh = ext4_find_entry(dir, d_name, &de, NULL);
+	bh = ext4_find_entry(dir, d_name, &de, NULL, &lblk);
 	if (IS_ERR(bh))
 		return PTR_ERR(bh);
 	if (bh) {
-		retval = ext4_delete_entry(handle, dir, de, bh);
+		retval = ext4_delete_entry(handle, dir, de, lblk, bh);
 		brelse(bh);
 	}
 	return retval;
@@ -3573,7 +3672,8 @@ static void ext4_rename_delete(handle_t *handle, struct ext4_renament *ent,
 		retval = ext4_find_delete_entry(handle, ent->dir,
 						&ent->dentry->d_name);
 	} else {
-		retval = ext4_delete_entry(handle, ent->dir, ent->de, ent->bh);
+		retval = ext4_delete_entry(handle, ent->dir, ent->de,
+						ent->lblk, ent->bh);
 		if (retval == -ENOENT) {
 			retval = ext4_find_delete_entry(handle, ent->dir,
 							&ent->dentry->d_name);
@@ -3686,7 +3786,8 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
 			return retval;
 	}
 
-	old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL);
+	old.bh = ext4_find_entry(old.dir, &old.dentry->d_name, &old.de, NULL,
+				&old.lblk);
 	if (IS_ERR(old.bh))
 		return PTR_ERR(old.bh);
 	/*
@@ -3700,7 +3801,7 @@ static int ext4_rename(struct inode *old_dir, struct dentry *old_dentry,
 		goto end_rename;
 
 	new.bh = ext4_find_entry(new.dir, &new.dentry->d_name,
-				 &new.de, &new.inlined);
+				 &new.de, &new.inlined, NULL);
 	if (IS_ERR(new.bh)) {
 		retval = PTR_ERR(new.bh);
 		new.bh = NULL;
@@ -3880,7 +3981,7 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
 		return retval;
 
 	old.bh = ext4_find_entry(old.dir, &old.dentry->d_name,
-				 &old.de, &old.inlined);
+				 &old.de, &old.inlined, NULL);
 	if (IS_ERR(old.bh))
 		return PTR_ERR(old.bh);
 	/*
@@ -3894,7 +3995,7 @@ static int ext4_cross_rename(struct inode *old_dir, struct dentry *old_dentry,
 		goto end_rename;
 
 	new.bh = ext4_find_entry(new.dir, &new.dentry->d_name,
-				 &new.de, &new.inlined);
+				 &new.de, &new.inlined, NULL);
 	if (IS_ERR(new.bh)) {
 		retval = PTR_ERR(new.bh);
 		new.bh = NULL;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 074e61b15181..47b81e8faa44 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -3820,12 +3820,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		struct unicode_map *encoding;
 		__u16 encoding_flags;
 
-		if (ext4_has_feature_encrypt(sb)) {
-			ext4_msg(sb, KERN_ERR,
-				 "Can't mount with encoding and encryption");
-			goto failed_mount;
-		}
-
 		if (ext4_sb_read_encoding(es, &encoding_info,
 					  &encoding_flags)) {
 			ext4_msg(sb, KERN_ERR,
-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 8/8] ext4: Optimize match for casefolded encrypted dirs
  2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
                   ` (6 preceding siblings ...)
  2019-12-03  5:10 ` [PATCH 7/8] ext4: Hande casefolding with encryption Daniel Rosenberg
@ 2019-12-03  5:10 ` Daniel Rosenberg
  7 siblings, 0 replies; 21+ messages in thread
From: Daniel Rosenberg @ 2019-12-03  5:10 UTC (permalink / raw)
  To: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro
  Cc: Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

Matching names with casefolded encrypting directories requires
decrypting entries to confirm case since we are case preserving. We can
avoid needing to decrypt if our hash values don't match.

Signed-off-by: Daniel Rosenberg <drosen@google.com>
---
 fs/ext4/ext4.h  | 17 ++++++++-------
 fs/ext4/namei.c | 57 ++++++++++++++++++++++++++-----------------------
 2 files changed, 39 insertions(+), 35 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f06bab489d37..f104c46a6895 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2393,9 +2393,9 @@ extern unsigned ext4_free_clusters_after_init(struct super_block *sb,
 ext4_fsblk_t ext4_inode_to_goal_block(struct inode *);
 
 #ifdef CONFIG_UNICODE
-extern void ext4_fname_setup_ci_filename(struct inode *dir,
+extern int ext4_fname_setup_ci_filename(struct inode *dir,
 					 const struct qstr *iname,
-					 struct fscrypt_str *fname);
+					 struct ext4_filename *fname);
 #endif
 
 #ifdef CONFIG_FS_ENCRYPTION
@@ -2426,9 +2426,9 @@ static inline int ext4_fname_setup_filename(struct inode *dir,
 	ext4_fname_from_fscrypt_name(fname, &name);
 
 #ifdef CONFIG_UNICODE
-	ext4_fname_setup_ci_filename(dir, iname, &fname->cf_name);
+	err = ext4_fname_setup_ci_filename(dir, iname, fname);
 #endif
-	return 0;
+	return err;
 }
 
 static inline int ext4_fname_prepare_lookup(struct inode *dir,
@@ -2445,9 +2445,9 @@ static inline int ext4_fname_prepare_lookup(struct inode *dir,
 	ext4_fname_from_fscrypt_name(fname, &name);
 
 #ifdef CONFIG_UNICODE
-	ext4_fname_setup_ci_filename(dir, &dentry->d_name, &fname->cf_name);
+	err = ext4_fname_setup_ci_filename(dir, &dentry->d_name, fname);
 #endif
-	return 0;
+	return err;
 }
 
 static inline void ext4_fname_free_filename(struct ext4_filename *fname)
@@ -2472,15 +2472,16 @@ static inline int ext4_fname_setup_filename(struct inode *dir,
 					    int lookup,
 					    struct ext4_filename *fname)
 {
+	int err = 0;
 	fname->usr_fname = iname;
 	fname->disk_name.name = (unsigned char *) iname->name;
 	fname->disk_name.len = iname->len;
 
 #ifdef CONFIG_UNICODE
-	ext4_fname_setup_ci_filename(dir, iname, &fname->cf_name);
+	err = ext4_fname_setup_ci_filename(dir, iname, fname);
 #endif
 
-	return 0;
+	return err;
 }
 
 static inline int ext4_fname_prepare_lookup(struct inode *dir,
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index f536cfc626bd..58b58fb532ba 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -784,7 +784,9 @@ dx_probe(struct ext4_filename *fname, struct inode *dir,
 	if (hinfo->hash_version <= DX_HASH_TEA)
 		hinfo->hash_version += EXT4_SB(dir->i_sb)->s_hash_unsigned;
 	hinfo->seed = EXT4_SB(dir->i_sb)->s_hash_seed;
-	if (fname && fname_name(fname))
+	/* hash is already computed for encrypted casefolded directory */
+	if (fname && fname_name(fname) &&
+				!(IS_ENCRYPTED(dir) && IS_CASEFOLDED(dir)))
 		ext4fs_dirhash(dir, fname_name(fname), fname_len(fname), hinfo);
 	hash = hinfo->hash;
 
@@ -1352,19 +1354,21 @@ int ext4_ci_compare(struct inode *parent, const struct qstr *name,
 	return ret;
 }
 
-void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
-				  struct fscrypt_str *cf_name)
+int ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
+				  struct ext4_filename *name)
 {
+	struct fscrypt_str *cf_name = &name->cf_name;
+	struct dx_hash_info *hinfo = &name->hinfo;
 	int len;
 
-	if (!IS_CASEFOLDED(dir) || !dir->i_sb->s_encoding) {
+	if (!needs_casefold(dir) || !dir->i_sb->s_encoding) {
 		cf_name->name = NULL;
-		return;
+		return 0;
 	}
 
 	cf_name->name = kmalloc(EXT4_NAME_LEN, GFP_NOFS);
 	if (!cf_name->name)
-		return;
+		return -ENOMEM;
 
 	len = utf8_casefold(dir->i_sb->s_encoding,
 			    iname, cf_name->name,
@@ -1372,10 +1376,18 @@ void ext4_fname_setup_ci_filename(struct inode *dir, const struct qstr *iname,
 	if (len <= 0) {
 		kfree(cf_name->name);
 		cf_name->name = NULL;
-		return;
 	}
 	cf_name->len = (unsigned) len;
+	if (!IS_ENCRYPTED(dir))
+		return 0;
 
+	hinfo->hash_version = DX_HASH_SIPHASH;
+	hinfo->seed = NULL;
+	if (cf_name->name)
+		ext4fs_dirhash(dir, cf_name->name, cf_name->len, hinfo);
+	else
+		ext4fs_dirhash(dir, iname->name, iname->len, hinfo);
+	return 0;
 }
 #endif
 
@@ -1405,16 +1417,12 @@ static bool ext4_match(struct inode *parent,
 			struct qstr cf = {.name = fname->cf_name.name,
 					  .len = fname->cf_name.len};
 			if (IS_ENCRYPTED(parent)) {
-				struct dx_hash_info hinfo;
-
-				hinfo.hash_version = DX_HASH_SIPHASH;
-				hinfo.seed = NULL;
-				ext4fs_dirhash(parent, fname->cf_name.name,
-						fname_len(fname), &hinfo);
-				if (hinfo.hash != EXT4_DIRENT_HASH(de) ||
-						hinfo.minor_hash !=
-						    EXT4_DIRENT_MINOR_HASH(de))
+				if (fname->hinfo.hash != EXT4_DIRENT_HASH(de) ||
+					fname->hinfo.minor_hash !=
+						EXT4_DIRENT_MINOR_HASH(de)) {
+
 					return 0;
+				}
 			}
 			return !ext4_ci_compare(parent, &cf, de->name,
 							de->name_len, true);
@@ -2036,15 +2044,11 @@ void ext4_insert_dentry(struct inode *dir,
 	de->name_len = fname_len(fname);
 	memcpy(de->name, fname_name(fname), fname_len(fname));
 	if (ext4_hash_in_dirent(dir)) {
-		struct dx_hash_info hinfo;
+		struct dx_hash_info *hinfo = &fname->hinfo;
 
-		hinfo.hash_version = DX_HASH_SIPHASH;
-		hinfo.seed = NULL;
-		ext4fs_dirhash(dir, fname_usr_name(fname),
-				fname_len(fname), &hinfo);
-		EXT4_EXTENDED_DIRENT(de)->hash = cpu_to_le32(hinfo.hash);
+		EXT4_EXTENDED_DIRENT(de)->hash = cpu_to_le32(hinfo->hash);
 		EXT4_EXTENDED_DIRENT(de)->minor_hash =
-				cpu_to_le32(hinfo.minor_hash);
+						cpu_to_le32(hinfo->minor_hash);
 	}
 }
 
@@ -2195,10 +2199,9 @@ static int make_indexed_dir(handle_t *handle, struct ext4_filename *fname,
 	if (fname->hinfo.hash_version <= DX_HASH_TEA)
 		fname->hinfo.hash_version += EXT4_SB(dir->i_sb)->s_hash_unsigned;
 	fname->hinfo.seed = EXT4_SB(dir->i_sb)->s_hash_seed;
-	if (ext4_hash_in_dirent(dir))
-		ext4fs_dirhash(dir, fname_usr_name(fname),
-				fname_len(fname), &fname->hinfo);
-	else
+
+	/* casefolded encrypted hashes are computed on fname setup */
+	if (!ext4_hash_in_dirent(dir))
 		ext4fs_dirhash(dir, fname_name(fname),
 				fname_len(fname), &fname->hinfo);
 
-- 
2.24.0.393.g34dc348eaf-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] vfs: Fold casefolding into vfs
  2019-12-03  5:10 ` [PATCH 4/8] vfs: Fold casefolding into vfs Daniel Rosenberg
@ 2019-12-03  7:41   ` Gao Xiang
  2019-12-03 19:42     ` Gabriel Krisman Bertazi
  2019-12-03 19:31   ` Gabriel Krisman Bertazi
  2020-01-03 20:26   ` Theodore Y. Ts'o
  2 siblings, 1 reply; 21+ messages in thread
From: Gao Xiang @ 2019-12-03  7:41 UTC (permalink / raw)
  To: Daniel Rosenberg
  Cc: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro,
	Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team

On Mon, Dec 02, 2019 at 09:10:45PM -0800, Daniel Rosenberg wrote:
> Ext4 and F2fs are both using casefolding, and they, along with any other
> filesystem that adds the feature, will be using identical dentry_ops.
> Additionally, those dentry ops interfere with the dentry_ops required
> for fscrypt once we add support for casefolding and encryption.
> Moving this into the vfs removes code duplication as well as the
> complication with encryption.
> 
> Currently this is pretty close to just moving the existing f2fs/ext4
> code up a level into the vfs, although there is a lot of room for
> improvement now.
> 
> Signed-off-by: Daniel Rosenberg <drosen@google.com>

I'm afraid that such vfs modification is unneeded.

Just a quick glance it seems just can be replaced by introducing some
.d_cmp, .d_hash helpers (or with little modification) and most non-Android
emulated storage files are not casefolded (even in Android).

"those dentry ops interfere with the dentry_ops required for fscrypt",
I don't think it's a real diffculty and it could be done with some
better approach instead.

Thanks,
Gao Xiang


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] vfs: Fold casefolding into vfs
  2019-12-03  5:10 ` [PATCH 4/8] vfs: Fold casefolding into vfs Daniel Rosenberg
  2019-12-03  7:41   ` Gao Xiang
@ 2019-12-03 19:31   ` Gabriel Krisman Bertazi
  2020-01-03 20:26   ` Theodore Y. Ts'o
  2 siblings, 0 replies; 21+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-12-03 19:31 UTC (permalink / raw)
  To: Daniel Rosenberg
  Cc: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro,
	Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, kernel-team

Daniel Rosenberg <drosen@google.com> writes:

> Ext4 and F2fs are both using casefolding, and they, along with any other
> filesystem that adds the feature, will be using identical dentry_ops.
> Additionally, those dentry ops interfere with the dentry_ops required
> for fscrypt once we add support for casefolding and encryption.
> Moving this into the vfs removes code duplication as well as the
> complication with encryption.
>
> Currently this is pretty close to just moving the existing f2fs/ext4
> code up a level into the vfs, although there is a lot of room for
> improvement now.
>
> Signed-off-by: Daniel Rosenberg <drosen@google.com>
> ---
>  fs/dcache.c        | 35 +++++++++++++++++++++++++++++++++++
>  fs/namei.c         | 43 ++++++++++++++++++++++++++++++++++++++++---
>  include/linux/fs.h | 12 ++++++++++++
>  3 files changed, 87 insertions(+), 3 deletions(-)
>
> diff --git a/fs/dcache.c b/fs/dcache.c
> index f7931b682a0d..575f3c2c3f77 100644
> --- a/fs/dcache.c
> +++ b/fs/dcache.c
> @@ -32,6 +32,7 @@
>  #include <linux/bit_spinlock.h>
>  #include <linux/rculist_bl.h>
>  #include <linux/list_lru.h>
> +#include <linux/unicode.h>
>  #include "internal.h"
>  #include "mount.h"
>  
> @@ -228,6 +229,13 @@ static inline int dentry_string_cmp(const unsigned char *cs, const unsigned char
>  
>  #endif
>  
> +bool needs_casefold(const struct inode *dir)
> +{
> +	return IS_CASEFOLDED(dir) &&
> +			(!IS_ENCRYPTED(dir) || fscrypt_has_encryption_key(dir));
> +}
> +EXPORT_SYMBOL(needs_casefold);
> +
>  static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *ct, unsigned tcount)
>  {
>  	/*
> @@ -247,7 +255,19 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c
>  	 * be no NUL in the ct/tcount data)
>  	 */
>  	const unsigned char *cs = READ_ONCE(dentry->d_name.name);
> +#ifdef CONFIG_UNICODE
> +	struct inode *parent = dentry->d_parent->d_inode;
>  
> +	if (unlikely(needs_casefold(parent))) {
> +		const struct qstr n1 = QSTR_INIT(cs, tcount);
> +		const struct qstr n2 = QSTR_INIT(ct, tcount);
> +		int result = utf8_strncasecmp(dentry->d_sb->s_encoding,
> +						&n1, &n2);
> +
> +		if (result >= 0 || sb_has_enc_strict_mode(dentry->d_sb))
> +			return result;
> +	}
> +#endif
>  	return dentry_string_cmp(cs, ct, tcount);
>  }
>  
> @@ -2404,7 +2424,22 @@ struct dentry *d_hash_and_lookup(struct dentry *dir, struct qstr *name)
>  	 * calculate the standard hash first, as the d_op->d_hash()
>  	 * routine may choose to leave the hash value unchanged.
>  	 */
> +#ifdef CONFIG_UNICODE
> +	unsigned char *hname = NULL;
> +	int hlen = name->len;
> +
> +	if (IS_CASEFOLDED(dir->d_inode)) {
> +		hname = kmalloc(PATH_MAX, GFP_ATOMIC);
> +		if (!hname)
> +			return ERR_PTR(-ENOMEM);
> +		hlen = utf8_casefold(dir->d_sb->s_encoding,
> +					name, hname, PATH_MAX);
> +	}
> +	name->hash = full_name_hash(dir, hname ?: name->name, hlen);
> +	kfree(hname);
> +#else
>  	name->hash = full_name_hash(dir, name->name, name->len);
> +#endif
>  	if (dir->d_flags & DCACHE_OP_HASH) {
>  		int err = dir->d_op->d_hash(dir, name);
>  		if (unlikely(err < 0))
> diff --git a/fs/namei.c b/fs/namei.c
> index 2dda552bcf7a..b8d5cb0994ec 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -39,6 +39,7 @@
>  #include <linux/bitops.h>
>  #include <linux/init_task.h>
>  #include <linux/uaccess.h>
> +#include <linux/unicode.h>
>  
>  #include "internal.h"
>  #include "mount.h"
> @@ -2062,6 +2063,10 @@ static inline u64 hash_name(const void *salt, const char *name)
>  static int link_path_walk(const char *name, struct nameidata *nd)
>  {
>  	int err;
> +#ifdef CONFIG_UNICODE
> +	char *hname = NULL;
> +	int hlen = 0;
> +#endif
>  
>  	if (IS_ERR(name))
>  		return PTR_ERR(name);
> @@ -2078,9 +2083,21 @@ static int link_path_walk(const char *name, struct nameidata *nd)
>  		err = may_lookup(nd);
>  		if (err)
>  			return err;
> -
> +#ifdef CONFIG_UNICODE
> +		if (needs_casefold(nd->path.dentry->d_inode)) {
> +			struct qstr str = QSTR_INIT(name, PATH_MAX);
> +
> +			hname = kmalloc(PATH_MAX, GFP_ATOMIC);
> +			if (!hname)
> +				return -ENOMEM;
> +			hlen = utf8_casefold(nd->path.dentry->d_sb->s_encoding,
> +						&str, hname, PATH_MAX);
> +		}
> +		hash_len = hash_name(nd->path.dentry, hname ?: name);
> +		kfree(hname);

It would be nice to reuse the memory allocation for the entire path walk
instead of allocating and freeing several times in a row.  Still not
ideal, but I don't see how we could not have at least one allocation here.

> +#else
>  		hash_len = hash_name(nd->path.dentry, name);
> -
> +#endif
>  		type = LAST_NORM;
>  		if (name[0] == '.') switch (hashlen_len(hash_len)) {
>  			case 2:
> @@ -2452,9 +2469,29 @@ EXPORT_SYMBOL(vfs_path_lookup);
>  static int lookup_one_len_common(const char *name, struct dentry *base,
>  				 int len, struct qstr *this)
>  {
> +#ifdef CONFIG_UNICODE
> +	char *hname = NULL;
> +	int hlen = len;
> +
> +	if (needs_casefold(base->d_inode)) {
> +		struct qstr str = QSTR_INIT(name, len);
> +
> +		hname = kmalloc(PATH_MAX, GFP_ATOMIC);
> +		if (!hname)
> +			return -ENOMEM;
> +		hlen = utf8_casefold(base->d_sb->s_encoding,
> +					&str, hname, PATH_MAX);
> +	}
> +	this->hash = full_name_hash(base, hname ?: name, hlen);
> +	kvfree(hname);
> +#else
> +	this->hash = full_name_hash(base, name, len);
> +#endif
>  	this->name = name;
>  	this->len = len;
> -	this->hash = full_name_hash(base, name, len);
> +#ifdef CONFIG_UNICODE
> +	kfree(hname);
> +#endif
>  	if (!len)
>  		return -EACCES;
>  
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index c159a8bdee8b..38d1c20f3e6f 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1382,6 +1382,12 @@ extern int send_sigurg(struct fown_struct *fown);
>  #define SB_ACTIVE	(1<<30)
>  #define SB_NOUSER	(1<<31)
>  
> +/* These flags relate to encoding and casefolding */
> +#define SB_ENC_STRICT_MODE_FL	(1 << 0)
> +
> +#define sb_has_enc_strict_mode(sb) \
> +	(sb->s_encoding_flags & SB_ENC_STRICT_MODE_FL)
> +
>  /*
>   *	Umount options
>   */
> @@ -1449,6 +1455,10 @@ struct super_block {
>  #endif
>  #ifdef CONFIG_FS_VERITY
>  	const struct fsverity_operations *s_vop;
> +#endif
> +#ifdef CONFIG_UNICODE
> +	struct unicode_map *s_encoding;
> +	__u16 s_encoding_flags;
>  #endif
>  	struct hlist_bl_head	s_roots;	/* alternate root dentries for NFS */
>  	struct list_head	s_mounts;	/* list of mounts; _not_ for fs use */
> @@ -2044,6 +2054,8 @@ static inline bool sb_rdonly(const struct super_block *sb) { return sb->s_flags
>  #define IS_WHITEOUT(inode)	(S_ISCHR(inode->i_mode) && \
>  				 (inode)->i_rdev == WHITEOUT_DEV)
>  
> +extern bool needs_casefold(const struct inode *dir);
> +
>  static inline bool HAS_UNMAPPED_ID(struct inode *inode)
>  {
>  	return !uid_valid(inode->i_uid) || !gid_valid(inode->i_gid);

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] vfs: Fold casefolding into vfs
  2019-12-03  7:41   ` Gao Xiang
@ 2019-12-03 19:42     ` Gabriel Krisman Bertazi
  2019-12-03 20:34       ` Eric Biggers
  0 siblings, 1 reply; 21+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-12-03 19:42 UTC (permalink / raw)
  To: Gao Xiang
  Cc: Daniel Rosenberg, Theodore Ts'o, linux-ext4, Jaegeuk Kim,
	Chao Yu, linux-f2fs-devel, Eric Biggers, linux-fscrypt,
	Alexander Viro, Andreas Dilger, Jonathan Corbet, linux-doc,
	linux-kernel, linux-fsdevel, kernel-team

Gao Xiang <gaoxiang25@huawei.com> writes:

> On Mon, Dec 02, 2019 at 09:10:45PM -0800, Daniel Rosenberg wrote:
>> Ext4 and F2fs are both using casefolding, and they, along with any other
>> filesystem that adds the feature, will be using identical dentry_ops.
>> Additionally, those dentry ops interfere with the dentry_ops required
>> for fscrypt once we add support for casefolding and encryption.
>> Moving this into the vfs removes code duplication as well as the
>> complication with encryption.
>> 
>> Currently this is pretty close to just moving the existing f2fs/ext4
>> code up a level into the vfs, although there is a lot of room for
>> improvement now.
>> 
>> Signed-off-by: Daniel Rosenberg <drosen@google.com>
>
> I'm afraid that such vfs modification is unneeded.
>
> Just a quick glance it seems just can be replaced by introducing some
> .d_cmp, .d_hash helpers (or with little modification) and most non-Android
> emulated storage files are not casefolded (even in Android).
>
> "those dentry ops interfere with the dentry_ops required for fscrypt",
> I don't think it's a real diffculty and it could be done with some
> better approach instead.

It would be good to avoid dentry_ops in general for these cases.  It
doesn't just interfere with fscrypt, but also overlayfs and others.

The difficulty is that it is not trivial to change dentry_ops after
dentries are already installed in the dcache.  Which means that it is
hard to use different dentry_ops for different parts of the filesystem,
for instance when converting a directory to case-insensitive or back
to case-sensitive.

In fact, currently and for case-insensitive at least, we install generic
hooks for the entire case-insensitive filesystem and use it even for
!IS_CASEFOLDED() directories. This breaks overlayfs even if we don't
have a single IS_CASEFOLDED() directory at all, just by having the
superblock flag, we *must* set the dentry_ops, which already breaks
overlayfs.

I think Daniel's approach of moving this into VFS is the simplest way to
actually solve the issue, instead of extending and duplicating a lot of
functionality into filesystem hooks to support the possible mixes of
case-insensitive, overlayfs and fscrypt.

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 6/8] ext4: Use struct super_block's casefold data
  2019-12-03  5:10 ` [PATCH 6/8] ext4: Use struct super_block's casefold data Daniel Rosenberg
@ 2019-12-03 19:44   ` Gabriel Krisman Bertazi
  0 siblings, 0 replies; 21+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-12-03 19:44 UTC (permalink / raw)
  To: Daniel Rosenberg
  Cc: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro,
	Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, kernel-team

Daniel Rosenberg <drosen@google.com> writes:

> Switch over to using the struct entries added to the VFS, and
> remove the redundant dentry operations.

getting this in VFS instead of per filesystem is gonna allow us to mix
not only fscrypt with case-insensitive but also overlayfs.

Need to do a closer review, but thanks for doing this!!

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] vfs: Fold casefolding into vfs
  2019-12-03 19:42     ` Gabriel Krisman Bertazi
@ 2019-12-03 20:34       ` Eric Biggers
  2019-12-03 21:21         ` Gabriel Krisman Bertazi
  0 siblings, 1 reply; 21+ messages in thread
From: Eric Biggers @ 2019-12-03 20:34 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: Gao Xiang, Daniel Rosenberg, Theodore Ts'o, linux-ext4,
	Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-fscrypt,
	Alexander Viro, Andreas Dilger, Jonathan Corbet, linux-doc,
	linux-kernel, linux-fsdevel, kernel-team

On Tue, Dec 03, 2019 at 02:42:10PM -0500, Gabriel Krisman Bertazi wrote:
> Gao Xiang <gaoxiang25@huawei.com> writes:
> 
> > On Mon, Dec 02, 2019 at 09:10:45PM -0800, Daniel Rosenberg wrote:
> >> Ext4 and F2fs are both using casefolding, and they, along with any other
> >> filesystem that adds the feature, will be using identical dentry_ops.
> >> Additionally, those dentry ops interfere with the dentry_ops required
> >> for fscrypt once we add support for casefolding and encryption.
> >> Moving this into the vfs removes code duplication as well as the
> >> complication with encryption.
> >> 
> >> Currently this is pretty close to just moving the existing f2fs/ext4
> >> code up a level into the vfs, although there is a lot of room for
> >> improvement now.
> >> 
> >> Signed-off-by: Daniel Rosenberg <drosen@google.com>
> >
> > I'm afraid that such vfs modification is unneeded.
> >
> > Just a quick glance it seems just can be replaced by introducing some
> > .d_cmp, .d_hash helpers (or with little modification) and most non-Android
> > emulated storage files are not casefolded (even in Android).
> >
> > "those dentry ops interfere with the dentry_ops required for fscrypt",
> > I don't think it's a real diffculty and it could be done with some
> > better approach instead.
> 
> It would be good to avoid dentry_ops in general for these cases.  It
> doesn't just interfere with fscrypt, but also overlayfs and others.
> 
> The difficulty is that it is not trivial to change dentry_ops after
> dentries are already installed in the dcache.  Which means that it is
> hard to use different dentry_ops for different parts of the filesystem,
> for instance when converting a directory to case-insensitive or back
> to case-sensitive.
> 
> In fact, currently and for case-insensitive at least, we install generic
> hooks for the entire case-insensitive filesystem and use it even for
> !IS_CASEFOLDED() directories. This breaks overlayfs even if we don't
> have a single IS_CASEFOLDED() directory at all, just by having the
> superblock flag, we *must* set the dentry_ops, which already breaks
> overlayfs.
> 
> I think Daniel's approach of moving this into VFS is the simplest way to
> actually solve the issue, instead of extending and duplicating a lot of
> functionality into filesystem hooks to support the possible mixes of
> case-insensitive, overlayfs and fscrypt.
> 

I think we can actually get everything we want using dentry_operations only,
since the filesystem can set ->d_op during ->lookup() (like what is done for
encrypted filenames now) rather than at dentry allocation time.  And fs/crypto/
can export fscrypt_d_revalidate() rather than setting ->d_op itself.

See the untested patch below.

It's definitely ugly to have to handle the 3 cases of encrypt, casefold, and
encrypt+casefold separately -- and this will need to be duplicated for each
filesystem.  But we do have to weigh that against adding additional complexity
and overhead to the VFS for everyone.  If we do go with the VFS changes, please
try to make them as simple and unobtrusive as possible.

diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
index 3719efa546c6..cfa44adff2b3 100644
--- a/fs/crypto/crypto.c
+++ b/fs/crypto/crypto.c
@@ -290,7 +290,7 @@ EXPORT_SYMBOL(fscrypt_decrypt_block_inplace);
  * Validate dentries in encrypted directories to make sure we aren't potentially
  * caching stale dentries after a key has been added.
  */
-static int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags)
+int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags)
 {
 	struct dentry *dir;
 	int err;
@@ -329,10 +329,7 @@ static int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags)
 
 	return valid;
 }
-
-const struct dentry_operations fscrypt_d_ops = {
-	.d_revalidate = fscrypt_d_revalidate,
-};
+EXPORT_SYMBOL_GPL(fscrypt_d_revalidate);
 
 /**
  * fscrypt_initialize() - allocate major buffers for fs encryption.
diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
index 130b50e5a011..4420670ac40a 100644
--- a/fs/crypto/fscrypt_private.h
+++ b/fs/crypto/fscrypt_private.h
@@ -233,7 +233,6 @@ extern int fscrypt_crypt_block(const struct inode *inode,
 			       unsigned int len, unsigned int offs,
 			       gfp_t gfp_flags);
 extern struct page *fscrypt_alloc_bounce_page(gfp_t gfp_flags);
-extern const struct dentry_operations fscrypt_d_ops;
 
 extern void __printf(3, 4) __cold
 fscrypt_msg(const struct inode *inode, const char *level, const char *fmt, ...);
diff --git a/fs/crypto/hooks.c b/fs/crypto/hooks.c
index bb3b7fcfdd48..ec81b6a597aa 100644
--- a/fs/crypto/hooks.c
+++ b/fs/crypto/hooks.c
@@ -116,7 +116,6 @@ int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
 		spin_lock(&dentry->d_lock);
 		dentry->d_flags |= DCACHE_ENCRYPTED_NAME;
 		spin_unlock(&dentry->d_lock);
-		d_set_d_op(dentry, &fscrypt_d_ops);
 	}
 	return err;
 }
diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index 9fdd2b269d61..bd3c14e6b24a 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -704,9 +704,47 @@ static int ext4_d_hash(const struct dentry *dentry, struct qstr *str)
 	kfree(norm);
 	return ret;
 }
+#endif /* !CONFIG_UNICODE */
 
-const struct dentry_operations ext4_dentry_ops = {
+#ifdef CONFIG_UNICODE
+static const struct dentry_operations ext4_ci_dentry_ops = {
+	.d_hash = ext4_d_hash,
+	.d_compare = ext4_d_compare,
+};
+#endif
+
+#ifdef CONFIG_FS_ENCRYPTION
+static const struct dentry_operations ext4_encrypted_dentry_ops = {
+	.d_revalidate = fscrypt_d_revalidate,
+};
+#endif
+
+#if IS_ENABLED(CONFIG_UNICODE) && IS_ENABLED(CONFIG_FS_ENCRYPTION)
+static const struct dentry_operations ext4_encrypted_ci_dentry_ops = {
 	.d_hash = ext4_d_hash,
 	.d_compare = ext4_d_compare,
+	.d_revalidate = fscrypt_d_revalidate,
 };
 #endif
+
+void ext4_set_d_ops(struct inode *dir, struct dentry *dentry)
+{
+#ifdef CONFIG_FS_ENCRYPTION
+	if (dentry->d_flags & DCACHE_ENCRYPTED_NAME) {
+#ifdef CONFIG_UNICODE
+		if (IS_CASEFOLDED(dir)) {
+			d_set_d_op(dentry, &ext4_encrypted_ci_dentry_ops);
+			return;
+		}
+#endif
+		d_set_d_op(dentry, &ext4_encrypted_dentry_ops);
+		return;
+	}
+#endif
+#ifdef CONFIG_UNICODE
+	if (IS_CASEFOLDED(dir)) {
+		d_set_d_op(dentry, &ext4_ci_dentry_ops);
+		return;
+	}
+#endif
+}
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f8578caba40d..00a10015a53c 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2499,6 +2499,8 @@ static inline  unsigned char get_dtype(struct super_block *sb, int filetype)
 extern int ext4_check_all_de(struct inode *dir, struct buffer_head *bh,
 			     void *buf, int buf_size);
 
+void ext4_set_d_ops(struct inode *dir, struct dentry *dentry);
+
 /* fsync.c */
 extern int ext4_sync_file(struct file *, loff_t, loff_t, int);
 
@@ -3097,10 +3099,6 @@ static inline void ext4_unlock_group(struct super_block *sb,
 /* dir.c */
 extern const struct file_operations ext4_dir_operations;
 
-#ifdef CONFIG_UNICODE
-extern const struct dentry_operations ext4_dentry_ops;
-#endif
-
 /* file.c */
 extern const struct inode_operations ext4_file_inode_operations;
 extern const struct file_operations ext4_file_operations;
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index a856997d87b5..4df1d074b393 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -1608,6 +1608,7 @@ static struct buffer_head *ext4_lookup_entry(struct inode *dir,
 	struct buffer_head *bh;
 
 	err = ext4_fname_prepare_lookup(dir, dentry, &fname);
+	ext4_set_d_ops(dir, dentry);
 	if (err == -ENOENT)
 		return NULL;
 	if (err)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 1d82b56d9b11..ac593e9af270 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -4498,11 +4498,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
 		goto failed_mount4;
 	}
 
-#ifdef CONFIG_UNICODE
-	if (sbi->s_encoding)
-		sb->s_d_op = &ext4_dentry_ops;
-#endif
-
 	sb->s_root = d_make_root(root);
 	if (!sb->s_root) {
 		ext4_msg(sb, KERN_ERR, "get root dentry failed");
diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
index 1a7bffe78ed5..0de461f2225a 100644
--- a/include/linux/fscrypt.h
+++ b/include/linux/fscrypt.h
@@ -120,6 +120,8 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
 
 extern void fscrypt_free_bounce_page(struct page *bounce_page);
 
+extern int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags);
+
 /* policy.c */
 extern int fscrypt_ioctl_set_policy(struct file *, const void __user *);
 extern int fscrypt_ioctl_get_policy(struct file *, void __user *);
-- 
2.24.0


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] vfs: Fold casefolding into vfs
  2019-12-03 20:34       ` Eric Biggers
@ 2019-12-03 21:21         ` Gabriel Krisman Bertazi
  2019-12-04  0:32           ` Eric Biggers
  0 siblings, 1 reply; 21+ messages in thread
From: Gabriel Krisman Bertazi @ 2019-12-03 21:21 UTC (permalink / raw)
  To: Eric Biggers
  Cc: Gao Xiang, Daniel Rosenberg, Theodore Ts'o, linux-ext4,
	Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-fscrypt,
	Alexander Viro, Andreas Dilger, Jonathan Corbet, linux-doc,
	linux-kernel, linux-fsdevel, kernel-team

Eric Biggers <ebiggers@kernel.org> writes:

> On Tue, Dec 03, 2019 at 02:42:10PM -0500, Gabriel Krisman Bertazi wrote:
>> Gao Xiang <gaoxiang25@huawei.com> writes:

>> I think Daniel's approach of moving this into VFS is the simplest way to
>> actually solve the issue, instead of extending and duplicating a lot of
>> functionality into filesystem hooks to support the possible mixes of
>> case-insensitive, overlayfs and fscrypt.
>> 
>
> I think we can actually get everything we want using dentry_operations only,
> since the filesystem can set ->d_op during ->lookup() (like what is done for
> encrypted filenames now) rather than at dentry allocation time.  And fs/crypto/
> can export fscrypt_d_revalidate() rather than setting ->d_op itself.

Problem is, differently from fscrypt, case-insensitive uses the d_hash()
hook and for a lookup, we actually use
dentry->d_parent->d_ops->d_hash().  Which works well, until you are flipping the
casefold flag.  Then the dentry already exists and you need to modify
the d_ops on the fly, which I couldn't find precedent anywhere.  I tried
invalidating the dentry whenever we flip the flag, but then if it has
negative dentries as children,I wasn't able to reliably invalidate it,
and that's when I reached the limit of my knowledge in VFS.  In
particular, in every attempt I made to implement it like this, I was
able to race and do a case-insensitive lookup on a directory that was
just made case sensitive.

I'm not saying there isn't a way.  But it is a bit harder than this
proposal. I tried it already and still didn't manage to make it work.
Maybe someone who better understands vfs.

> It's definitely ugly to have to handle the 3 cases of encrypt, casefold, and
> encrypt+casefold separately -- and this will need to be duplicated for each
> filesystem.  But we do have to weigh that against adding additional complexity
> and overhead to the VFS for everyone.  If we do go with the VFS changes, please
> try to make them as simple and unobtrusive as possible.

Well, it is just not case-insensitive+fscrypt. Also overlayfs
there. Probably more.  So we have much more cases.  I understand the VFS
changes need to be very well thought, but when I worked on this it
started to look a more correct solution than using the hooks.

> diff --git a/fs/crypto/crypto.c b/fs/crypto/crypto.c
> index 3719efa546c6..cfa44adff2b3 100644
> --- a/fs/crypto/crypto.c
> +++ b/fs/crypto/crypto.c
> @@ -290,7 +290,7 @@ EXPORT_SYMBOL(fscrypt_decrypt_block_inplace);
>   * Validate dentries in encrypted directories to make sure we aren't potentially
>   * caching stale dentries after a key has been added.
>   */
> -static int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags)
> +int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags)
>  {
>  	struct dentry *dir;
>  	int err;
> @@ -329,10 +329,7 @@ static int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags)
>  
>  	return valid;
>  }
> -
> -const struct dentry_operations fscrypt_d_ops = {
> -	.d_revalidate = fscrypt_d_revalidate,
> -};
> +EXPORT_SYMBOL_GPL(fscrypt_d_revalidate);
>  
>  /**
>   * fscrypt_initialize() - allocate major buffers for fs encryption.
> diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
> index 130b50e5a011..4420670ac40a 100644
> --- a/fs/crypto/fscrypt_private.h
> +++ b/fs/crypto/fscrypt_private.h
> @@ -233,7 +233,6 @@ extern int fscrypt_crypt_block(const struct inode *inode,
>  			       unsigned int len, unsigned int offs,
>  			       gfp_t gfp_flags);
>  extern struct page *fscrypt_alloc_bounce_page(gfp_t gfp_flags);
> -extern const struct dentry_operations fscrypt_d_ops;
>  
>  extern void __printf(3, 4) __cold
>  fscrypt_msg(const struct inode *inode, const char *level, const char *fmt, ...);
> diff --git a/fs/crypto/hooks.c b/fs/crypto/hooks.c
> index bb3b7fcfdd48..ec81b6a597aa 100644
> --- a/fs/crypto/hooks.c
> +++ b/fs/crypto/hooks.c
> @@ -116,7 +116,6 @@ int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
>  		spin_lock(&dentry->d_lock);
>  		dentry->d_flags |= DCACHE_ENCRYPTED_NAME;
>  		spin_unlock(&dentry->d_lock);
> -		d_set_d_op(dentry, &fscrypt_d_ops);
>  	}
>  	return err;
>  }
> diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
> index 9fdd2b269d61..bd3c14e6b24a 100644
> --- a/fs/ext4/dir.c
> +++ b/fs/ext4/dir.c
> @@ -704,9 +704,47 @@ static int ext4_d_hash(const struct dentry *dentry, struct qstr *str)
>  	kfree(norm);
>  	return ret;
>  }
> +#endif /* !CONFIG_UNICODE */
>  
> -const struct dentry_operations ext4_dentry_ops = {
> +#ifdef CONFIG_UNICODE
> +static const struct dentry_operations ext4_ci_dentry_ops = {
> +	.d_hash = ext4_d_hash,
> +	.d_compare = ext4_d_compare,
> +};
> +#endif
> +
> +#ifdef CONFIG_FS_ENCRYPTION
> +static const struct dentry_operations ext4_encrypted_dentry_ops = {
> +	.d_revalidate = fscrypt_d_revalidate,
> +};
> +#endif
> +
> +#if IS_ENABLED(CONFIG_UNICODE) && IS_ENABLED(CONFIG_FS_ENCRYPTION)
> +static const struct dentry_operations ext4_encrypted_ci_dentry_ops = {
>  	.d_hash = ext4_d_hash,
>  	.d_compare = ext4_d_compare,
> +	.d_revalidate = fscrypt_d_revalidate,
>  };
>  #endif
> +
> +void ext4_set_d_ops(struct inode *dir, struct dentry *dentry)
> +{
> +#ifdef CONFIG_FS_ENCRYPTION
> +	if (dentry->d_flags & DCACHE_ENCRYPTED_NAME) {
> +#ifdef CONFIG_UNICODE
> +		if (IS_CASEFOLDED(dir)) {
> +			d_set_d_op(dentry, &ext4_encrypted_ci_dentry_ops);
> +			return;
> +		}
> +#endif
> +		d_set_d_op(dentry, &ext4_encrypted_dentry_ops);
> +		return;
> +	}
> +#endif
> +#ifdef CONFIG_UNICODE
> +	if (IS_CASEFOLDED(dir)) {
> +		d_set_d_op(dentry, &ext4_ci_dentry_ops);
> +		return;
> +	}
> +#endif
> +}
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index f8578caba40d..00a10015a53c 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2499,6 +2499,8 @@ static inline  unsigned char get_dtype(struct super_block *sb, int filetype)
>  extern int ext4_check_all_de(struct inode *dir, struct buffer_head *bh,
>  			     void *buf, int buf_size);
>  
> +void ext4_set_d_ops(struct inode *dir, struct dentry *dentry);
> +
>  /* fsync.c */
>  extern int ext4_sync_file(struct file *, loff_t, loff_t, int);
>  
> @@ -3097,10 +3099,6 @@ static inline void ext4_unlock_group(struct super_block *sb,
>  /* dir.c */
>  extern const struct file_operations ext4_dir_operations;
>  
> -#ifdef CONFIG_UNICODE
> -extern const struct dentry_operations ext4_dentry_ops;
> -#endif
> -
>  /* file.c */
>  extern const struct inode_operations ext4_file_inode_operations;
>  extern const struct file_operations ext4_file_operations;
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index a856997d87b5..4df1d074b393 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -1608,6 +1608,7 @@ static struct buffer_head *ext4_lookup_entry(struct inode *dir,
>  	struct buffer_head *bh;
>  
>  	err = ext4_fname_prepare_lookup(dir, dentry, &fname);
> +	ext4_set_d_ops(dir, dentry);
>  	if (err == -ENOENT)
>  		return NULL;
>  	if (err)
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index 1d82b56d9b11..ac593e9af270 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -4498,11 +4498,6 @@ static int ext4_fill_super(struct super_block *sb, void *data, int silent)
>  		goto failed_mount4;
>  	}
>  
> -#ifdef CONFIG_UNICODE
> -	if (sbi->s_encoding)
> -		sb->s_d_op = &ext4_dentry_ops;
> -#endif
> -
>  	sb->s_root = d_make_root(root);
>  	if (!sb->s_root) {
>  		ext4_msg(sb, KERN_ERR, "get root dentry failed");
> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> index 1a7bffe78ed5..0de461f2225a 100644
> --- a/include/linux/fscrypt.h
> +++ b/include/linux/fscrypt.h
> @@ -120,6 +120,8 @@ static inline struct page *fscrypt_pagecache_page(struct page *bounce_page)
>  
>  extern void fscrypt_free_bounce_page(struct page *bounce_page);
>  
> +extern int fscrypt_d_revalidate(struct dentry *dentry, unsigned int flags);
> +
>  /* policy.c */
>  extern int fscrypt_ioctl_set_policy(struct file *, const void __user *);
>  extern int fscrypt_ioctl_get_policy(struct file *, void __user *);

-- 
Gabriel Krisman Bertazi

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 1/8] fscrypt: Add siphash and hash key for policy v2
  2019-12-03  5:10 ` [PATCH 1/8] fscrypt: Add siphash and hash key for policy v2 Daniel Rosenberg
@ 2019-12-03 23:25   ` Eric Biggers
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Biggers @ 2019-12-03 23:25 UTC (permalink / raw)
  To: Daniel Rosenberg
  Cc: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, linux-fscrypt, Alexander Viro, Andreas Dilger,
	Jonathan Corbet, linux-doc, linux-kernel, linux-fsdevel,
	Gabriel Krisman Bertazi, kernel-team

On Mon, Dec 02, 2019 at 09:10:42PM -0800, Daniel Rosenberg wrote:
> When using casefolding along with encryption, we need to use a
> cryptographic hash to allow fast filesystem operations while not knowing
> the case of the name stored on disk while not revealing extra
> information about the name if the key is not present.

This sentence is hard to parse.  Can you make it any clearer?

> 
> When a v2 policy is used on a directory, we derive a key for use with
> siphash.
> 
> Signed-off-by: Daniel Rosenberg <drosen@google.com>
> ---
>  fs/crypto/fname.c           | 22 ++++++++++++++++++++++
>  fs/crypto/fscrypt_private.h |  9 +++++++++
>  fs/crypto/keysetup.c        | 29 ++++++++++++++++++++---------
>  include/linux/fscrypt.h     |  8 ++++++++
>  4 files changed, 59 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
> index 3da3707c10e3..b33f03b9f892 100644
> --- a/fs/crypto/fname.c
> +++ b/fs/crypto/fname.c
> @@ -12,6 +12,7 @@
>   */
>  
>  #include <linux/scatterlist.h>
> +#include <linux/siphash.h>
>  #include <crypto/skcipher.h>
>  #include "fscrypt_private.h"
>  
> @@ -400,3 +401,24 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
>  	return ret;
>  }
>  EXPORT_SYMBOL(fscrypt_setup_filename);
> +
> +/**
> + * fscrypt_fname_siphash() - Calculate the siphash for a file name
> + * @dir: the parent directory
> + * @name: the name of the file to get the siphash of
> + *
> + * Given a user-provided filename @name, this function calculates the siphash of
> + * that name using the hash key stored with the directory's policy.

I suggest writing "using the directory's hash key" instead of "using the hash
key stored with the directory's policy", since the latter might be misunderstood
as meaning that the hash key is stored on-disk.

Also it would be helpful to document the assumptions:

	The directory must use a v2 encryption policy, and its key must be available.

> + *
> + *
> + * Return: the siphash of @name using the hash key of @dir
> + */
> +u64 fscrypt_fname_siphash(const struct inode *dir, const struct qstr *name)
> +{
> +	struct fscrypt_info *ci = dir->i_crypt_info;
> +
> +	WARN_ON(!ci || !ci->ci_hash_key_initialized);
> +
> +	return siphash(name->name, name->len, &ci->ci_hash_key);
> +}
> +EXPORT_SYMBOL(fscrypt_fname_siphash);

The !ci part of the WARN_ON is pointless because if it ever triggers, there will
be a NULL dereference afterwards anyway.  I suggest changing it to just:

	WARN_ON(!ci->ci_hash_key_initialized);

> diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h
> index 130b50e5a011..f0dfef9921de 100644
> --- a/fs/crypto/fscrypt_private.h
> +++ b/fs/crypto/fscrypt_private.h
> @@ -12,6 +12,7 @@
>  #define _FSCRYPT_PRIVATE_H
>  
>  #include <linux/fscrypt.h>
> +#include <linux/siphash.h>
>  #include <crypto/hash.h>
>  
>  #define CONST_STRLEN(str)	(sizeof(str) - 1)
> @@ -194,6 +195,13 @@ struct fscrypt_info {
>  	 */
>  	struct fscrypt_direct_key *ci_direct_key;
>  
> +	/*
> +	 * With v2 policies, this can be used with siphash
> +	 * When the key has been set, ci_hash_key_initialized is set to true
> +	 */
> +	siphash_key_t ci_hash_key;
> +	bool ci_hash_key_initialized;
> +
>  	/* The encryption policy used by this inode */
>  	union fscrypt_policy ci_policy;
>  
> @@ -286,6 +294,7 @@ extern int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
>  #define HKDF_CONTEXT_PER_FILE_KEY	2
>  #define HKDF_CONTEXT_DIRECT_KEY		3
>  #define HKDF_CONTEXT_IV_INO_LBLK_64_KEY	4
> +#define HKDF_CONTEXT_FNAME_HASH_KEY     5
>  
>  extern int fscrypt_hkdf_expand(struct fscrypt_hkdf *hkdf, u8 context,
>  			       const u8 *info, unsigned int infolen,
> diff --git a/fs/crypto/keysetup.c b/fs/crypto/keysetup.c
> index f577bb6613f9..e6c7ec04cd25 100644
> --- a/fs/crypto/keysetup.c
> +++ b/fs/crypto/keysetup.c
> @@ -192,7 +192,7 @@ static int fscrypt_setup_v2_file_key(struct fscrypt_info *ci,
>  				     ci->ci_mode->friendly_name);
>  			return -EINVAL;
>  		}
> -		return setup_per_mode_key(ci, mk, mk->mk_direct_tfms,
> +		err = setup_per_mode_key(ci, mk, mk->mk_direct_tfms,
>  					  HKDF_CONTEXT_DIRECT_KEY, false);
>  	} else if (ci->ci_policy.v2.flags &
>  		   FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64) {
> @@ -202,20 +202,31 @@ static int fscrypt_setup_v2_file_key(struct fscrypt_info *ci,
>  		 * the IVs.  This format is optimized for use with inline
>  		 * encryption hardware compliant with the UFS or eMMC standards.
>  		 */
> -		return setup_per_mode_key(ci, mk, mk->mk_iv_ino_lblk_64_tfms,
> +		err = setup_per_mode_key(ci, mk, mk->mk_iv_ino_lblk_64_tfms,
>  					  HKDF_CONTEXT_IV_INO_LBLK_64_KEY,
>  					  true);
> -	}
> -
> -	err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
> +	} else {
> +		err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
>  				  HKDF_CONTEXT_PER_FILE_KEY,
>  				  ci->ci_nonce, FS_KEY_DERIVATION_NONCE_SIZE,
>  				  derived_key, ci->ci_mode->keysize);

Nit: keep continuation lines aligned when they don't exceed 80 characters:

		err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
					  HKDF_CONTEXT_PER_FILE_KEY,
					  ci->ci_nonce,
					  FS_KEY_DERIVATION_NONCE_SIZE,
					  derived_key, ci->ci_mode->keysize);

> -	if (err)
> -		return err;
> +		if (err)
> +			return err;
> +
> +		err = fscrypt_set_derived_key(ci, derived_key);
> +		memzero_explicit(derived_key, ci->ci_mode->keysize);
> +		if (err)
> +			return err;

This 'if (err)' check is in the wrong place.  It needs to be below the brace
below, so that it also checks the error from setup_per_mode_key().

> +	}
>  
> -	err = fscrypt_set_derived_key(ci, derived_key);
> -	memzero_explicit(derived_key, ci->ci_mode->keysize);
> +	if (S_ISDIR(ci->ci_inode->i_mode)) {
> +		err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
> +			  HKDF_CONTEXT_FNAME_HASH_KEY,
> +			  ci->ci_nonce, FS_KEY_DERIVATION_NONCE_SIZE,
> +			  (u8 *)&ci->ci_hash_key, sizeof(ci->ci_hash_key));

Nit: keep continuation lines aligned when they don't exceed 80 characters:

		err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
					  HKDF_CONTEXT_FNAME_HASH_KEY,
					  ci->ci_nonce,
					  FS_KEY_DERIVATION_NONCE_SIZE,
					  (u8 *)&ci->ci_hash_key,
					  sizeof(ci->ci_hash_key));

> +		if (!err)
> +			ci->ci_hash_key_initialized = true;
> +	}
>  	return err;

Nit: an early return on error would be better here
(consistent with the code above):

                if (err)
                        return err;
                ci->ci_hash_key_initialized = true;
        }
        return 0;

> +extern u64 fscrypt_fname_siphash(const struct inode *dir,
> +					const struct qstr *name);

Nit: align the continuation line:

extern u64 fscrypt_fname_siphash(const struct inode *dir,
				 const struct qstr *name);

>  
>  #define FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE	32
>  
> @@ -446,6 +448,12 @@ static inline int fscrypt_fname_disk_to_usr(struct inode *inode,
>  	return -EOPNOTSUPP;
>  }
>  
> +static inline u64 fscrypt_fname_siphash(const struct inode *inode,

In the other places the first parameter is called 'dir', not 'inode'.

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/8] fscrypt: Don't allow v1 policies with casefolding
  2019-12-03  5:10 ` [PATCH 2/8] fscrypt: Don't allow v1 policies with casefolding Daniel Rosenberg
@ 2019-12-03 23:37   ` Eric Biggers
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Biggers @ 2019-12-03 23:37 UTC (permalink / raw)
  To: Daniel Rosenberg
  Cc: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, linux-fscrypt, Alexander Viro, Andreas Dilger,
	Jonathan Corbet, linux-doc, linux-kernel, linux-fsdevel,
	Gabriel Krisman Bertazi, kernel-team

On Mon, Dec 02, 2019 at 09:10:43PM -0800, Daniel Rosenberg wrote:
> Casefolding requires a derived key for computing the siphash.
> This is available for v2 policies, but not v1, so we disallow it for v1.
> 
> Signed-off-by: Daniel Rosenberg <drosen@google.com>
> ---
>  fs/crypto/policy.c      | 26 +++++++++++++++++++++++---
>  fs/inode.c              |  8 ++++++++
>  include/linux/fscrypt.h |  7 +++++++
>  3 files changed, 38 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/crypto/policy.c b/fs/crypto/policy.c
> index 96f528071bed..94d96d3212d6 100644
> --- a/fs/crypto/policy.c
> +++ b/fs/crypto/policy.c
> @@ -67,9 +67,9 @@ static bool supported_iv_ino_lblk_64_policy(
>   * fscrypt_supported_policy - check whether an encryption policy is supported
>   *
>   * Given an encryption policy, check whether all its encryption modes and other
> - * settings are supported by this kernel.  (But we don't currently don't check
> - * for crypto API support here, so attempting to use an algorithm not configured
> - * into the crypto API will still fail later.)
> + * settings are supported by this kernel on the given inode.  (But we don't
> + * currently don't check for crypto API support here, so attempting to use an
> + * algorithm not configured into the crypto API will still fail later.)
>   *
>   * Return: %true if supported, else %false
>   */
> @@ -97,6 +97,12 @@ bool fscrypt_supported_policy(const union fscrypt_policy *policy_u,
>  			return false;
>  		}
>  
> +		if (IS_CASEFOLDED(inode)) {
> +			fscrypt_warn(inode,
> +				     "v1 policy does not support casefolded directories");
> +			return false;
> +		}
> +
>  		return true;
>  	}
>  	case FSCRYPT_POLICY_V2: {
> @@ -530,3 +536,17 @@ int fscrypt_inherit_context(struct inode *parent, struct inode *child,
>  	return preload ? fscrypt_get_encryption_info(child): 0;
>  }
>  EXPORT_SYMBOL(fscrypt_inherit_context);
> +
> +int fscrypt_set_casefolding_allowed(struct inode *inode)
> +{
> +	union fscrypt_policy policy;
> +	int ret = fscrypt_get_policy(inode, &policy);
> +
> +	if (ret < 0)
> +		return ret;

In fs/crypto/ we're trying to use 'err' rather than 'ret' when a variable can
only be 0 or a negative errno value.  I.e.:

        union fscrypt_policy policy;
        int err;

        err = fscrypt_get_policy(inode, &policy);
        if (err)
                return err;

> +
> +	if (policy.version == FSCRYPT_POLICY_V2)
> +		return 0;
> +	else
> +		return -EINVAL;
> +}

In kernel code normally an early return is used in cases like this.  I.e.:

	if (policy.version != FSCRYPT_POLICY_V2)
		return -EINVAL;

	return 0;

> @@ -2245,6 +2246,13 @@ int vfs_ioc_setflags_prepare(struct inode *inode, unsigned int oldflags,
>  	    !capable(CAP_LINUX_IMMUTABLE))
>  		return -EPERM;
>  
> +	/*
> +	 * When a directory is encrypted, the CASEFOLD flag can only be turned
> +	 * on if the fscrypt policy supports it.
> +	 */
> +	if (IS_ENCRYPTED(inode) && (flags & ~oldflags & FS_CASEFOLD_FL))
> +		return fscrypt_set_casefolding_allowed(inode);
> +
>  	return 0;
>  }
>  EXPORT_SYMBOL(vfs_ioc_setflags_prepare);

This needs to only return early on error.  Otherwise when people add checks for
more flags later, those checks will not be executed when the CASEFOLD flag is
enabled on an encrypted directory.

I.e.:

        if (IS_ENCRYPTED(inode) && (flags & ~oldflags & FS_CASEFOLD_FL)) {
                err = fscrypt_set_casefolding_allowed(inode);
                if (err)
                        return err;
        }

I'm also wondering if this is the right level of abstraction.
Maybe the API should be:

	err = fscrypt_ioc_setflags_prepare(inode, oldflags, flags);
	if (err)
		return err;

Then the VFS code will be "obvious", and the comment:

        /*
         * When a directory is encrypted, the CASEFOLD flag can only be turned
         * on if the fscrypt policy supports it.
         */

can be moved to the definition of fscrypt_ioc_setflags_prepare() in fs/crypto/.

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/8] fscrypt: Change format of no-key token
  2019-12-03  5:10 ` [PATCH 3/8] fscrypt: Change format of no-key token Daniel Rosenberg
@ 2019-12-04  0:09   ` Eric Biggers
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Biggers @ 2019-12-04  0:09 UTC (permalink / raw)
  To: Daniel Rosenberg
  Cc: Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, linux-fscrypt, Alexander Viro, Andreas Dilger,
	Jonathan Corbet, linux-doc, linux-kernel, linux-fsdevel,
	Gabriel Krisman Bertazi, kernel-team

On Mon, Dec 02, 2019 at 09:10:44PM -0800, Daniel Rosenberg wrote:
> Encrypted and casefolded names always require a dirtree hash, since
> their hash values cannot be generated without the key.

I feel like there should be another paragraph here that explains more clearly
(including for people who may not as familiar with fscrypt) what is meant by a
"no-key token", why they need to be changed, and why they are being changed even
for non-casefolded directories.

> 
> In the new format, we always base64 encode the same structure. For names
> that are less than 149 characters, we concatenate the provided hash and
> ciphertext. If the name is longer than 149 characters, we also include
> the sha256 of the remaining parts of the name. We then base64 encode the
> resulting data to get a representation of the name that is at most 252
> characters long, with a very low collision rate. We avoid needing to
> compute the sha256 apart from in the case of a very long filename, and
> then only need to compute the sha256 of possible matches if their
> ciphertext is also longer than 149.
> 
> Signed-off-by: Daniel Rosenberg <drosen@google.com>
> ---
>  fs/crypto/Kconfig       |   1 +
>  fs/crypto/fname.c       | 182 +++++++++++++++++++++++++++++-----------
>  include/linux/fscrypt.h |  92 ++++++--------------
>  3 files changed, 160 insertions(+), 115 deletions(-)
> 
> diff --git a/fs/crypto/Kconfig b/fs/crypto/Kconfig
> index ff5a1746cbae..6e0d56f0b993 100644
> --- a/fs/crypto/Kconfig
> +++ b/fs/crypto/Kconfig
> @@ -9,6 +9,7 @@ config FS_ENCRYPTION
>  	select CRYPTO_CTS
>  	select CRYPTO_SHA512
>  	select CRYPTO_HMAC
> +	select CRYPTO_SHA256
>  	select KEYS
>  	help
>  	  Enable encryption of files and directories.  This
> diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c
> index b33f03b9f892..03c837c461f2 100644
> --- a/fs/crypto/fname.c
> +++ b/fs/crypto/fname.c
> @@ -14,8 +14,46 @@
>  #include <linux/scatterlist.h>
>  #include <linux/siphash.h>
>  #include <crypto/skcipher.h>
> +#include <crypto/hash.h>
>  #include "fscrypt_private.h"
>  
> +static struct crypto_shash *sha256_hash_tfm;
> +
> +static int fscrypt_do_sha256(unsigned char *result,
> +	     const u8 *data, unsigned int data_len)
> +{
> +	struct crypto_shash *tfm = READ_ONCE(sha256_hash_tfm);
> +
> +	if (unlikely(!tfm)) {
> +		struct crypto_shash *prev_tfm;
> +
> +		tfm = crypto_alloc_shash("sha256", 0, 0);
> +		if (IS_ERR(tfm)) {
> +			if (PTR_ERR(tfm) == -ENOENT) {
> +				fscrypt_warn(NULL,
> +					     "Missing crypto API support for SHA-256");
> +				return -ENOPKG;
> +			}

No need to check for -ENOENT specifically here, since this patch makes
CONFIG_FS_ENCRYPTION select CONFIG_CRYPTO_SHA256, so sha256 will always be
available.  Just handle -ENOENT like any other error.

> +			fscrypt_err(NULL,
> +				    "Error allocating SHA-256 transform: %ld",
> +				    PTR_ERR(tfm));
> +			return PTR_ERR(tfm);
> +		}
> +		prev_tfm = cmpxchg(&sha256_hash_tfm, NULL, tfm);
> +		if (prev_tfm) {
> +			crypto_free_shash(tfm);
> +			tfm = prev_tfm;
> +		}
> +	}
> +	{
> +		SHASH_DESC_ON_STACK(desc, tfm);
> +
> +		desc->tfm = tfm;
> +
> +		return crypto_shash_digest(desc, data, data_len, result);
> +	}
> +}
> +
>  static inline bool fscrypt_is_dot_dotdot(const struct qstr *str)
>  {
>  	if (str->len == 1 && str->name[0] == '.')
> @@ -208,8 +246,7 @@ int fscrypt_fname_alloc_buffer(const struct inode *inode,
>  			       struct fscrypt_str *crypto_str)
>  {
>  	const u32 max_encoded_len =
> -		max_t(u32, BASE64_CHARS(FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE),
> -		      1 + BASE64_CHARS(sizeof(struct fscrypt_digested_name)));
> +		      BASE64_CHARS(sizeof(struct fscrypt_nokey_name));
>  	u32 max_presented_len;
>  
>  	max_presented_len = max(max_encoded_len, max_encrypted_len);
> @@ -243,8 +280,9 @@ EXPORT_SYMBOL(fscrypt_fname_free_buffer);
>   * The caller must have allocated sufficient memory for the @oname string.
>   *
>   * If the key is available, we'll decrypt the disk name; otherwise, we'll encode
> - * it for presentation.  Short names are directly base64-encoded, while long
> - * names are encoded in fscrypt_digested_name format.
> + * it for presentation.  The usr name is the base64 encoding of the dirtree hash
> + * value, the first 149 characters of the name, and the sha256 of the rest of
> + * the name, if longer than 149 characters.

Maybe write just "If the key is available, we'll decrypt the disk name;
otherwise, we'll encode it for presentation in fscrypt_nokey_name format.
See struct fscrypt_nokey_name for details."

... since this comment doesn't really need to go into the details that are
already covered in the long comment above struct fscrypt_nokey_name.

>   *
>   * Return: 0 on success, -errno on failure
>   */
> @@ -254,7 +292,9 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
>  			struct fscrypt_str *oname)
>  {
>  	const struct qstr qname = FSTR_TO_QSTR(iname);
> -	struct fscrypt_digested_name digested_name;
> +	struct fscrypt_nokey_name nokey_name;
> +	u32 size;
> +	int err = 0;
>  
>  	if (fscrypt_is_dot_dotdot(&qname)) {
>  		oname->name[0] = '.';
> @@ -269,25 +309,29 @@ int fscrypt_fname_disk_to_usr(struct inode *inode,
>  	if (fscrypt_has_encryption_key(inode))
>  		return fname_decrypt(inode, iname, oname);
>  
> -	if (iname->len <= FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE) {
> -		oname->len = base64_encode(iname->name, iname->len,
> -					   oname->name);
> -		return 0;
> -	}
> +	size = min_t(u32, iname->len, FSCRYPT_FNAME_UNDIGESTED_SIZE);
> +		memcpy(nokey_name.bytes, iname->name, size);
> +
>  	if (hash) {
> -		digested_name.hash = hash;
> -		digested_name.minor_hash = minor_hash;
> +		nokey_name.dirtree_hash[0] = hash;
> +		nokey_name.dirtree_hash[1] = minor_hash;
>  	} else {
> -		digested_name.hash = 0;
> -		digested_name.minor_hash = 0;
> +		nokey_name.dirtree_hash[0] = 0;
> +		nokey_name.dirtree_hash[1] = 0;
>  	}
> -	memcpy(digested_name.digest,
> -	       FSCRYPT_FNAME_DIGEST(iname->name, iname->len),
> -	       FSCRYPT_FNAME_DIGEST_SIZE);
> -	oname->name[0] = '_';
> -	oname->len = 1 + base64_encode((const u8 *)&digested_name,
> -				       sizeof(digested_name), oname->name + 1);
> -	return 0;
> +	size += sizeof(nokey_name.dirtree_hash);
> +
> +	if (iname->len > FSCRYPT_FNAME_UNDIGESTED_SIZE) {
> +		/* compute sha256 of remaining name */
> +		err = fscrypt_do_sha256(nokey_name.sha256,
> +				&iname->name[FSCRYPT_FNAME_UNDIGESTED_SIZE],
> +				iname->len - FSCRYPT_FNAME_UNDIGESTED_SIZE);
> +		if (err)
> +			return err;
> +		size += sizeof(nokey_name.sha256);
> +	}
> +	oname->len = base64_encode((const u8 *)&nokey_name, size, oname->name);
> +	return err;
>  }
>  EXPORT_SYMBOL(fscrypt_fname_disk_to_usr);
>  
> @@ -319,7 +363,6 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
>  			      int lookup, struct fscrypt_name *fname)

This function has a comment that needs to be updated:

 * Else, for keyless @lookup operations, @iname is the presented ciphertext, so
 * we decode it to get either the ciphertext disk_name (for short names) or the
 * fscrypt_digested_name (for long names).  Non-@lookup operations will be
 * impossible in this case, so we fail them with ENOKEY.

=>

 * Else, for keyless @lookup operations, @iname is the presented ciphertext, so
 * we decode it to get the fscrypt_nokey_name.  Non-@lookup operations will be
 * impossible in this case, so we fail them with ENOKEY.

>  {
>  	int ret;
> -	int digested;
>  
>  	memset(fname, 0, sizeof(struct fscrypt_name));
>  	fname->usr_fname = iname;
> @@ -359,42 +402,32 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname,
>  	 * We don't have the key and we are doing a lookup; decode the
>  	 * user-supplied name
>  	 */
> -	if (iname->name[0] == '_') {
> -		if (iname->len !=
> -		    1 + BASE64_CHARS(sizeof(struct fscrypt_digested_name)))
> -			return -ENOENT;
> -		digested = 1;
> -	} else {
> -		if (iname->len >
> -		    BASE64_CHARS(FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE))
> -			return -ENOENT;
> -		digested = 0;
> -	}
>  
>  	fname->crypto_buf.name =
> -		kmalloc(max_t(size_t, FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE,
> -			      sizeof(struct fscrypt_digested_name)),
> -			GFP_KERNEL);
> +			kmalloc(sizeof(struct fscrypt_nokey_name), GFP_KERNEL);
>  	if (fname->crypto_buf.name == NULL)
>  		return -ENOMEM;

Need to validate that the filename isn't too long before decoding it.  Otherwise
it might overflow this buffer.

>  
> -	ret = base64_decode(iname->name + digested, iname->len - digested,
> +	ret = base64_decode(iname->name, iname->len,
>  			    fname->crypto_buf.name);
>  	if (ret < 0) {
>  		ret = -ENOENT;
>  		goto errout;
>  	}
> -	fname->crypto_buf.len = ret;
> -	if (digested) {
> -		const struct fscrypt_digested_name *n =
> -			(const void *)fname->crypto_buf.name;
> -		fname->hash = n->hash;
> -		fname->minor_hash = n->minor_hash;
> -	} else {
> -		fname->disk_name.name = fname->crypto_buf.name;
> -		fname->disk_name.len = fname->crypto_buf.len;
> +	if (ret > (int)offsetofend(struct fscrypt_nokey_name, sha256)) {
> +		ret = -EINVAL;
> +		goto errout;
> +	}

This should be -ENOENT rather than -EINVAL, since we should report that the file
does not exist.

> +
> +	{
> +		struct fscrypt_nokey_name *n =
> +				(void *)fname->crypto_buf.name;

'n' (which maybe should be called 'nokey_name'?) could be declared at the
beginning of the function so that it doesn't have to go into this separate
block.

> +		fname->crypto_buf.len = ret;
> +
> +		fname->hash = n->dirtree_hash[0];
> +		fname->minor_hash = n->dirtree_hash[1];

Need to validate that these fields are actually included in the buffer that was
decoded, since it could be shorter.

> +/**
> + * fscrypt_match_name() - test whether the given name matches a directory entry
> + * @fname: the name being searched for
> + * @de_name: the name from the directory entry
> + * @de_name_len: the length of @de_name in bytes
> + *
> + * Normally @fname->disk_name will be set, and in that case we simply compare
> + * that to the name stored in the directory entry.  The only exception is that
> + * if we don't have the key for an encrypted directory and a filename in it is
> + * very long, then we won't have the full disk_name and we'll instead need to
> + * match against the fscrypt_digested_name.

Comment needs to be updated.

> +		bool check_sha256 = false;
> +		u8 sha256[SHA256_DIGEST_SIZE];
> +
> +		if (fname->crypto_buf.len ==
> +			    offsetofend(struct fscrypt_nokey_name, sha256)) {
> +			len = FSCRYPT_FNAME_UNDIGESTED_SIZE;
> +			check_sha256 = true;
> +		} else {
> +			len = fname->crypto_buf.len -
> +				offsetof(struct fscrypt_nokey_name, bytes);
> +		}
> +		if (!check_sha256 && de_name_len != len)
> +			return false;
> +		if (check_sha256 && de_name_len <= len)
> +			return false;
> +		if (memcmp(de_name, n->bytes, len) != 0)
> +			return false;
> +		if (check_sha256) {
> +			fscrypt_do_sha256(sha256,
> +				&de_name[FSCRYPT_FNAME_UNDIGESTED_SIZE],
> +				de_name_len - FSCRYPT_FNAME_UNDIGESTED_SIZE);
> +			if (memcmp(sha256, n->sha256, sizeof(sha256)) != 0)
> +				return false;
> +		}

It's hard to understand what's going on here.  It would be easier to read if the
SHA-256 and non-SHA-256 cases were cleanly separated, e.g.:

		if (fname->crypto_buf.len ==
			    offsetofend(struct fscrypt_nokey_name, sha256)) {
			u8 sha256[SHA256_DIGEST_SIZE];

			if (de_name_len <= FSCRYPT_FNAME_UNDIGESTED_SIZE)
				return false;
			if (memcmp(de_name, n->bytes,
				   FSCRYPT_FNAME_UNDIGESTED_SIZE) != 0)
				return false;
			fscrypt_do_sha256(sha256,
				&de_name[FSCRYPT_FNAME_UNDIGESTED_SIZE],
				de_name_len - FSCRYPT_FNAME_UNDIGESTED_SIZE);
			if (memcmp(sha256, n->sha256, sizeof(sha256)) != 0)
				return false;
		} else {
			u32 len = fname->crypto_buf.len -
				offsetof(struct fscrypt_nokey_name, bytes);

			if (de_name_len != len)
				return false;
			if (memcmp(de_name, n->bytes, len) != 0)
				return false;
		}

Also, I'm not sure it's a good idea to have so many hardcoded references to
SHA-256, since that this algorithm could be changed later.  It might be good to
use something more generic like 'strong_hash'.

> +
> +		return true;
> +	}
> +
> +	if (de_name_len != fname->disk_name.len)
> +		return false;
> +	return !memcmp(de_name, fname->disk_name.name, fname->disk_name.len);
> +}
> +EXPORT_SYMBOL(fscrypt_match_name);
> +
>  /**
>   * fscrypt_fname_siphash() - Calculate the siphash for a file name
>   * @dir: the parent directory
> diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h
> index 028aed925e51..ddb7245ba92b 100644
> --- a/include/linux/fscrypt.h
> +++ b/include/linux/fscrypt.h
> @@ -16,6 +16,7 @@
>  #include <linux/fs.h>
>  #include <linux/mm.h>
>  #include <linux/slab.h>
> +#include <crypto/sha.h>
>  #include <uapi/linux/fscrypt.h>
>  
>  #define FS_CRYPTO_BLOCK_SIZE		16
> @@ -160,79 +161,34 @@ extern int fscrypt_fname_disk_to_usr(struct inode *, u32, u32,
>  extern u64 fscrypt_fname_siphash(const struct inode *dir,
>  					const struct qstr *name);
>  
> -#define FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE	32
> -
> -/* Extracts the second-to-last ciphertext block; see explanation below */
> -#define FSCRYPT_FNAME_DIGEST(name, len)	\
> -	((name) + round_down((len) - FS_CRYPTO_BLOCK_SIZE - 1, \
> -			     FS_CRYPTO_BLOCK_SIZE))
> -
> -#define FSCRYPT_FNAME_DIGEST_SIZE	FS_CRYPTO_BLOCK_SIZE
> -
>  /**
> - * fscrypt_digested_name - alternate identifier for an on-disk filename
> + * fscrypt_nokey_name - identifier for on-disk filenames when key is not present

Now that fscrypt_match_name() is no longer an inline function, the definition of
struct fscrypt_nokey_name should be moved to fs/crypto/fname.c, since
filesystems don't need it directly anymore.

>   *
> - * When userspace lists an encrypted directory without access to the key,
> - * filenames whose ciphertext is longer than FSCRYPT_FNAME_MAX_UNDIGESTED_SIZE
> - * bytes are shown in this abbreviated form (base64-encoded) rather than as the
> - * full ciphertext (base64-encoded).  This is necessary to allow supporting
> - * filenames up to NAME_MAX bytes, since base64 encoding expands the length.
> + * When userspace lists an encrypted directory without access to the key, we
> + * must present them with a unique identifier for the file. base64 encoding will
> + * expand the space, so we use this format to avoid most collisions.
>   *
> - * To make it possible for filesystems to still find the correct directory entry
> - * despite not knowing the full on-disk name, we encode any filesystem-specific
> - * 'hash' and/or 'minor_hash' which the filesystem may need for its lookups,
> - * followed by the second-to-last ciphertext block of the filename.  Due to the
> - * use of the CBC-CTS encryption mode, the second-to-last ciphertext block
> - * depends on the full plaintext.  (Note that ciphertext stealing causes the
> - * last two blocks to appear "flipped".)  This makes accidental collisions very
> - * unlikely: just a 1 in 2^128 chance for two filenames to collide even if they
> - * share the same filesystem-specific hashes.
> - *
> - * However, this scheme isn't immune to intentional collisions, which can be
> - * created by anyone able to create arbitrary plaintext filenames and view them
> - * without the key.  Making the "digest" be a real cryptographic hash like
> - * SHA-256 over the full ciphertext would prevent this, although it would be
> - * less efficient and harder to implement, especially since the filesystem would
> - * need to calculate it for each directory entry examined during a search.
> + * Filesystems may rely on the hash being present to look up a file on disk.
> + * For filenames that are both casefolded and encrypted, it is not possible to
> + * calculate the hash without the key. Additionally, if the ciphertext is longer
> + * than what we can base64 encode, we cannot generate the hash from the partial
> + * name. For simplicity, we always store the hash at the front of the name,
> + * followed by the first 149 bytes of the ciphertext, and then the sha256 of the
> + * remainder of the name if the ciphertext was longer than 149 bytes. For the
> + * usual case of relatively short filenames, this allows us to avoid needing to
> + * compute the sha256. This results in an encoded name that is at most 252 bytes
> + * long.
>   */
> -struct fscrypt_digested_name {
> -	u32 hash;
> -	u32 minor_hash;
> -	u8 digest[FSCRYPT_FNAME_DIGEST_SIZE];
> -};
[...]

> +#define FSCRYPT_FNAME_UNDIGESTED_SIZE 149

I assume that FSCRYPT_FNAME_UNDIGESTED_SIZE is 149 in particular because it
makes the base64 encoding of struct fscrypt_nokey_name fit in NAME_MAX.  Can you
please add a BUILD_BUG_ON() which verifies this assumption at build time?

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] vfs: Fold casefolding into vfs
  2019-12-03 21:21         ` Gabriel Krisman Bertazi
@ 2019-12-04  0:32           ` Eric Biggers
  0 siblings, 0 replies; 21+ messages in thread
From: Eric Biggers @ 2019-12-04  0:32 UTC (permalink / raw)
  To: Gabriel Krisman Bertazi
  Cc: Gao Xiang, Daniel Rosenberg, Theodore Ts'o, linux-ext4,
	Jaegeuk Kim, Chao Yu, linux-f2fs-devel, linux-fscrypt,
	Alexander Viro, Andreas Dilger, Jonathan Corbet, linux-doc,
	linux-kernel, linux-fsdevel, kernel-team

On Tue, Dec 03, 2019 at 04:21:02PM -0500, Gabriel Krisman Bertazi wrote:
> Eric Biggers <ebiggers@kernel.org> writes:
> 
> > On Tue, Dec 03, 2019 at 02:42:10PM -0500, Gabriel Krisman Bertazi wrote:
> >> Gao Xiang <gaoxiang25@huawei.com> writes:
> 
> >> I think Daniel's approach of moving this into VFS is the simplest way to
> >> actually solve the issue, instead of extending and duplicating a lot of
> >> functionality into filesystem hooks to support the possible mixes of
> >> case-insensitive, overlayfs and fscrypt.
> >> 
> >
> > I think we can actually get everything we want using dentry_operations only,
> > since the filesystem can set ->d_op during ->lookup() (like what is done for
> > encrypted filenames now) rather than at dentry allocation time.  And fs/crypto/
> > can export fscrypt_d_revalidate() rather than setting ->d_op itself.
> 
> Problem is, differently from fscrypt, case-insensitive uses the d_hash()
> hook and for a lookup, we actually use
> dentry->d_parent->d_ops->d_hash().  Which works well, until you are flipping the
> casefold flag.  Then the dentry already exists and you need to modify
> the d_ops on the fly, which I couldn't find precedent anywhere.  I tried
> invalidating the dentry whenever we flip the flag, but then if it has
> negative dentries as children,I wasn't able to reliably invalidate it,
> and that's when I reached the limit of my knowledge in VFS.  In
> particular, in every attempt I made to implement it like this, I was
> able to race and do a case-insensitive lookup on a directory that was
> just made case sensitive.
> 
> I'm not saying there isn't a way.  But it is a bit harder than this
> proposal. I tried it already and still didn't manage to make it work.
> Maybe someone who better understands vfs.

Yes you're right, I forgot that for ->d_hash() and ->d_compare() it's actually
the parent's directory dentry_operations that are used.

> 
> > It's definitely ugly to have to handle the 3 cases of encrypt, casefold, and
> > encrypt+casefold separately -- and this will need to be duplicated for each
> > filesystem.  But we do have to weigh that against adding additional complexity
> > and overhead to the VFS for everyone.  If we do go with the VFS changes, please
> > try to make them as simple and unobtrusive as possible.
> 
> Well, it is just not case-insensitive+fscrypt. Also overlayfs
> there. Probably more.  So we have much more cases.  I understand the VFS
> changes need to be very well thought, but when I worked on this it
> started to look a more correct solution than using the hooks.

Well the point of my proof-of-concept patch having separate ext4_ci_dentry_ops,
ext4_encrypted_dentry_ops, and ext4_encrypted_ci_dentry_ops is supposed to be
for overlayfs support -- since overlayfs requires that some operations are not
present.  If we didn't need overlayfs support, we could just use a single
ext4_dentry_ops for all dentries instead.

I think we could still support fscrypt, casefold, fscrypt+casefold, and
fscrypt+overlayfs with dentry_operations only.  It's casefold+overlayfs that's
the biggest problem, due to the possibility of the casefold flag being set on a
directory later as you pointed out.

- Eric

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 5/8] f2fs: Handle casefolding with Encryption
  2019-12-03  5:10 ` [PATCH 5/8] f2fs: Handle casefolding with Encryption Daniel Rosenberg
@ 2019-12-05  1:17   ` kbuild test robot
  0 siblings, 0 replies; 21+ messages in thread
From: kbuild test robot @ 2019-12-05  1:17 UTC (permalink / raw)
  To: Daniel Rosenberg
  Cc: kbuild-all, Theodore Ts'o, linux-ext4, Jaegeuk Kim, Chao Yu,
	linux-f2fs-devel, Eric Biggers, linux-fscrypt, Alexander Viro,
	Andreas Dilger, Jonathan Corbet, linux-doc, linux-kernel,
	linux-fsdevel, Gabriel Krisman Bertazi, kernel-team,
	Daniel Rosenberg

Hi Daniel,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on next-20191202 next-20191204]
[cannot apply to ext4/dev f2fs/dev-test v5.4 v5.4-rc8 v5.4-rc7 v5.4]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Daniel-Rosenberg/Support-for-Casefolding-and-Encryption/20191203-131410
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 76bb8b05960c3d1668e6bee7624ed886cbd135ba
reproduce:
        # apt-get install sparse
        # sparse version: v0.6.1-91-g817270f-dirty
        make ARCH=x86_64 allmodconfig
        make C=1 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>


sparse warnings: (new ones prefixed by >>)

>> fs/f2fs/dir.c:205:13: sparse: sparse: incorrect type in assignment (different base types) @@    expected int len @@    got restricted __le16 [usertyint len @@
>> fs/f2fs/dir.c:205:13: sparse:    expected int len
   fs/f2fs/dir.c:205:13: sparse:    got restricted __le16 [usertype] name_len
--
>> fs/f2fs/hash.c:90:27: sparse: sparse: incorrect type in assignment (different base types) @@    expected restricted __le32 [usertype] f2fs_hash @@    got __le32 [usertype] f2fs_hash @@
>> fs/f2fs/hash.c:90:27: sparse:    expected restricted __le32 [usertype] f2fs_hash
>> fs/f2fs/hash.c:90:27: sparse:    got unsigned long long
   fs/f2fs/hash.c:133:24: sparse: sparse: incorrect type in return expression (different base types) @@    expected restricted __le32 @@    got e32 @@
   fs/f2fs/hash.c:133:24: sparse:    expected restricted __le32
   fs/f2fs/hash.c:133:24: sparse:    got int
   fs/f2fs/hash.c:141:11: sparse: sparse: incorrect type in assignment (different base types) @@    expected int r @@    got restricted __int r @@
   fs/f2fs/hash.c:141:11: sparse:    expected int r
   fs/f2fs/hash.c:141:11: sparse:    got restricted __le32
   fs/f2fs/hash.c:144:16: sparse: sparse: incorrect type in return expression (different base types) @@    expected restricted __le32 @@    got le32 @@
   fs/f2fs/hash.c:144:16: sparse:    expected restricted __le32
   fs/f2fs/hash.c:144:16: sparse:    got int r

vim +205 fs/f2fs/dir.c

   199	
   200		if (de->hash_code != namehash)
   201			return false;
   202	
   203	#ifdef CONFIG_UNICODE
   204		name = d->filename[bit_pos];
 > 205		len = de->name_len;
   206	
   207		if (sb->s_encoding && needs_casefold(parent)) {
   208			if (cf_str->name) {
   209				struct qstr cf = {.name = cf_str->name,
   210						  .len = cf_str->len};
   211				return !f2fs_ci_compare(parent, &cf, name, len, true);
   212			}
   213			return !f2fs_ci_compare(parent, fname->usr_fname, name, len,
   214						false);
   215		}
   216	#endif
   217		if (fscrypt_match_name(fname, d->filename[bit_pos],
   218					le16_to_cpu(de->name_len)))
   219			return true;
   220		return false;
   221	}
   222	

---
0-DAY kernel test infrastructure                 Open Source Technology Center
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org Intel Corporation

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/8] vfs: Fold casefolding into vfs
  2019-12-03  5:10 ` [PATCH 4/8] vfs: Fold casefolding into vfs Daniel Rosenberg
  2019-12-03  7:41   ` Gao Xiang
  2019-12-03 19:31   ` Gabriel Krisman Bertazi
@ 2020-01-03 20:26   ` Theodore Y. Ts'o
  2 siblings, 0 replies; 21+ messages in thread
From: Theodore Y. Ts'o @ 2020-01-03 20:26 UTC (permalink / raw)
  To: Daniel Rosenberg
  Cc: linux-ext4, Jaegeuk Kim, Chao Yu, linux-f2fs-devel, Eric Biggers,
	linux-fscrypt, Alexander Viro, Andreas Dilger, Jonathan Corbet,
	linux-doc, linux-kernel, linux-fsdevel, Gabriel Krisman Bertazi,
	kernel-team

On Mon, Dec 02, 2019 at 09:10:45PM -0800, Daniel Rosenberg wrote:
> @@ -228,6 +229,13 @@ static inline int dentry_string_cmp(const unsigned char *cs, const unsigned char
>  
>  #endif
>  
> +bool needs_casefold(const struct inode *dir)
> +{
> +	return IS_CASEFOLDED(dir) &&
> +			(!IS_ENCRYPTED(dir) || fscrypt_has_encryption_key(dir));
> +}
> +EXPORT_SYMBOL(needs_casefold);
> +

I'd suggest adding a check to make sure that dir->i_sb->s_encoding is
non-NULL before needs_casefold() returns non-NULL.  Otherwise a bug
(or a fuzzed file system) which manages to set the S_CASEFOLD flag without having s_encoding be initialized might cause a NULL dereference.

Also, maybe make needs_casefold() an inline function which returns 0
if CONFIG_UNICODE is not defined?  That way the need for #ifdef
CONFIG_UNICODE could be reduced.

> @@ -247,7 +255,19 @@ static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *c
>  	 * be no NUL in the ct/tcount data)
>  	 */
>  	const unsigned char *cs = READ_ONCE(dentry->d_name.name);
> +#ifdef CONFIG_UNICODE
> +	struct inode *parent = dentry->d_parent->d_inode;
>  
> +	if (unlikely(needs_casefold(parent))) {
> +		const struct qstr n1 = QSTR_INIT(cs, tcount);
> +		const struct qstr n2 = QSTR_INIT(ct, tcount);
> +		int result = utf8_strncasecmp(dentry->d_sb->s_encoding,
> +						&n1, &n2);
> +
> +		if (result >= 0 || sb_has_enc_strict_mode(dentry->d_sb))
> +			return result;
> +	}
> +#endif

This is an example of how we could drop the #ifdef CONFIG_UNICODE
(moving the declaration of 'parent' into the #if statement) if
needs_casefold() always returns 0 if !defined(CONFIG_UNICODE).

> @@ -2404,7 +2424,22 @@ struct dentry *d_hash_and_lookup(struct dentry *dir, struct qstr *name)
>  	 * calculate the standard hash first, as the d_op->d_hash()
>  	 * routine may choose to leave the hash value unchanged.
>  	 */
> +#ifdef CONFIG_UNICODE
> +	unsigned char *hname = NULL;
> +	int hlen = name->len;
> +
> +	if (IS_CASEFOLDED(dir->d_inode)) {
> +		hname = kmalloc(PATH_MAX, GFP_ATOMIC);
> +		if (!hname)
> +			return ERR_PTR(-ENOMEM);
> +		hlen = utf8_casefold(dir->d_sb->s_encoding,
> +					name, hname, PATH_MAX);
> +	}
> +	name->hash = full_name_hash(dir, hname ?: name->name, hlen);
> +	kfree(hname);
> +#else
>  	name->hash = full_name_hash(dir, name->name, name->len);
> +#endif

Perhaps this could be refactored out?  It's also used in
link_path_walk() and lookup_one_len_common().

(Note, there was some strageness in lookup_one_len_common(), where
hname is freed twice, the first time using kvfree() which I don't
believe is needed.  But this can be fixed as part of the refactoring.)

	   	    	     	    	  - Ted

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-01-03 20:26 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-03  5:10 [PATCH 0/8] Support for Casefolding and Encryption Daniel Rosenberg
2019-12-03  5:10 ` [PATCH 1/8] fscrypt: Add siphash and hash key for policy v2 Daniel Rosenberg
2019-12-03 23:25   ` Eric Biggers
2019-12-03  5:10 ` [PATCH 2/8] fscrypt: Don't allow v1 policies with casefolding Daniel Rosenberg
2019-12-03 23:37   ` Eric Biggers
2019-12-03  5:10 ` [PATCH 3/8] fscrypt: Change format of no-key token Daniel Rosenberg
2019-12-04  0:09   ` Eric Biggers
2019-12-03  5:10 ` [PATCH 4/8] vfs: Fold casefolding into vfs Daniel Rosenberg
2019-12-03  7:41   ` Gao Xiang
2019-12-03 19:42     ` Gabriel Krisman Bertazi
2019-12-03 20:34       ` Eric Biggers
2019-12-03 21:21         ` Gabriel Krisman Bertazi
2019-12-04  0:32           ` Eric Biggers
2019-12-03 19:31   ` Gabriel Krisman Bertazi
2020-01-03 20:26   ` Theodore Y. Ts'o
2019-12-03  5:10 ` [PATCH 5/8] f2fs: Handle casefolding with Encryption Daniel Rosenberg
2019-12-05  1:17   ` kbuild test robot
2019-12-03  5:10 ` [PATCH 6/8] ext4: Use struct super_block's casefold data Daniel Rosenberg
2019-12-03 19:44   ` Gabriel Krisman Bertazi
2019-12-03  5:10 ` [PATCH 7/8] ext4: Hande casefolding with encryption Daniel Rosenberg
2019-12-03  5:10 ` [PATCH 8/8] ext4: Optimize match for casefolded encrypted dirs Daniel Rosenberg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).