On Sep 4, 2019, at 9:55 AM, Eric Biggers wrote:
> On Fri, Aug 23, 2019 at 09:23:39AM -0700, Eric Biggers wrote:
>> From: Eric Biggers
>>
>> By design, the kernel enforces that all files in an encrypted directory
>> use the same encryption policy as the directory.  It's not possible to
>> violate this constraint using syscalls.  Lookups of files that violate
>> this constraint also fail, in case the disk was manipulated.
>>
>> But this constraint can also be violated by accidental filesystem
>> corruption.  E.g., a power cut when using ext4 without a journal might
>> leave new files without the encryption bit and/or xattr.  Thus, it's
>> important that e2fsck correct this condition.
>>
>> Therefore, this patch makes the following changes to e2fsck:
>>
>> - During pass 1 (inode table scan), create a map from inode number to
>>   encryption policy for all encrypted inodes.  But it's optimized so
>>   that the full xattrs aren't saved but rather only 32-bit "policy IDs",
>>   since usually many inodes share the same encryption policy.  Also, if
>>   an encryption xattr is missing, offer to clear the encrypt flag.  If
>>   an encryption xattr is clearly corrupt, offer to clear the inode.
>>
>> - During pass 2 (directory structure check), use the map to verify that
>>   all regular files, directories, and symlinks in encrypted directories
>>   use the directory's encryption policy.  Offer to clear any directory
>>   entries for which this isn't the case.
>>
>> Add a new test "f_bad_encryption" to test the new behavior.
>>
>> Due to the new checks, it was also necessary to update the existing test
>> "f_short_encrypted_dirent" to add an encryption xattr to the test file,
>> since it was missing one before, which is now considered invalid.
>>
>
> Any comments on this patch?

I didn't see the original email for this patch, but I found it on Patchwork.

One change is needed if the filesystem is very large and has a lot of
encrypted files.  While your typical use case is going to be Android
handsets, on the flip side I'm often dealing with filesystems with a few
billion files in them, and there are definitely users that want to use
the on-disk encryption.

>> +	/* Append the (ino, policy_id) pair to the list. */
>> +	if (info->files_count == info->files_capacity) {
>> +		size_t new_capacity = info->files_capacity * 2;
>> +
>> +		if (new_capacity < 128)
>> +			new_capacity = 128;
>> +
>> +		if (ext2fs_resize_mem(info->files_capacity * sizeof(*file),
>> +				      new_capacity * sizeof(*file),
>> +				      &info->files) != 0)

If the number of files in the array gets very large, then doubling the
array size at the end may consume a *lot* of memory.  It would be somewhat
better to cap new_capacity at the number of inodes in the filesystem, and
better yet to scale the array size by a fraction of the total number of
inodes that have already been processed, but this array might still be
several GB of RAM.

What about using run-length encoding for this?  It is unlikely that many
different encryption policies are in use on a filesystem, and inodes tend
to be allocated in groups by users, so it is likely that you will get
large runs of inodes with the same policy_id, and this could save
considerable space.

Cheers, Andreas
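
P.S. To make the run-length idea a bit more concrete, here is a very rough,
untested sketch of what such a list could look like.  All of the struct,
type, and function names below are made up for illustration; nothing here
is from the patch, and plain realloc() stands in for ext2fs_resize_mem():

#include <stdint.h>
#include <stdlib.h>

/* Stand-in for ext2_ino_t; the policy ID is the 32-bit ID from the patch. */
typedef uint32_t ino_num_t;

struct policy_run {
	ino_num_t first_ino;	/* first inode in the run */
	ino_num_t count;	/* number of consecutive inodes in the run */
	uint32_t  policy_id;	/* policy shared by every inode in the run */
};

struct policy_run_list {
	struct policy_run *runs;
	size_t count;
	size_t capacity;
};

/*
 * Record that 'ino' uses 'policy_id'.  Pass 1 scans the inode table in
 * increasing inode order, so if the new inode is contiguous with the last
 * run and has the same policy, just extend that run instead of appending
 * a new entry.
 */
static int record_encrypted_inode(struct policy_run_list *list,
				  ino_num_t ino, uint32_t policy_id)
{
	struct policy_run *last, *new_runs;
	size_t new_capacity;

	if (list->count) {
		last = &list->runs[list->count - 1];
		if (last->policy_id == policy_id &&
		    ino == last->first_ino + last->count) {
			last->count++;
			return 0;
		}
	}
	if (list->count == list->capacity) {
		new_capacity = list->capacity ? list->capacity * 2 : 128;
		new_runs = realloc(list->runs,
				   new_capacity * sizeof(*new_runs));
		if (!new_runs)
			return -1;
		list->runs = new_runs;
		list->capacity = new_capacity;
	}
	last = &list->runs[list->count++];
	last->first_ino = ino;
	last->count = 1;
	last->policy_id = policy_id;
	return 0;
}

Pass 2 could then binary-search the runs by first_ino to find an inode's
policy, and memory use would scale with the number of runs rather than the
number of encrypted inodes.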