* [GIT PULL] overlayfs update for 4.18
From: Miklos Szeredi @ 2018-06-08 12:13 UTC
To: Linus Torvalds; +Cc: linux-kernel, linux-fsdevel, linux-unionfs, viro, hch

[-- Attachment #1: Type: text/plain, Size: 6239 bytes --]

Hi Linus,

Please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git tags/ovl-update-4.18

This contains two new features:

 1) Stack file operations: this allows removal of several hacks from the
    VFS, proper interaction of read-only open files with copy-up,
    possibility to implement fs modifying ioctls properly, and others.

 2) Metadata only copy-up: when file is on lower layer and only metadata is
    modified (except size) then only copy up the metadata and continue to
    use the data from the lower file.

The series starts with a cleanup of the internal dedupe API.  There's some
late discussion on details (should vfs limit the size of a dedupe request,
and if yes, how much).  I've ignored it for this pull request, it can
easily be fixed later.

Other pain point: overlay doesn't want to double account open files (due to
stacking) for fear of breaking existing setups.  So added infrastructure
that allows to skip accounting an open file in nr_files.  I don't much like
this, but can't see any other way of keeping backward compatibility.

There are two conflicts when merging, attaching my resolution.

Thanks,
Miklos

---
Miklos Szeredi (37):
      vfs: dedupe: return loff_t
      vfs: dedupe: rationalize args
      vfs: dedupe: extract helper for a single dedup
      vfs: add path_open()
      vfs: optionally don't account file in nr_files
      vfs: export vfs_ioctl() to modules
      vfs: export vfs_dedupe_file_range_one() to modules
      ovl: copy up times
      ovl: copy up inode flags
      Revert "Revert "ovl: get_write_access() in truncate""
      ovl: copy up file size as well
      ovl: deal with overlay files in ovl_d_real()
      ovl: stack file ops
      ovl: add helper to return real file
      ovl: add ovl_read_iter()
      ovl: add ovl_write_iter()
      ovl: add ovl_fsync()
      ovl: add ovl_mmap()
      ovl: add ovl_fallocate()
      ovl: add lsattr/chattr support
      ovl: add ovl_fiemap()
      ovl: add O_DIRECT support
      ovl: add reflink/copyfile/dedup support
      vfs: don't open real
      ovl: obsolete "check_copy_up" module option
      ovl: fix documentation of non-standard behavior
      vfs: simplify dentry_open()
      Revert "ovl: fix may_write_real() for overlayfs directories"
      Revert "ovl: don't allow writing ioctl on lower layer"
      vfs: fix freeze protection in mnt_want_write_file() for overlayfs
      Revert "ovl: fix relatime for directories"
      Revert "vfs: update ovl inode before relatime check"
      Revert "vfs: add flags to d_real()"
      Revert "vfs: do get_write_access() on upper layer of overlayfs"
      Partially revert "locks: fix file locking on overlayfs"
      Revert "fsnotify: support overlayfs"
      vfs: remove open_flags from d_real()

Vivek Goyal (28):
      ovl: Initialize ovl_inode->redirect in ovl_get_inode()
      ovl: Move the copy up helpers to copy_up.c
      ovl: Provide a mount option metacopy=on/off for metadata copyup
      ovl: During copy up, first copy up metadata and then data
      ovl: Copy up only metadata during copy up where it makes sense
      ovl: Add helper ovl_already_copied_up()
      ovl: A new xattr OVL_XATTR_METACOPY for file on upper
      ovl: Use out_err instead of out_nomem
      ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
      ovl: Copy up meta inode data from lowest data inode
      ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
      ovl: Fix ovl_getattr() to get number of blocks from lower
      ovl: Store lower data inode in ovl_inode
      ovl: Add helper ovl_inode_realdata()
      ovl: Open file with data except for the case of fsync
      ovl: Do not expose metacopy only dentry from d_real()
      ovl: Move some dir related ovl_lookup_single() code in else block
      ovl: Check redirects for metacopy files
      ovl: Treat metacopy dentries as type OVL_PATH_MERGE
      ovl: Add an inode flag OVL_CONST_INO
      ovl: Do not set dentry type ORIGIN for broken hardlinks
      ovl: Set redirect on metacopy files upon rename
      ovl: Set redirect on upper inode when it is linked
      ovl: Check redirect on index as well
      ovl: add helper to force data copy-up
      ovl: Do not do metadata only copy-up for truncate operation
      ovl: Do not do metacopy only for ioctl modifying file attr
      ovl: Enable metadata only feature

---
 Documentation/filesystems/Locking       |   3 +-
 Documentation/filesystems/overlayfs.txt |  90 ++++--
 Documentation/filesystems/vfs.txt       |  16 +-
 fs/btrfs/ctree.h                        |   5 +-
 fs/btrfs/ioctl.c                        |   7 +-
 fs/file_table.c                         |  13 +-
 fs/inode.c                              |  46 +--
 fs/internal.h                           |  17 +-
 fs/ioctl.c                              |   1 +
 fs/locks.c                              |  20 +-
 fs/namei.c                              |   2 +-
 fs/namespace.c                          |  69 +----
 fs/ocfs2/file.c                         |  10 +-
 fs/open.c                               |  87 +++---
 fs/overlayfs/Kconfig                    |  19 ++
 fs/overlayfs/Makefile                   |   4 +-
 fs/overlayfs/copy_up.c                  | 190 ++++++++----
 fs/overlayfs/dir.c                      | 105 +++++--
 fs/overlayfs/export.c                   |   3 +
 fs/overlayfs/file.c                     | 508 ++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c                    | 175 +++++++----
 fs/overlayfs/namei.c                    | 195 +++++++-----
 fs/overlayfs/overlayfs.h                |  47 ++-
 fs/overlayfs/ovl_entry.h                |   6 +-
 fs/overlayfs/super.c                    | 103 ++++---
 fs/overlayfs/util.c                     | 252 +++++++++++++++-
 fs/read_write.c                         |  91 +++---
 fs/xattr.c                              |   9 +-
 fs/xfs/xfs_file.c                       |   8 +-
 include/linux/dcache.h                  |  15 +-
 include/linux/fs.h                      |  31 +-
 include/linux/fsnotify.h                |  14 +-
 include/uapi/linux/fs.h                 |   1 -
 33 files changed, 1590 insertions(+), 572 deletions(-)
 create mode 100644 fs/overlayfs/file.c

[-- Attachment #2: ov-merge --] [-- Type:
text/plain, Size: 2243 bytes --] diff --cc fs/btrfs/ioctl.c index d29992f7dc63,70eac76804df..000000000000 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@@ -3596,14 -3192,20 +3596,15 @@@ out_free return ret; } - ssize_t btrfs_dedupe_file_range(struct file *src_file, u64 loff, u64 olen, - struct file *dst_file, u64 dst_loff) -#define BTRFS_MAX_DEDUPE_LEN SZ_16M - + loff_t btrfs_dedupe_file_range(struct file *src_file, loff_t loff, + struct file *dst_file, loff_t dst_loff, + loff_t olen) { struct inode *src = file_inode(src_file); struct inode *dst = file_inode(dst_file); u64 bs = BTRFS_I(src)->root->fs_info->sb->s_blocksize; - ssize_t res; + int res; - if (olen > BTRFS_MAX_DEDUPE_LEN) - olen = BTRFS_MAX_DEDUPE_LEN; - if (WARN_ON_ONCE(bs < PAGE_SIZE)) { /* * Btrfs does not support blocksize < page_size. As a diff --cc fs/read_write.c index e83bd9744b5d,1ff18ea56584..000000000000 --- a/fs/read_write.c +++ b/fs/read_write.c @@@ -2021,46 -2055,21 +2055,21 @@@ int vfs_dedupe_file_range(struct file * if (info->reserved) { info->status = -EINVAL; - } else if (!(is_admin || (dst_file->f_mode & FMODE_WRITE))) { - info->status = -EINVAL; - } else if (file->f_path.mnt != dst_file->f_path.mnt) { - info->status = -EXDEV; - } else if (S_ISDIR(dst->i_mode)) { - info->status = -EISDIR; - } else if (dst_file->f_op->dedupe_file_range == NULL) { - info->status = -EINVAL; - } else { - deduped = dst_file->f_op->dedupe_file_range(file, off, - len, dst_file, - info->dest_offset); - if (deduped == -EBADE) - info->status = FILE_DEDUPE_RANGE_DIFFERS; - else if (deduped < 0) - info->status = deduped; - else - info->bytes_deduped += deduped; - goto next_loop; ++ goto next_fdput; } - next_file: - mnt_drop_write_file(dst_file); + deduped = vfs_dedupe_file_range_one(file, off, dst_file, + info->dest_offset, len); + if (deduped == -EBADE) + info->status = FILE_DEDUPE_RANGE_DIFFERS; + else if (deduped < 0) + info->status = deduped; + else + info->bytes_deduped += deduped; + -next_loop: +next_fdput: 
fdput(dst_fd); - +next_loop: if (fatal_signal_pending(current)) goto out; } ^ permalink raw reply [flat|nested] 17+ messages in thread
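
The per-destination result handling in the fs/read_write.c resolution above can be modelled in userspace roughly as follows. This is only a sketch of the three branches visible in the diff, not kernel code: `record_dedupe_result` and the trimmed `struct dedupe_info` are hypothetical names, and `FILE_DEDUPE_RANGE_DIFFERS` is redefined locally with the value used by the Linux UAPI header.

```c
#include <errno.h>

/* Mirrors the Linux UAPI constant; redefined so the sketch stands alone. */
#define FILE_DEDUPE_RANGE_DIFFERS 1

/* Hypothetical, trimmed-down stand-in for struct file_dedupe_range_info. */
struct dedupe_info {
	unsigned long long bytes_deduped;
	int status;
};

/*
 * Hypothetical helper: fold the return value of a single-range dedupe
 * into the per-destination info, following the branches in the diff.
 */
static void record_dedupe_result(struct dedupe_info *info, long long deduped)
{
	if (deduped == -EBADE)
		info->status = FILE_DEDUPE_RANGE_DIFFERS;	/* contents differ */
	else if (deduped < 0)
		info->status = (int)deduped;			/* plain error */
	else
		info->bytes_deduped += (unsigned long long)deduped;
}
```

The point of the split is that "ranges differ" is reported as a positive status rather than as an errno, so callers can tell it apart from a real failure.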
* Re: [GIT PULL] overlayfs update for 4.18
From: Christoph Hellwig @ 2018-06-09 6:52 UTC
To: Miklos Szeredi
Cc: Linus Torvalds, linux-kernel, linux-fsdevel, linux-unionfs, viro, hch

On Fri, Jun 08, 2018 at 02:13:30PM +0200, Miklos Szeredi wrote:
> Hi Linus,
>
> Please pull from:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git tags/ovl-update-4.18
>
> This contains two new features:
>
>  1) Stack file operations: this allows removal of several hacks from the
>     VFS, proper interaction of read-only open files with copy-up,
>     possibility to implement fs modifying ioctls properly, and others.

Which includes all kinds of NAKed or at least non-acked VFS changes.

Please get these through Al's tree after proper review first.

* Re: [GIT PULL] overlayfs update for 4.18
From: Linus Torvalds @ 2018-06-09 21:42 UTC
To: Christoph Hellwig
Cc: Miklos Szeredi, Linux Kernel Mailing List, linux-fsdevel, linux-unionfs, Al Viro

Hmm.

So I had held off on pulling this in the hope that it would get more comments.

I like most of the vfs-level stuff - it gets rid of some of the
hackery we had for overlayfs.

It does add some new hackery to replace it (like the file accounting), though.

And Christoph's commentary isn't really helping the situation.
Christoph, I haven't seen the NAK history, can you elaborate?

And Al, can you please take a look?

               Linus

On Fri, Jun 8, 2018 at 11:52 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> > This contains two new features:
> >
> >  1) Stack file operations: this allows removal of several hacks from the
> >     VFS, proper interaction of read-only open files with copy-up,
> >     possibility to implement fs modifying ioctls properly, and others.
>
> Which includes all kinds of NAKed or at least non-acked VFS changes.
>
> Please get these through Al's tree after proper review first.

* Re: [GIT PULL] overlayfs update for 4.18
From: Al Viro @ 2018-06-09 23:55 UTC
To: Linus Torvalds
Cc: Christoph Hellwig, Miklos Szeredi, Linux Kernel Mailing List, linux-fsdevel, linux-unionfs

On Sat, Jun 09, 2018 at 02:42:20PM -0700, Linus Torvalds wrote:
> Hmm.
>
> So I had held off on pulling this in the hope that it would get more comments.
>
> I like most of the vfs-level stuff - it gets rid of some of the
> hackery we had for overlayfs.
>
> It does add some new hackery to replace it (like the file accounting), though.
>
> And Christoph's commentary isn't really helping the situation.
> Christoph, I haven't seen the NAK history, can you elaborate?
>
> And Al, can you please take a look?

Will post tonight...

* Re: [GIT PULL] overlayfs update for 4.18
From: Christoph Hellwig @ 2018-06-11 6:09 UTC
To: Linus Torvalds
Cc: Christoph Hellwig, Miklos Szeredi, Linux Kernel Mailing List, linux-fsdevel, linux-unionfs, Al Viro

On Sat, Jun 09, 2018 at 02:42:20PM -0700, Linus Torvalds wrote:
> And Christoph's commentary isn't really helping the situation.
> Christoph, I haven't seen the NAK history, can you elaborate?

Most of the bits just need a bit of refinement I think, instead of being
forced through the overlayfs tree and are generally fine.

The pre_mmap hook I think is a clear no-go.  We've had this tried multiple
times and always rejected it.  Unlike previous uses the overlayfs use isn't
outright broken, but still questionable as it will still lead to a copyup
that "leaks" if the actual mmap wasn't successful.

The whole discussion of how mmap happens, how we deal with mmap_sem and
failures needs a broader discussion with all MM and VFS folks first.

* Re: [GIT PULL] overlayfs update for 4.18
From: Al Viro @ 2018-06-10 5:54 UTC
To: Christoph Hellwig
Cc: Miklos Szeredi, Linus Torvalds, linux-kernel, linux-fsdevel, linux-unionfs

On Fri, Jun 08, 2018 at 11:52:08PM -0700, Christoph Hellwig wrote:
> On Fri, Jun 08, 2018 at 02:13:30PM +0200, Miklos Szeredi wrote:
> > Hi Linus,
> >
> > Please pull from:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git tags/ovl-update-4.18
> >
> > This contains two new features:
> >
> >  1) Stack file operations: this allows removal of several hacks from the
> >     VFS, proper interaction of read-only open files with copy-up,
> >     possibility to implement fs modifying ioctls properly, and others.
>
> Which includes all kinds of NAKed or at least non-acked VFS changes.

Umm...  The worst of yours had been ->pre_mmap(), right?  He *did* drop that...

> Please get these through Al's tree after proper review first.

OK, summary of sort (see fsdevel thread for details):
	* path_open() is dubious; why not simply use vfsmount/dentry from the
right layer when opening an underlying file?  Then it would be vfs_open()...
	* ovl_mmap() is broken, plain and simple.  Failure ends up leaking
a layer struct file *and* doing double fput() on overlayfs one.
	* ovl_mmap() is also trivially DoSable - you can trigger tons and tons
of reopens, each sticking a new (writable layer) struct file into a vma.
We *do* want some scheme avoiding once-per-operations reopens in the
copied-up-after-r/o-open case.  See possible kinda-sorta solution on fsdevel;
I'm not sure I like it, though.

The rest is pretty minor.

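
The kind of reference imbalance described in the ovl_mmap() items above (one file leaked, another put twice on the failure path) can be illustrated with a toy userspace model. Everything here is hypothetical (`toy_file`, `toy_vma`, `toy_stacked_mmap`); it is not the overlayfs code and does not reproduce the exact bug, it only shows the bookkeeping rule that each path drops exactly the references it is responsible for.

```c
/* Hypothetical toy refcounted object standing in for struct file. */
struct toy_file { int refs; };

static void toy_get(struct toy_file *f) { f->refs++; }
static void toy_put(struct toy_file *f) { f->refs--; }

/* Hypothetical stand-in for a vma: it owns one reference on ->file. */
struct toy_vma { struct toy_file *file; };

/*
 * Swap vma->file from the stacked file to the real file, then "call" the
 * real mmap (simulated by mmap_ret).  Invariant assumed by this sketch:
 * on failure the vma owns exactly what it owned on entry; on success it
 * owns one reference on the real file and none on the stacked one.
 */
static int toy_stacked_mmap(struct toy_vma *vma, struct toy_file *real,
			    int mmap_ret)
{
	struct toy_file *stacked = vma->file;

	toy_get(real);
	vma->file = real;

	if (mmap_ret) {
		/* Failure: undo the swap, drop only the reference taken here. */
		vma->file = stacked;
		toy_put(real);
		return mmap_ret;
	}

	/* Success: the vma no longer holds the stacked file. */
	toy_put(stacked);
	return 0;
}
```

Forgetting the `toy_put(real)` on failure would model a leak, and additionally calling `toy_put(stacked)` there would model a double put once the caller drops its own reference.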
* Re: [GIT PULL] overlayfs update for 4.18
From: Christoph Hellwig @ 2018-06-11 6:10 UTC
To: Al Viro
Cc: Christoph Hellwig, Miklos Szeredi, Linus Torvalds, linux-kernel, linux-fsdevel, linux-unionfs

On Sun, Jun 10, 2018 at 06:54:38AM +0100, Al Viro wrote:
> Umm...  The worst of yours had been ->pre_mmap(), right?  He *did* drop that...

Oh, hadn't noticed that.  Still odd to change a huge pull request after
the end merge window.

* Re: [GIT PULL] overlayfs update for 4.18
From: Miklos Szeredi @ 2018-06-11 8:41 UTC
To: Al Viro
Cc: Christoph Hellwig, Linus Torvalds, linux-kernel, linux-fsdevel, overlayfs

On Sun, Jun 10, 2018 at 7:54 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, Jun 08, 2018 at 11:52:08PM -0700, Christoph Hellwig wrote:
>> On Fri, Jun 08, 2018 at 02:13:30PM +0200, Miklos Szeredi wrote:
>> > Hi Linus,
>> >
>> > Please pull from:
>> >
>> >   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git tags/ovl-update-4.18
>> >
>> > This contains two new features:
>> >
>> >  1) Stack file operations: this allows removal of several hacks from the
>> >     VFS, proper interaction of read-only open files with copy-up,
>> >     possibility to implement fs modifying ioctls properly, and others.
>>
>> Which includes all kinds of NAKed or at least non-acked VFS changes.
>
> Umm...  The worst of yours had been ->pre_mmap(), right?  He *did* drop that...
>
>> Please get these through Al's tree after proper review first.
>
> OK, summary of sort (see fsdevel thread for details):
>         * path_open() is dubious; why not simply use vfsmount/dentry from the
> right layer when opening an underlying file?  Then it would be vfs_open()...
>         * ovl_mmap() is broken, plain and simple.  Failure ends up leaking
> a layer struct file *and* doing double fput() on overlayfs one.
>         * ovl_mmap() is also trivially DoSable - you can trigger tons and tons
> of reopens, each sticking a new (writable layer) struct file into a vma.
> We *do* want some scheme avoiding once-per-operations reopens in the
> copied-up-after-r/o-open case.  See possible kinda-sorta solution on fsdevel;
> I'm not sure I like it, though.

Al, thanks for the review.

Posted incremental for the ovl_mmap() issues to -fsdevel.  I'm pretty
confident that that addresses your comments.

Linus, would you still pull if Al's satisfied with that resolution?
I can post the fixes (just few liners) after the merge window.  I'm
definitely not going to prepare another pull request this cycle if the
old one cannot be pulled.

Thanks,
Miklos

* Re: [GIT PULL] overlayfs update for 4.18
From: Christoph Hellwig @ 2018-06-11 16:27 UTC
To: Miklos Szeredi
Cc: Al Viro, Christoph Hellwig, Linus Torvalds, linux-kernel, linux-fsdevel, overlayfs

On Mon, Jun 11, 2018 at 10:41:29AM +0200, Miklos Szeredi wrote:
> Linus, would you still pull if Al's satisfied with that resolution?
> I can post the fixes (just few liners) after the merge window.

Please repost all changes outside overlayfs itself to -fsdevel so that
we can improve them for merging in the next merge window instead of
hasting them in now.

* [GIT PULL] overlayfs update for 4.18
From: Miklos Szeredi @ 2018-05-29 13:21 UTC
To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-unionfs

Hi Al,

I'm sending this pull request to you instead of Linus, because a bigger than
usual chunk involves the VFS.

Please pull from:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git for-viro

This update contains the following:

 - Deal with vfs_mkdir() not instantiating dentry.

 - Stack file operations.  This solves the ro/rw file descriptor inconsistency,
   weirdness with ioctl, as well as removing a bunch of overlay specific hacks
   from the VFS.

 - Allow metadata-only copy-up when data is unchanged.

 - Various cleanups in VFS and overlayfs.

Thanks,
Miklos

---
Amir Goldstein (8):
      ovl: update documentation for unionmount-testsuite
      ovl: remove WARN_ON() real inode attributes mismatch
      ovl: strip debug argument from ovl_do_ helpers
      ovl: struct cattr cleanups
      ovl: return dentry from ovl_create_real()
      ovl: create helper ovl_create_temp()
      ovl: make ovl_create_real() cope with vfs_mkdir() safely
      ovl: use inode_insert5() to hash a newly created inode

Miklos Szeredi (41):
      ovl: clean up copy-up error paths
      vfs: factor out inode_insert5()
      vfs: dedpue: return loff_t
      vfs: dedupe: rationalize args
      vfs: dedupe: extract helper for a single dedup
      vfs: add path_open()
      vfs: optionally don't account file in nr_files
      vfs: add f_op->pre_mmap()
      vfs: export vfs_ioctl() to modules
      vfs: export vfs_dedupe_file_range_one() to modules
      ovl: copy up times
      ovl: copy up inode flags
      Revert "Revert "ovl: get_write_access() in truncate""
      ovl: copy up file size as well
      ovl: deal with overlay files in ovl_d_real()
      ovl: stack file ops
      ovl: add helper to return real file
      ovl: add ovl_read_iter()
      ovl: add ovl_write_iter()
      ovl: add ovl_fsync()
      ovl: add ovl_mmap()
      ovl: add ovl_fallocate()
      ovl: add lsattr/chattr support
      ovl: add ovl_fiemap()
      ovl: add O_DIRECT support
      ovl: add reflink/copyfile/dedup support
      vfs: don't open real
      ovl: copy-up on MAP_SHARED
      ovl: obsolete "check_copy_up" module option
      ovl: fix documentation of non-standard behavior
      vfs: simplify dentry_open()
      Revert "ovl: fix may_write_real() for overlayfs directories"
      Revert "ovl: don't allow writing ioctl on lower layer"
      vfs: fix freeze protection in mnt_want_write_file() for overlayfs
      Revert "ovl: fix relatime for directories"
      Revert "vfs: update ovl inode before relatime check"
      Revert "vfs: add flags to d_real()"
      Revert "vfs: do get_write_access() on upper layer of overlayfs"
      Partially revert "locks: fix file locking on overlayfs"
      Revert "fsnotify: support overlayfs"
      vfs: remove open_flags from d_real()

Vivek Goyal (29):
      ovl: Pass argument to ovl_get_inode() in a structure
      ovl: Initialize ovl_inode->redirect in ovl_get_inode()
      ovl: Move the copy up helpers to copy_up.c
      ovl: Provide a mount option metacopy=on/off for metadata copyup
      ovl: During copy up, first copy up metadata and then data
      ovl: Copy up only metadata during copy up where it makes sense
      ovl: Add helper ovl_already_copied_up()
      ovl: A new xattr OVL_XATTR_METACOPY for file on upper
      ovl: Use out_err instead of out_nomem
      ovl: Modify ovl_lookup() and friends to lookup metacopy dentry
      ovl: Copy up meta inode data from lowest data inode
      ovl: Add helper ovl_dentry_lowerdata() to get lower data dentry
      ovl: Fix ovl_getattr() to get number of blocks from lower
      ovl: Store lower data inode in ovl_inode
      ovl: Add helper ovl_inode_realdata()
      ovl: Open file with data except for the case of fsync
      ovl: Do not expose metacopy only dentry from d_real()
      ovl: Move some dir related ovl_lookup_single() code in else block
      ovl: Check redirects for metacopy files
      ovl: Treat metacopy dentries as type OVL_PATH_MERGE
      ovl: Add an inode flag OVL_CONST_INO
      ovl: Do not set dentry type ORIGIN for broken hardlinks
      ovl: Set redirect on metacopy files upon rename
      ovl: Set redirect on upper inode when it is linked
      ovl: Check redirect on index as well
      ovl: Disbale metacopy for MAP_SHARED mmap()
      ovl: Do not do metadata only copy-up for truncate operation
      ovl: Do not do metacopy only for ioctl modifying file attr
      ovl: Enable metadata only feature

---
 Documentation/filesystems/Locking       |   4 +-
 Documentation/filesystems/overlayfs.txt |  97 ++++--
 Documentation/filesystems/vfs.txt       |  19 +-
 fs/btrfs/ctree.h                        |   5 +-
 fs/btrfs/ioctl.c                        |   7 +-
 fs/file_table.c                         |  13 +-
 fs/inode.c                              | 210 +++++--------
 fs/internal.h                           |  17 +-
 fs/ioctl.c                              |   1 +
 fs/locks.c                              |  20 +-
 fs/namei.c                              |   2 +-
 fs/namespace.c                          |  69 +----
 fs/ocfs2/file.c                         |  10 +-
 fs/open.c                               |  74 ++---
 fs/overlayfs/Kconfig                    |  40 +++
 fs/overlayfs/Makefile                   |   4 +-
 fs/overlayfs/copy_up.c                  | 273 +++++++++-------
 fs/overlayfs/dir.c                      | 312 +++++++++++++------
 fs/overlayfs/export.c                   |  11 +-
 fs/overlayfs/file.c                     | 530 ++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c                    | 203 ++++++++----
 fs/overlayfs/namei.c                    | 205 +++++++-----
 fs/overlayfs/overlayfs.h                | 119 ++++---
 fs/overlayfs/ovl_entry.h                |   7 +-
 fs/overlayfs/super.c                    | 134 +++++---
 fs/overlayfs/util.c                     | 252 ++++++++++++++-
 fs/read_write.c                         |  91 +++---
 fs/xattr.c                              |   9 +-
 fs/xfs/xfs_file.c                       |   8 +-
 include/linux/dcache.h                  |  15 +-
 include/linux/fs.h                      |  36 ++-
 include/linux/fsnotify.h                |  14 +-
 include/uapi/linux/fs.h                 |   1 -
 mm/util.c                               |   5 +
 34 files changed, 1981 insertions(+), 836 deletions(-)
 create mode 100644 fs/overlayfs/file.c

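
As a rough reading aid for the "metadata-only copy-up" item, the decision it implies can be sketched as a small predicate. This is an assumption-laden toy, not the overlayfs logic: the enum values and `toy_needs_data_copy_up` are hypothetical names, and the cases are taken only from the shortlog titles above (truncate and opening for write still need the data, a pure attribute change does not).

```c
#include <stdbool.h>

/* Hypothetical operation kinds, named after cases mentioned in the shortlog. */
enum toy_op {
	TOY_OP_CHMOD_CHOWN,	/* metadata-only change */
	TOY_OP_TRUNCATE,	/* changes the file size */
	TOY_OP_OPEN_WRITE,	/* file opened for writing */
};

/*
 * Hypothetical predicate: does this operation on a lower-layer file need
 * the data copied up as well, or is copying the metadata enough?
 */
static bool toy_needs_data_copy_up(bool metacopy_enabled, enum toy_op op)
{
	if (!metacopy_enabled)
		return true;		/* feature off: always a full copy-up */

	switch (op) {
	case TOY_OP_CHMOD_CHOWN:
		return false;		/* keep using data from the lower file */
	case TOY_OP_TRUNCATE:
	case TOY_OP_OPEN_WRITE:
		return true;
	}
	return true;			/* unknown operation: be conservative */
}
```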
* Re: [GIT PULL] overlayfs update for 4.18
From: Christoph Hellwig @ 2018-05-29 13:59 UTC
To: Miklos Szeredi; +Cc: Al Viro, linux-kernel, linux-fsdevel, linux-unionfs

>       vfs: add f_op->pre_mmap()

We've been through these pre-mmap games a few times, and always rejected
them, why is this any different?

>       vfs: export vfs_dedupe_file_range_one() to modules

Please use EXPORT_SYMBOL_GPL for all these crazy low-level exports.

To be honest I'd really like to see the whole thing as a patch series
instead of a pull request.  Very little seems to have gotten any
reviewed-by tags which makes me very suspicious.

* Re: [GIT PULL] overlayfs update for 4.18
From: Miklos Szeredi @ 2018-05-29 14:12 UTC
To: Christoph Hellwig; +Cc: Al Viro, linux-kernel, linux-fsdevel, overlayfs

On Tue, May 29, 2018 at 3:59 PM, Christoph Hellwig <hch@infradead.org> wrote:
>>       vfs: add f_op->pre_mmap()
>
> We've been through these pre-mmap games a few times, and always rejected
> them, why is this any different?

Don't know what the other cases were.  Overlayfs case is completely
state free.  It just does a copy-up in the case of a shared mapping so
that subsequent modifications of that file get reflected in the shared
mapping.  Can't do the copy-up with mmap_sem held due to locking
dependencies.

>
>>       vfs: export vfs_dedupe_file_range_one() to modules
>
> Please use EXPORT_SYMBOL_GPL for all these crazy low-level exports.
>
> To be honest I'd really like to see the whole thing as a patch series
> instead of a pull request.  Very little seems to have gotten any
> reviewed-by tags which makes me very suspicious.

Did a couple of iterations and got some good feedback.  But can post
the current version, if you think that's useful.

Thanks,
Miklos

* Re: [GIT PULL] overlayfs update for 4.18
From: Miklos Szeredi @ 2018-05-30 8:36 UTC
To: Christoph Hellwig; +Cc: Al Viro, linux-kernel, linux-fsdevel, overlayfs

On Tue, May 29, 2018 at 4:12 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, May 29, 2018 at 3:59 PM, Christoph Hellwig <hch@infradead.org> wrote:

>>>       vfs: export vfs_dedupe_file_range_one() to modules
>>
>> Please use EXPORT_SYMBOL_GPL for all these crazy low-level exports.

I'd argue with the "crazy" part.  This should have been the primary
interface from the start.  The batched dedupe interface is the crazy
one:

 - deduping is page size granularity at worst; performance would not
   be horrible even if we had to do one syscall per page
 - vast majority of the time it will be file size granularity

Why was that batching invented in the first place?

Thanks,
Miklos

* Re: [GIT PULL] overlayfs update for 4.18
From: Dave Chinner @ 2018-05-30 22:27 UTC
To: Miklos Szeredi
Cc: Christoph Hellwig, Al Viro, linux-kernel, linux-fsdevel, overlayfs

On Wed, May 30, 2018 at 10:36:46AM +0200, Miklos Szeredi wrote:
> On Tue, May 29, 2018 at 4:12 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> > On Tue, May 29, 2018 at 3:59 PM, Christoph Hellwig <hch@infradead.org> wrote:
>
> >>>       vfs: export vfs_dedupe_file_range_one() to modules
> >>
> >> Please use EXPORT_SYMBOL_GPL for all these crazy low-level exports.
>
> I'd argue with the "crazy" part.  This should have been the primary
> interface from the start.  The batched dedupe interface is the crazy
> one:
>
>  - deduping is page size granularity at worst; performance would not
>    be horrible even if we had to do one syscall per page
>  - vast majority of the time it will be file size granularity
>
> Why was that batching invented in the first place?

No idea - the batching ioctl interface is what we inherited from the
btrfs ioctl years ago and there are several dedupe applications out
there that use it. That's why we pulled it up to the vfs rather than
invent a new one and have to wait years for apps to start using
it...

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com

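
For context on the batched interface discussed above: from userspace it is reached through the FIDEDUPERANGE ioctl, which takes one source range and an array of destinations. A minimal sketch of driving it with a single destination, which is roughly the "one dedupe per call" shape argued for in the thread, might look like this. The wrapper name `dedupe_one` and its return convention are assumptions made for illustration; only the ioctl, the structures and the status constants come from `<linux/fs.h>`.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* FIDEDUPERANGE, struct file_dedupe_range */

/*
 * Hypothetical wrapper: dedupe one source range into one destination.
 * Returns the number of bytes deduped (>= 0) or a negative errno;
 * -EBADE is used here to report that the two ranges differ.
 */
static long long dedupe_one(int src_fd, unsigned long long src_off,
			    unsigned long long len,
			    int dst_fd, unsigned long long dst_off)
{
	struct file_dedupe_range *req;
	long long ret;

	/* Header plus room for exactly one destination entry. */
	req = calloc(1, sizeof(*req) + sizeof(struct file_dedupe_range_info));
	if (!req)
		return -ENOMEM;

	req->src_offset = src_off;
	req->src_length = len;
	req->dest_count = 1;
	req->info[0].dest_fd = dst_fd;
	req->info[0].dest_offset = dst_off;

	if (ioctl(src_fd, FIDEDUPERANGE, req) < 0)
		ret = -errno;				/* whole call failed */
	else if (req->info[0].status == FILE_DEDUPE_RANGE_DIFFERS)
		ret = -EBADE;				/* contents differ */
	else if (req->info[0].status < 0)
		ret = req->info[0].status;		/* per-destination error */
	else
		ret = (long long)req->info[0].bytes_deduped;

	free(req);
	return ret;
}
```

Whether the call succeeds depends on the filesystem supporting dedupe; on one that does not, the sketch simply returns the negative errno it was given.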
* Re: [GIT PULL] overlayfs update for 4.18 2018-05-29 13:21 Miklos Szeredi 2018-05-29 13:59 ` Christoph Hellwig @ 2018-06-01 15:26 ` Miklos Szeredi 2018-06-01 16:18 ` Randy Dunlap 1 sibling, 1 reply; 17+ messages in thread From: Miklos Szeredi @ 2018-06-01 15:26 UTC (permalink / raw) To: Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-unionfs On Tue, May 29, 2018 at 03:21:48PM +0200, Miklos Szeredi wrote: > Hi Al, > > I'm sending this pull request to you instead of Linus, because a bigger than > usual chunk involves the VFS. > > Please pull from: > > git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git for-viro > > This update contains the following: > > - Deal with vfs_mkdir() not instantiating dentry. > > - Stack file operations. This solves the ro/rw file descriptor inconsistency, > weirdness with ioctl, as well as removing a bunch of overlay specific hacks > from the VFS. > > - Allow metadata-only copy-up when data is unchanged. > > - Various cleanups in VFS and overlayfs. Updated tree pushed to same place. Incremental patch against previous pull and posted patchset. Thanks, Miklos --- diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt index 0a8e3c4543d1..79be4a77ca08 100644 --- a/Documentation/filesystems/overlayfs.txt +++ b/Documentation/filesystems/overlayfs.txt @@ -280,7 +280,7 @@ parameter metacopy=on/off. Lastly, there is also a per mount option metacopy=on/off to enable/disable this feature per mount. Do not use metacopy=on with untrusted upper/lower directories. Otherwise -it is possible that an attacker can create a handcrafted file with +it is possible that an attacker can create an handcrafted file with appropriate REDIRECT and METACOPY xattrs, and gain access to file on lower pointed by REDIRECT. This should not be possible on local system as setting "trusted." xattrs will require CAP_SYS_ADMIN. 
But it should be possible
@@ -318,7 +318,7 @@ does not support NFS export, lower filesystem does not have a valid UUID or
 if the upper filesystem does not support extended attributes.
 
 For "metadata only copy up" feature there is no verification mechanism at
-mount time. So if same upper is mounted with different set of lower, mount
+mount time. So if same upper is mouted with different set of lower, mount
 probably will succeed but expect the unexpected later on. So don't do it.
 
 It is quite a common practice to copy overlay layers to a different
diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index 08b04d9fd6e6..e0a090eca65e 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -11,7 +11,7 @@ config OVERLAY_FS
 	  For more information see Documentation/filesystems/overlayfs.txt
 
 config OVERLAY_FS_REDIRECT_DIR
-	bool "Overlayfs: turn on redirect directory feature by default"
+	bool "Overlayfs: turn on redirect dir feature by default"
 	depends on OVERLAY_FS
 	help
 	  If this config option is enabled then overlay filesystems will use
@@ -46,7 +46,7 @@ config OVERLAY_FS_INDEX
 	depends on OVERLAY_FS
 	help
 	  If this config option is enabled then overlay filesystems will use
-	  the index directory to map lower inodes to upper inodes by default.
+	  the inodes index dir to map lower inodes to upper inodes by default.
 	  In this case it is still possible to turn off index globally with the
 	  "index=off" module option or on a filesystem instance basis with the
 	  "index=off" mount option.
@@ -67,7 +67,7 @@ config OVERLAY_FS_NFS_EXPORT
 	depends on !OVERLAY_FS_METACOPY
 	help
 	  If this config option is enabled then overlay filesystems will use
-	  the index directory to decode overlay NFS file handles by default.
+	  the inodes index dir to decode overlay NFS file handles by default.
 	  In this case, it is still possible to turn off NFS export support
 	  globally with the "nfs_export=off" module option or on a filesystem
 	  instance basis with the "nfs_export=off" mount option.
@@ -133,7 +133,7 @@ config OVERLAY_FS_METACOPY
 	help
 	  If this config option is enabled then overlay filesystems will
 	  copy up only metadata where appropriate and data copy up will
-	  happen when a file is opened for WRITE operation. It is still
+	  happen when a file is opended for WRITE operation. It is still
 	  possible to turn off this feature globally with the "metacopy=off"
 	  module option or on a filesystem instance basis with the
 	  "metacopy=off" mount option.
diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index 296037afecdb..bdadedf73e51 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -27,7 +27,7 @@
 
 static int ovl_ccup_set(const char *buf, const struct kernel_param *param)
 {
-	pr_warn("overlayfs: \"check_copy_up\" module option is obsolete\n");
+	WARN(1, "overlayfs: \"check_copy_up\" module option is obsolete\n");
 	return 0;
 }
 
diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index ec350d4d921c..7063e0f588cc 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -116,35 +116,35 @@ int ovl_cleanup_and_whiteout(struct dentry *workdir, struct inode *dir,
 	goto out;
 }
 
-static int ovl_mkdir_real(struct inode *dir, struct dentry **newdentry,
-			  umode_t mode)
+static struct dentry *ovl_mkdir_real(struct inode *dir, struct dentry *dentry,
+				     umode_t mode)
 {
 	int err;
-	struct dentry *d, *dentry = *newdentry;
 
 	err = ovl_do_mkdir(dir, dentry, mode);
-	if (err)
-		return err;
-
-	if (likely(!d_unhashed(dentry)))
-		return 0;
+	if (err) {
+		dput(dentry);
+		return ERR_PTR(err);
+	}
 
 	/*
 	 * vfs_mkdir() may succeed and leave the dentry passed
 	 * to it unhashed and negative. If that happens, try to
 	 * lookup a new hashed and positive dentry.
 	 */
-	d = lookup_one_len(dentry->d_name.name, dentry->d_parent,
-			   dentry->d_name.len);
-	if (IS_ERR(d)) {
-		pr_warn("overlayfs: failed lookup after mkdir (%pd2, err=%i).\n",
-			dentry, err);
-		return PTR_ERR(d);
+	if (unlikely(d_unhashed(dentry))) {
+		struct dentry *d;
+
+		d = lookup_one_len(dentry->d_name.name, dentry->d_parent,
+				   dentry->d_name.len);
+		if (IS_ERR(d)) {
+			pr_warn("overlayfs: failed lookup after mkdir (%pd2, err=%i).\n",
+				dentry, err);
+		}
+		dput(dentry);
+		dentry = d;
 	}
-	dput(dentry);
-	*newdentry = d;
-
-	return 0;
+	return dentry;
 }
 
 struct dentry *ovl_create_real(struct inode *dir, struct dentry *newdentry,
@@ -169,8 +169,7 @@ struct dentry *ovl_create_real(struct inode *dir, struct dentry *newdentry,
 
 		case S_IFDIR:
 			/* mkdir is special... */
-			err = ovl_mkdir_real(dir, &newdentry, attr->mode);
-			break;
+			return ovl_mkdir_real(dir, newdentry, attr->mode);
 
 		case S_IFCHR:
 		case S_IFBLK:
@@ -193,7 +192,7 @@ struct dentry *ovl_create_real(struct inode *dir, struct dentry *newdentry,
 		 * Not quite sure if non-instantiated dentry is legal or not.
 		 * VFS doesn't seem to care so check and warn here.
 		 */
-		err = -EIO;
+		err = -ENOENT;
 	}
 out:
 	if (err) {
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index ca7c3461e424..31f32fc1004b 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -128,7 +128,7 @@ static int ovl_open(struct inode *inode, struct file *file)
 	/* No longer need these flags, so don't pass them on to underlying fs */
 	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
 
-	realfile = ovl_open_realfile(file, ovl_inode_realdata(inode));
+	realfile = ovl_open_realfile(file, ovl_inode_real(file_inode(file)));
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 
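Note on the dir.c hunk above: whichever direction the hunk is read in, it contrasts two ways for a helper to report failure. One returns an int and hands the result back through an out-parameter; the other returns the pointer itself, with the errno encoded in it (the kernel's ERR_PTR()/IS_ERR()/PTR_ERR() idiom). The following is a minimal userspace sketch of the two conventions, not overlayfs code. Everything in it is an assumption made for illustration: err_ptr(), is_err() and ptr_err() are simplified stand-ins modelled on the kernel macros, and struct node, make_node_int(), make_node_ptr() and check_both_styles() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers (assumption). */
#define MAX_ERRNO 4095

static inline void *err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

static inline long ptr_err(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int is_err(const void *ptr)
{
	/* The top MAX_ERRNO addresses are treated as encoded errnos. */
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical object standing in for a dentry. */
struct node {
	int id;
};

/* Style 1: int return code, result handed back through an out-parameter. */
static int make_node_int(int id, struct node **out)
{
	struct node *n;

	if (id < 0)
		return -EINVAL;

	n = malloc(sizeof(*n));
	if (!n)
		return -ENOMEM;

	n->id = id;
	*out = n;
	return 0;
}

/* Style 2: the returned pointer carries either the result or the errno. */
static struct node *make_node_ptr(int id)
{
	struct node *n;

	if (id < 0)
		return err_ptr(-EINVAL);

	n = malloc(sizeof(*n));
	if (!n)
		return err_ptr(-ENOMEM);

	n->id = id;
	return n;
}

/* Exercise both conventions on a success path and a failure path. */
static void check_both_styles(void)
{
	struct node *a = NULL;
	struct node *b;
	int err;

	/* Success: both styles yield a usable object. */
	err = make_node_int(7, &a);
	assert(err == 0 && a->id == 7);
	b = make_node_ptr(7);
	assert(!is_err(b) && b->id == 7);
	free(a);
	free(b);

	/* Failure: style 1 returns the errno and leaves *out untouched. */
	a = NULL;
	err = make_node_int(-1, &a);
	assert(err == -EINVAL && a == NULL);

	/* Failure: style 2 encodes the same errno in the pointer. */
	b = make_node_ptr(-1);
	assert(is_err(b) && ptr_err(b) == -EINVAL);
}
```

With the pointer-returning style a caller can forward the result in one statement, which is the shape of the `return ovl_mkdir_real(...)` line in the hunk. The int style needs an explicit `err` check, and the caller keeps its own pointer variable, as with `&newdentry`.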
* Re: [GIT PULL] overlayfs update for 4.18
From: Randy Dunlap @ 2018-06-01 16:18 UTC
To: Miklos Szeredi, Al Viro; +Cc: linux-kernel, linux-fsdevel, linux-unionfs

On 06/01/2018 08:26 AM, Miklos Szeredi wrote:
> On Tue, May 29, 2018 at 03:21:48PM +0200, Miklos Szeredi wrote:
>> Hi Al,
>>
>> I'm sending this pull request to you instead of Linus, because a bigger than
>> usual chunk involves the VFS.
>>
>> Please pull from:
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git for-viro
>>
>> This update contains the following:

> ---
>
> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
> index 0a8e3c4543d1..79be4a77ca08 100644
> --- a/Documentation/filesystems/overlayfs.txt
> +++ b/Documentation/filesystems/overlayfs.txt
> @@ -280,7 +280,7 @@ parameter metacopy=on/off. Lastly, there is also a per mount option
>  metacopy=on/off to enable/disable this feature per mount.
>
>  Do not use metacopy=on with untrusted upper/lower directories. Otherwise
> -it is possible that an attacker can create a handcrafted file with
> +it is possible that an attacker can create an handcrafted file with

bad change:
  create a handcrafted

Wait. Is this patch -R (reversed)?

>  appropriate REDIRECT and METACOPY xattrs, and gain access to file on lower
>  pointed by REDIRECT. This should not be possible on local system as setting
>  "trusted." xattrs will require CAP_SYS_ADMIN. But it should be possible
> @@ -318,7 +318,7 @@ does not support NFS export, lower filesystem does not have a valid UUID or
>  if the upper filesystem does not support extended attributes.
>
>  For "metadata only copy up" feature there is no verification mechanism at
> -mount time. So if same upper is mounted with different set of lower, mount
> +mount time. So if same upper is mouted with different set of lower, mount

mounted

>  probably will succeed but expect the unexpected later on. So don't do it.
>
>  It is quite a common practice to copy overlay layers to a different
> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> index 08b04d9fd6e6..e0a090eca65e 100644
> --- a/fs/overlayfs/Kconfig
> +++ b/fs/overlayfs/Kconfig
> @@ -11,7 +11,7 @@ config OVERLAY_FS
>  	  For more information see Documentation/filesystems/overlayfs.txt
>
>  config OVERLAY_FS_REDIRECT_DIR
> -	bool "Overlayfs: turn on redirect directory feature by default"
> +	bool "Overlayfs: turn on redirect dir feature by default"

nope.

>  	depends on OVERLAY_FS
>  	help
>  	  If this config option is enabled then overlay filesystems will use
> @@ -46,7 +46,7 @@ config OVERLAY_FS_INDEX
>  	depends on OVERLAY_FS
>  	help
>  	  If this config option is enabled then overlay filesystems will use
> -	  the index directory to map lower inodes to upper inodes by default.
> +	  the inodes index dir to map lower inodes to upper inodes by default.
>  	  In this case it is still possible to turn off index globally with the
>  	  "index=off" module option or on a filesystem instance basis with the
>  	  "index=off" mount option.
> @@ -67,7 +67,7 @@ config OVERLAY_FS_NFS_EXPORT
>  	depends on !OVERLAY_FS_METACOPY
>  	help
>  	  If this config option is enabled then overlay filesystems will use
> -	  the index directory to decode overlay NFS file handles by default.
> +	  the inodes index dir to decode overlay NFS file handles by default.
>  	  In this case, it is still possible to turn off NFS export support
>  	  globally with the "nfs_export=off" module option or on a filesystem
>  	  instance basis with the "nfs_export=off" mount option.
> @@ -133,7 +133,7 @@ config OVERLAY_FS_METACOPY
>  	help
>  	  If this config option is enabled then overlay filesystems will
>  	  copy up only metadata where appropriate and data copy up will
> -	  happen when a file is opened for WRITE operation. It is still
> +	  happen when a file is opended for WRITE operation. It is still

nope.

>  	  possible to turn off this feature globally with the "metacopy=off"
>  	  module option or on a filesystem instance basis with the
>  	  "metacopy=off" mount option.

--
~Randy
* Re: [GIT PULL] overlayfs update for 4.18
From: Miklos Szeredi @ 2018-06-01 17:03 UTC
To: Randy Dunlap; +Cc: Al Viro, linux-kernel, linux-fsdevel, overlayfs

On Fri, Jun 1, 2018 at 6:18 PM, Randy Dunlap <rdunlap@infradead.org> wrote:
> On 06/01/2018 08:26 AM, Miklos Szeredi wrote:
>> On Tue, May 29, 2018 at 03:21:48PM +0200, Miklos Szeredi wrote:
>>> Hi Al,
>>>
>>> I'm sending this pull request to you instead of Linus, because a bigger than
>>> usual chunk involves the VFS.
>>>
>>> Please pull from:
>>>
>>>   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git for-viro
>>>
>>> This update contains the following:
>
>
>> ---
>>
>> diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
>> index 0a8e3c4543d1..79be4a77ca08 100644
>> --- a/Documentation/filesystems/overlayfs.txt
>> +++ b/Documentation/filesystems/overlayfs.txt
>> @@ -280,7 +280,7 @@ parameter metacopy=on/off. Lastly, there is also a per mount option
>>  metacopy=on/off to enable/disable this feature per mount.
>>
>>  Do not use metacopy=on with untrusted upper/lower directories. Otherwise
>> -it is possible that an attacker can create a handcrafted file with
>> +it is possible that an attacker can create an handcrafted file with
>
> bad change:
>   create a handcrafted
>
> Wait. Is this patch -R (reversed)?

Oops, yes, reversed diff.

Thanks,
Miklos
Thread overview: 17+ messages
2018-06-08 12:13 [GIT PULL] overlayfs update for 4.18 Miklos Szeredi
2018-06-09  6:52 ` Christoph Hellwig
2018-06-09 21:42 ` Linus Torvalds
2018-06-09 23:55 ` Al Viro
2018-06-11  6:09 ` Christoph Hellwig
2018-06-10  5:54 ` Al Viro
2018-06-11  6:10 ` Christoph Hellwig
2018-06-11  8:41 ` Miklos Szeredi
2018-06-11 16:27 ` Christoph Hellwig
-- strict thread matches above, loose matches on Subject: below --
2018-05-29 13:21 Miklos Szeredi
2018-05-29 13:59 ` Christoph Hellwig
2018-05-29 14:12 ` Miklos Szeredi
2018-05-30  8:36 ` Miklos Szeredi
2018-05-30 22:27 ` Dave Chinner
2018-06-01 15:26 ` Miklos Szeredi
2018-06-01 16:18 ` Randy Dunlap
2018-06-01 17:03 ` Miklos Szeredi