linux-fsdevel.vger.kernel.org archive mirror
* [PATCH v2 00/35] overlayfs: stack file operations
@ 2018-05-07  8:37 Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 01/35] vfs: add path_open() Miklos Szeredi
                   ` (34 more replies)
  0 siblings, 35 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

[Al, can you please review the VFS-affecting patches?]

Up till now overlayfs didn't stack regular file operations.  Instead, when
a file was opened on an overlay, the file from one of the underlying layers
would be opened and any file operations performed would directly go to the
underlying file on a real filesystem.

This mostly works well, but various hacks were added to the VFS to work
around issues with this:

 - d_path() and friends
 - relatime handling
 - file locking
 - fsnotify
 - writecount handling

There are also issues that are unresolved before this patchset:

 - ioctls that need write access but can be performed on an O_RDONLY fd
 - ro/rw inconsistency: file on lower layer opened for read-only will
   return stale data on read after copy-up and modification
 - ro/rw inconsistency for mmap: file on lower layer mapped shared will
   contain stale data after copy-up and modification
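
The read-side inconsistency listed above can be probed from userspace.
The following is a minimal sketch, not part of the series: the function
name rdonly_fd_sees_update() is made up for illustration, and it assumes
`path` names an existing regular file holding at least one byte.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Returns 1 if a descriptor opened O_RDONLY before a later write observes
 * the new data, 0 if it returns stale data, -1 on error.
 */
int rdonly_fd_sees_update(const char *path)
{
	char before = 0, after = 0, next;
	int rfd, wfd, ret = -1;

	assert(path != NULL);

	rfd = open(path, O_RDONLY);
	if (rfd < 0)
		return -1;
	if (pread(rfd, &before, 1, 0) != 1)
		goto out_close_r;

	/* On an overlay, opening a lower-layer file for write triggers copy-up. */
	wfd = open(path, O_WRONLY);
	if (wfd < 0)
		goto out_close_r;

	/* Pick a byte that differs from the current one and write it. */
	next = (before == 'A') ? 'B' : 'A';
	if (pwrite(wfd, &next, 1, 0) != 1)
		goto out_close_w;

	/* Read again through the descriptor that was opened first. */
	if (pread(rfd, &after, 1, 0) != 1)
		goto out_close_w;

	ret = (after == next);
out_close_w:
	close(wfd);
out_close_r:
	close(rfd);
	return ret;
}
```

On an ordinary filesystem this should return 1.  On an overlay without
stacked file operations, with `path` still residing on the lower layer,
the expectation from the description above is that it returns 0.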

This patch series reverts the VFS hacks (with the exception of d_path) and
fixes the unresolved issues.  We need to keep d_path related hacks, because
memory maps are still not stacked, yet d_path() should keep working on
vma->vm_file->f_path.

No regressions were observed after running various test suites (xfstests,
ltp, unionmount-testsuite, pjd-fstest).

Performance impact of stacking was found to be minimal.  Memory use for
open overlay files increases by about 256 bytes or 12% of total (files +
pinned dentries + pinned inodes).

Git tree is here:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-rorw

Thanks,
Miklos

Changes from v1:

 - split out dedupe API cleanup
 - update documentation and comments
 - clean up directory modification helpers inside overlayfs
 - make functions static
 - added compat ioctl
 - check for upper files in dedupe
 - bring back d_real_inode() as a new user just cropped up in 4.17

---
Miklos Szeredi (35):
  vfs: add path_open()
  vfs: optionally don't account file in nr_files
  vfs: add f_op->pre_mmap()
  vfs: export vfs_ioctl() to modules
  vfs: export vfs_dedupe_file_range_one() to modules
  ovl: copy up times
  ovl: copy up inode flags
  Revert "Revert "ovl: get_write_access() in truncate""
  ovl: copy up file size as well
  ovl: deal with overlay files in ovl_d_real()
  ovl: stack file ops
  ovl: add helper to return real file
  ovl: add ovl_read_iter()
  ovl: add ovl_write_iter()
  ovl: add ovl_fsync()
  ovl: add ovl_mmap()
  ovl: add ovl_fallocate()
  ovl: add lsattr/chattr support
  ovl: add ovl_fiemap()
  ovl: add O_DIRECT support
  ovl: add reflink/copyfile/dedup support
  vfs: don't open real
  ovl: copy-up on MAP_SHARED
  vfs: simplify dentry_open()
  Revert "ovl: fix may_write_real() for overlayfs directories"
  Revert "ovl: don't allow writing ioctl on lower layer"
  vfs: fix freeze protection in mnt_want_write_file() for overlayfs
  Revert "ovl: fix relatime for directories"
  Revert "vfs: update ovl inode before relatime check"
  Revert "vfs: add flags to d_real()"
  Revert "vfs: do get_write_access() on upper layer of overlayfs"
  Partially revert "locks: fix file locking on overlayfs"
  Revert "fsnotify: support overlayfs"
  vfs: remove open_flags from d_real()
  ovl: fix documentation of non-standard behavior

 Documentation/filesystems/Locking       |   4 +-
 Documentation/filesystems/overlayfs.txt |  60 ++--
 Documentation/filesystems/vfs.txt       |  19 +-
 fs/file_table.c                         |  13 +-
 fs/inode.c                              |  46 +--
 fs/internal.h                           |  17 +-
 fs/ioctl.c                              |   1 +
 fs/locks.c                              |  20 +-
 fs/namei.c                              |   2 +-
 fs/namespace.c                          |  69 +----
 fs/open.c                               |  74 ++---
 fs/overlayfs/Kconfig                    |  21 ++
 fs/overlayfs/Makefile                   |   4 +-
 fs/overlayfs/dir.c                      |  33 ++-
 fs/overlayfs/file.c                     | 506 ++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c                    |  63 +++-
 fs/overlayfs/overlayfs.h                |  21 +-
 fs/overlayfs/ovl_entry.h                |   1 +
 fs/overlayfs/super.c                    |  65 ++--
 fs/overlayfs/util.c                     |  11 +-
 fs/read_write.c                         |   6 +-
 fs/xattr.c                              |   9 +-
 include/linux/dcache.h                  |  15 +-
 include/linux/fs.h                      |  28 +-
 include/linux/fsnotify.h                |  14 +-
 include/uapi/linux/fs.h                 |   1 -
 mm/util.c                               |   5 +
 27 files changed, 824 insertions(+), 304 deletions(-)
 create mode 100644 fs/overlayfs/file.c

-- 
2.14.3


* [PATCH v2 01/35] vfs: add path_open()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 02/35] vfs: optionally don't account file in nr_files Miklos Szeredi
                   ` (33 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

Currently opening an overlay file results in:

 - the real file on the underlying layer being opened
 - f_path being set to the overlay {mount, dentry} pair

This patch adds a new helper that allows the above to be explicitly
performed.  I.e. it's the same as dentry_open(), except the underlying
inode to open is given as a separate argument.

This is in preparation for stacking I/O operations on overlay files.

Later, when implicit opening is removed, dentry_open() can be implemented
by just calling path_open().
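
The relationship between the two helpers can be modelled in userspace.
This is only a toy sketch: the structures are simplified stand-ins, and
the model_* names are hypothetical, not kernel API.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-ins for the kernel structures; not the real definitions. */
struct inode { unsigned long i_ino; };
struct dentry { struct inode *d_inode; };
struct path { void *mnt; struct dentry *dentry; };
struct file {
	struct path f_path;	/* what d_path() and friends report */
	struct inode *f_inode;	/* what I/O actually operates on */
	int f_flags;
};

/* Model of path_open(): the path and the inode are chosen independently. */
struct file *model_path_open(const struct path *path, int flags,
			     struct inode *inode)
{
	struct file *file;

	assert(path && inode);
	file = calloc(1, sizeof(*file));
	if (!file)
		return NULL;	/* the kernel helper returns ERR_PTR() instead */
	file->f_flags = flags;
	file->f_path = *path;
	file->f_inode = inode;
	return file;
}

/* dentry_open() as the special case where the inode comes from the path. */
struct file *model_dentry_open(const struct path *path, int flags)
{
	return model_path_open(path, flags, path->dentry->d_inode);
}
```

In this model an overlay open would pass the overlay path together with
the real inode, so f_path and f_inode deliberately disagree.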

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/open.c          | 31 +++++++++++++++++++++++++++++++
 include/linux/fs.h |  2 ++
 2 files changed, 33 insertions(+)

diff --git a/fs/open.c b/fs/open.c
index c5ee7cd60424..d0bf7f061a1a 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -906,6 +906,37 @@ int vfs_open(const struct path *path, struct file *file,
 	return do_dentry_open(file, d_backing_inode(dentry), NULL, cred);
 }
 
+/**
+ * path_open() - Open an inode by a particular name.
+ * @path: The name of the file.
+ * @flags: The O_ flags used to open this file.
+ * @inode: The inode to open.
+ * @cred: The task's credentials used when opening this file.
+ *
+ * Context: Process context.
+ * Return: A pointer to a struct file or an IS_ERR pointer.  Cannot return NULL.
+ */
+struct file *path_open(const struct path *path, int flags, struct inode *inode,
+		       const struct cred *cred)
+{
+	struct file *file;
+	int retval;
+
+	file = get_empty_filp();
+	if (IS_ERR(file))
+		return file;
+
+	file->f_flags = flags;
+	file->f_path = *path;
+	retval = do_dentry_open(file, inode, NULL, cred);
+	if (retval) {
+		put_filp(file);
+		return ERR_PTR(retval);
+	}
+	return file;
+}
+EXPORT_SYMBOL(path_open);
+
 struct file *dentry_open(const struct path *path, int flags,
 			 const struct cred *cred)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8007a31c4d3c..d97a661342c8 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2401,6 +2401,8 @@ extern struct file *filp_open(const char *, int, umode_t);
 extern struct file *file_open_root(struct dentry *, struct vfsmount *,
 				   const char *, int, umode_t);
 extern struct file * dentry_open(const struct path *, int, const struct cred *);
+extern struct file *path_open(const struct path *, int, struct inode *,
+			      const struct cred *);
 extern int filp_close(struct file *, fl_owner_t id);
 
 extern struct filename *getname_flags(const char __user *, int, int *);
-- 
2.14.3


* [PATCH v2 02/35] vfs: optionally don't account file in nr_files
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 01/35] vfs: add path_open() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 03/35] vfs: add f_op->pre_mmap() Miklos Szeredi
                   ` (32 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

Stacking file operations in overlay will store an extra open file for each
overlay file opened.

The overhead is just that of "struct file" which is about 256 bytes, because
overlay already pins an extra dentry and inode when the file is open, which
add up to a much larger overhead.

For fear of breaking working setups, don't start accounting the extra file.

The implementation adds a bool argument to path_open() to control whether
the returned file is to be accounted or not.  If the file is not accounted,
f_mode will contain FMODE_NOACCOUNT, so that when freeing the file the
count is not decremented.
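
The accounting rule can be illustrated with a small userspace model.
This is a sketch only: the model_* names, the flag value and the limit
of 2 are invented for illustration, and the CAP_SYS_ADMIN override and
the percpu counter are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define MODEL_FMODE_NOACCOUNT 0x1u	/* illustrative bit, not the kernel value */

struct model_file { unsigned int f_mode; };

long model_nr_files;		/* stands in for the nr_files counter */
long model_max_files = 2;	/* stands in for files_stat.max_files */

struct model_file *model_get_empty_filp(bool account)
{
	struct model_file *f;

	/* Only accounted files are checked against the limit. */
	if (account && model_nr_files >= model_max_files)
		return NULL;	/* the kernel returns ERR_PTR(-ENFILE) */

	f = calloc(1, sizeof(*f));
	if (!f)
		return NULL;

	if (account)
		model_nr_files++;
	else
		f->f_mode = MODEL_FMODE_NOACCOUNT;
	return f;
}

void model_file_free(struct model_file *f)
{
	assert(f != NULL);
	/* Mirror the allocation: never decrement for an uncounted file. */
	if (!(f->f_mode & MODEL_FMODE_NOACCOUNT))
		model_nr_files--;
	free(f);
}
```

Recording the decision in f_mode is what keeps the counter balanced: the
free path has no other way to know how the file was allocated.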

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/file_table.c    | 13 +++++++++----
 fs/internal.h      |  7 ++++++-
 fs/open.c          | 10 +++++-----
 include/linux/fs.h |  5 ++++-
 4 files changed, 24 insertions(+), 11 deletions(-)

diff --git a/fs/file_table.c b/fs/file_table.c
index 7ec0b3e5f05d..60376bfa04cf 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -51,7 +51,8 @@ static void file_free_rcu(struct rcu_head *head)
 
 static inline void file_free(struct file *f)
 {
-	percpu_counter_dec(&nr_files);
+	if (!(f->f_mode & FMODE_NOACCOUNT))
+		percpu_counter_dec(&nr_files);
 	call_rcu(&f->f_u.fu_rcuhead, file_free_rcu);
 }
 
@@ -100,7 +101,7 @@ int proc_nr_files(struct ctl_table *table, int write,
  * done, you will imbalance int the mount's writer count
  * and a warning at __fput() time.
  */
-struct file *get_empty_filp(void)
+struct file *__get_empty_filp(bool account)
 {
 	const struct cred *cred = current_cred();
 	static long old_max;
@@ -110,7 +111,8 @@ struct file *get_empty_filp(void)
 	/*
 	 * Privileged users can go above max_files
 	 */
-	if (get_nr_files() >= files_stat.max_files && !capable(CAP_SYS_ADMIN)) {
+	if (account &&
+	    get_nr_files() >= files_stat.max_files && !capable(CAP_SYS_ADMIN)) {
 		/*
 		 * percpu_counters are inaccurate.  Do an expensive check before
 		 * we go and fail.
@@ -123,7 +125,10 @@ struct file *get_empty_filp(void)
 	if (unlikely(!f))
 		return ERR_PTR(-ENOMEM);
 
-	percpu_counter_inc(&nr_files);
+	if (account)
+		percpu_counter_inc(&nr_files);
+	else
+		f->f_mode = FMODE_NOACCOUNT;
 	f->f_cred = get_cred(cred);
 	error = security_file_alloc(f);
 	if (unlikely(error)) {
diff --git a/fs/internal.h b/fs/internal.h
index e08972db0303..b82725ba3054 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -93,7 +93,12 @@ extern void chroot_fs_refs(const struct path *, const struct path *);
 /*
  * file_table.c
  */
-extern struct file *get_empty_filp(void);
+extern struct file *__get_empty_filp(bool account);
+
+static inline struct file *get_empty_filp(void)
+{
+	return __get_empty_filp(true);
+}
 
 /*
  * super.c
diff --git a/fs/open.c b/fs/open.c
index d0bf7f061a1a..6e52fd6fea7c 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -732,8 +732,8 @@ static int do_dentry_open(struct file *f,
 	static const struct file_operations empty_fops = {};
 	int error;
 
-	f->f_mode = OPEN_FMODE(f->f_flags) | FMODE_LSEEK |
-				FMODE_PREAD | FMODE_PWRITE;
+	f->f_mode = (f->f_mode & FMODE_NOACCOUNT) | OPEN_FMODE(f->f_flags) |
+		FMODE_LSEEK | FMODE_PREAD | FMODE_PWRITE;
 
 	path_get(&f->f_path);
 	f->f_inode = inode;
@@ -743,7 +743,7 @@ static int do_dentry_open(struct file *f,
 	f->f_wb_err = filemap_sample_wb_err(f->f_mapping);
 
 	if (unlikely(f->f_flags & O_PATH)) {
-		f->f_mode = FMODE_PATH;
+		f->f_mode = (f->f_mode & FMODE_NOACCOUNT) | FMODE_PATH;
 		f->f_op = &empty_fops;
 		goto done;
 	}
@@ -917,12 +917,12 @@ int vfs_open(const struct path *path, struct file *file,
  * Return: A pointer to a struct file or an IS_ERR pointer.  Cannot return NULL.
  */
 struct file *path_open(const struct path *path, int flags, struct inode *inode,
-		       const struct cred *cred)
+		       const struct cred *cred, bool account)
 {
 	struct file *file;
 	int retval;
 
-	file = get_empty_filp();
+	file = __get_empty_filp(account);
 	if (IS_ERR(file))
 		return file;
 
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d97a661342c8..af49b55ff439 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -153,6 +153,9 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
 /* File is capable of returning -EAGAIN if I/O will block */
 #define FMODE_NOWAIT	((__force fmode_t)0x8000000)
 
+/* File does not contribute to nr_files count */
+#define FMODE_NOACCOUNT	((__force fmode_t)0x10000000)
+
 /*
  * Flag for rw_copy_check_uvector and compat_rw_copy_check_uvector
  * that indicates that they should check the contents of the iovec are
@@ -2402,7 +2405,7 @@ extern struct file *file_open_root(struct dentry *, struct vfsmount *,
 				   const char *, int, umode_t);
 extern struct file * dentry_open(const struct path *, int, const struct cred *);
 extern struct file *path_open(const struct path *, int, struct inode *,
-			      const struct cred *);
+			      const struct cred *, bool);
 extern int filp_close(struct file *, fl_owner_t id);
 
 extern struct filename *getname_flags(const char __user *, int, int *);
-- 
2.14.3


* [PATCH v2 03/35] vfs: add f_op->pre_mmap()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 01/35] vfs: add path_open() Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 02/35] vfs: optionally don't account file in nr_files Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 04/35] vfs: export vfs_ioctl() to modules Miklos Szeredi
                   ` (31 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This is needed by overlayfs to be able to copy up a file from a read-only
lower layer to a writable layer when being mapped shared.  When copying up,
overlayfs takes VFS locks that would violate locking order when nested
inside mmap_sem.

Add a new f_op->pre_mmap method, which is called before taking mmap_sem.
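
The ordering constraint can be sketched as a userspace model.  The
model_* names are hypothetical and a plain flag stands in for mmap_sem;
the point is only that the hook runs, and may fail, before the lock is
taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_file;

struct model_file_operations {
	/* Optional hook; may take other locks because mmap_sem is not held. */
	int (*pre_mmap)(struct model_file *file, unsigned long prot,
			unsigned long flag);
};

struct model_file {
	const struct model_file_operations *f_op;
};

bool model_mmap_sem_held;	/* stands in for mm->mmap_sem */

/* Model of the call order in vm_mmap_pgoff(): hook first, then the lock. */
int model_vm_mmap(struct model_file *file, unsigned long prot,
		  unsigned long flag)
{
	int ret;

	if (file && file->f_op->pre_mmap) {
		ret = file->f_op->pre_mmap(file, prot, flag);
		if (ret)
			return ret;	/* fail before the lock is ever taken */
	}

	model_mmap_sem_held = true;	/* down_write_killable(&mm->mmap_sem) */
	/* ... the mapping itself would be set up here ... */
	model_mmap_sem_held = false;	/* up_write(&mm->mmap_sem) */
	return 0;
}

/* Example hook: a copy-up would go here; it must not run under mmap_sem. */
int model_ovl_pre_mmap(struct model_file *file, unsigned long prot,
		       unsigned long flag)
{
	(void)file;
	(void)prot;
	(void)flag;
	assert(!model_mmap_sem_held);
	return 0;
}
```

A filesystem that does not set the hook is unaffected, since the call is
skipped when the pointer is NULL.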

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 Documentation/filesystems/Locking | 1 +
 Documentation/filesystems/vfs.txt | 3 +++
 include/linux/fs.h                | 1 +
 mm/util.c                         | 5 +++++
 4 files changed, 10 insertions(+)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 75d2d57e2c44..60e76060baff 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -442,6 +442,7 @@ prototypes:
 	unsigned int (*poll) (struct file *, struct poll_table_struct *);
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
+	int (*pre_mmap) (struct file *, unsigned long, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *);
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 5fd325df59e2..2bc77ea8aef4 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -859,6 +859,7 @@ struct file_operations {
 	unsigned int (*poll) (struct file *, struct poll_table_struct *);
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
+	int (*pre_mmap) (struct file *, unsigned long, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	int (*mremap)(struct file *, struct vm_area_struct *);
 	int (*open) (struct inode *, struct file *);
@@ -906,6 +907,8 @@ otherwise noted.
   compat_ioctl: called by the ioctl(2) system call when 32 bit system calls
  	 are used on 64 bit kernels.
 
+  pre_mmap: called before mmap, without mmap_sem being held yet.
+
   mmap: called by the mmap(2) system call
 
   open: called by the VFS when an inode should be opened. When the VFS
diff --git a/include/linux/fs.h b/include/linux/fs.h
index af49b55ff439..898fb798a3ff 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1716,6 +1716,7 @@ struct file_operations {
 	__poll_t (*poll) (struct file *, struct poll_table_struct *);
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
+	int (*pre_mmap) (struct file *, unsigned long, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
 	unsigned long mmap_supported_flags;
 	int (*open) (struct inode *, struct file *);
diff --git a/mm/util.c b/mm/util.c
index 45fc3169e7b0..11cd375e1a19 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -352,6 +352,11 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 
 	ret = security_mmap_file(file, prot, flag);
 	if (!ret) {
+		if (file && file->f_op->pre_mmap) {
+			ret = file->f_op->pre_mmap(file, prot, flag);
+			if (ret)
+				return ret;
+		}
 		if (down_write_killable(&mm->mmap_sem))
 			return -EINTR;
 		ret = do_mmap_pgoff(file, addr, len, prot, flag, pgoff,
-- 
2.14.3


* [PATCH v2 04/35] vfs: export vfs_ioctl() to modules
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (2 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 03/35] vfs: add f_op->pre_mmap() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 05/35] vfs: export vfs_dedupe_file_range_one() " Miklos Szeredi
                   ` (30 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This is needed by the stacked ioctl implementation in overlayfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/internal.h      | 1 -
 fs/ioctl.c         | 1 +
 include/linux/fs.h | 2 ++
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/internal.h b/fs/internal.h
index b82725ba3054..6821cf475fc6 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -189,7 +189,6 @@ extern const struct dentry_operations ns_dentry_operations;
  */
 extern int do_vfs_ioctl(struct file *file, unsigned int fd, unsigned int cmd,
 		    unsigned long arg);
-extern long vfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
 
 /*
  * iomap support:
diff --git a/fs/ioctl.c b/fs/ioctl.c
index 4823431d1c9d..41071915f411 100644
--- a/fs/ioctl.c
+++ b/fs/ioctl.c
@@ -49,6 +49,7 @@ long vfs_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
  out:
 	return error;
 }
+EXPORT_SYMBOL(vfs_ioctl);
 
 static int ioctl_fibmap(struct file *filp, int __user *p)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 898fb798a3ff..26685011c4bd 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1623,6 +1623,8 @@ int vfs_mkobj(struct dentry *, umode_t,
 		int (*f)(struct dentry *, umode_t, void *),
 		void *);
 
+extern long vfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg);
+
 /*
  * VFS file helper functions.
  */
-- 
2.14.3


* [PATCH v2 05/35] vfs: export vfs_dedupe_file_range_one() to modules
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (3 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 04/35] vfs: export vfs_ioctl() to modules Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 06/35] ovl: copy up times Miklos Szeredi
                   ` (29 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This is needed by the stacked dedupe implementation in overlayfs.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/read_write.c    | 6 +++---
 include/linux/fs.h | 4 ++++
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/fs/read_write.c b/fs/read_write.c
index 023df230e2a0..08708f903fc5 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1964,9 +1964,8 @@ int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
 }
 EXPORT_SYMBOL(vfs_dedupe_file_range_compare);
 
-static s64 vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
-				     struct file *dst_file, loff_t dst_pos,
-				     u64 len)
+s64 vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
+			      struct file *dst_file, loff_t dst_pos, u64 len)
 {
 	s64 ret;
 
@@ -2001,6 +2000,7 @@ static s64 vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
 
 	return ret;
 }
+EXPORT_SYMBOL(vfs_dedupe_file_range_one);
 
 int vfs_dedupe_file_range(struct file *file, struct file_dedupe_range *same)
 {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 26685011c4bd..c85a8059f038 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1817,6 +1817,10 @@ extern int vfs_dedupe_file_range_compare(struct inode *src, loff_t srcoff,
 					 loff_t len, bool *is_same);
 extern int vfs_dedupe_file_range(struct file *file,
 				 struct file_dedupe_range *same);
+extern s64 vfs_dedupe_file_range_one(struct file *src_file, loff_t src_pos,
+				     struct file *dst_file, loff_t dst_pos,
+				     u64 len);
+
 
 struct super_operations {
    	struct inode *(*alloc_inode)(struct super_block *sb);
-- 
2.14.3


* [PATCH v2 06/35] ovl: copy up times
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (4 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 05/35] vfs: export vfs_dedupe_file_range_one() " Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 07/35] ovl: copy up inode flags Miklos Szeredi
                   ` (28 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Copy up mtime and ctime to overlay inode after times in real object are
modified.  Be careful not to dirty cachelines when not necessary.

This is in preparation for moving overlay functionality out of the VFS.

This patch shouldn't have any observable effect.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/dir.c       | 33 +++++++++++++++++++++++++--------
 fs/overlayfs/inode.c     |  3 +++
 fs/overlayfs/overlayfs.h |  2 +-
 fs/overlayfs/util.c      | 10 +++++++++-
 4 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 839709c7803a..47dc980e8b33 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -186,11 +186,11 @@ static int ovl_set_opaque(struct dentry *dentry, struct dentry *upperdentry)
 static void ovl_instantiate(struct dentry *dentry, struct inode *inode,
 			    struct dentry *newdentry, bool hardlink)
 {
-	ovl_dentry_version_inc(dentry->d_parent, false);
+	ovl_dir_modified(dentry->d_parent, false);
+	ovl_copyattr(d_inode(newdentry), inode);
 	ovl_dentry_set_upper_alias(dentry);
 	if (!hardlink) {
 		ovl_inode_update(inode, newdentry);
-		ovl_copyattr(newdentry->d_inode, inode);
 	} else {
 		WARN_ON(ovl_inode_real(inode) != d_inode(newdentry));
 		dput(newdentry);
@@ -658,7 +658,7 @@ static int ovl_remove_and_whiteout(struct dentry *dentry,
 	if (err)
 		goto out_d_drop;
 
-	ovl_dentry_version_inc(dentry->d_parent, true);
+	ovl_dir_modified(dentry->d_parent, true);
 out_d_drop:
 	d_drop(dentry);
 out_dput_upper:
@@ -703,7 +703,7 @@ static int ovl_remove_upper(struct dentry *dentry, bool is_dir,
 		err = vfs_rmdir(dir, upper);
 	else
 		err = vfs_unlink(dir, upper, NULL);
-	ovl_dentry_version_inc(dentry->d_parent, ovl_type_origin(dentry));
+	ovl_dir_modified(dentry->d_parent, ovl_type_origin(dentry));
 
 	/*
 	 * Keeping this dentry hashed would mean having to release
@@ -733,6 +733,7 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 	int err;
 	bool locked = false;
 	const struct cred *old_cred;
+	struct dentry *upperdentry;
 	bool lower_positive = ovl_lower_positive(dentry);
 	LIST_HEAD(list);
 
@@ -768,6 +769,17 @@ static int ovl_do_remove(struct dentry *dentry, bool is_dir)
 			drop_nlink(dentry->d_inode);
 	}
 	ovl_nlink_end(dentry, locked);
+
+	/*
+	 * Copy ctime
+	 *
+	 * Note: we fail to update ctime if there was no copy-up, only a
+	 * whiteout
+	 */
+	upperdentry = ovl_dentry_upper(dentry);
+	if (upperdentry)
+		ovl_copyattr(d_inode(upperdentry), d_inode(dentry));
+
 out_drop_write:
 	ovl_drop_write(dentry);
 out:
@@ -1074,10 +1086,15 @@ static int ovl_rename(struct inode *olddir, struct dentry *old,
 			drop_nlink(d_inode(new));
 	}
 
-	ovl_dentry_version_inc(old->d_parent, ovl_type_origin(old) ||
-			       (!overwrite && ovl_type_origin(new)));
-	ovl_dentry_version_inc(new->d_parent, ovl_type_origin(old) ||
-			       (d_inode(new) && ovl_type_origin(new)));
+	ovl_dir_modified(old->d_parent, ovl_type_origin(old) ||
+			 (!overwrite && ovl_type_origin(new)));
+	ovl_dir_modified(new->d_parent, ovl_type_origin(old) ||
+			 (d_inode(new) && ovl_type_origin(new)));
+
+	/* copy ctime: */
+	ovl_copyattr(d_inode(olddentry), d_inode(old));
+	if (d_inode(new) && ovl_dentry_upper(new))
+		ovl_copyattr(d_inode(newdentry), d_inode(new));
 
 out_dput:
 	dput(newdentry);
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index 6e3815fb006b..d3700f0de165 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -304,6 +304,9 @@ int ovl_xattr_set(struct dentry *dentry, struct inode *inode, const char *name,
 	}
 	revert_creds(old_cred);
 
+	/* copy c/mtime */
+	ovl_copyattr(d_inode(realdentry), inode);
+
 out_drop_write:
 	ovl_drop_write(dentry);
 out:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index e0b7de799f6b..271561fa7882 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -237,7 +237,7 @@ void ovl_dentry_set_redirect(struct dentry *dentry, const char *redirect);
 void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
 		    struct dentry *lowerdentry);
 void ovl_inode_update(struct inode *inode, struct dentry *upperdentry);
-void ovl_dentry_version_inc(struct dentry *dentry, bool impurity);
+void ovl_dir_modified(struct dentry *dentry, bool impurity);
 u64 ovl_dentry_version_get(struct dentry *dentry);
 bool ovl_is_whiteout(struct dentry *dentry);
 struct file *ovl_path_open(struct path *path, int flags);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 6f1078028c66..30a05d1d679d 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -333,7 +333,7 @@ void ovl_inode_update(struct inode *inode, struct dentry *upperdentry)
 	}
 }
 
-void ovl_dentry_version_inc(struct dentry *dentry, bool impurity)
+static void ovl_dentry_version_inc(struct dentry *dentry, bool impurity)
 {
 	struct inode *inode = d_inode(dentry);
 
@@ -348,6 +348,14 @@ void ovl_dentry_version_inc(struct dentry *dentry, bool impurity)
 		OVL_I(inode)->version++;
 }
 
+void ovl_dir_modified(struct dentry *dentry, bool impurity)
+{
+	/* Copy mtime/ctime */
+	ovl_copyattr(d_inode(ovl_dentry_upper(dentry)), d_inode(dentry));
+
+	ovl_dentry_version_inc(dentry, impurity);
+}
+
 u64 ovl_dentry_version_get(struct dentry *dentry)
 {
 	struct inode *inode = d_inode(dentry);
-- 
2.14.3


* [PATCH v2 07/35] ovl: copy up inode flags
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (5 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 06/35] ovl: copy up times Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 08/35] Revert "Revert "ovl: get_write_access() in truncate"" Miklos Szeredi
                   ` (27 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

On inode creation copy certain inode flags from the underlying real inode
to the overlay inode.

This is in preparation for moving overlay functionality out of the VFS.
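
The masked copy of inode flags can be shown on plain integers.  This is
a sketch under assumptions: the M_S_* bit values are illustrative rather
than the kernel's S_* definitions, and model_set_flags() only mimics the
replace-bits-under-mask contract of inode_set_flags().

```c
#include <assert.h>

/* Illustrative flag bits; the kernel defines the real S_* values. */
#define M_S_SYNC	0x01u
#define M_S_NOATIME	0x02u
#define M_S_APPEND	0x04u
#define M_S_IMMUTABLE	0x08u
#define M_S_OTHER	0x10u	/* some flag outside the copied set */

/* Replace only the bits selected by `mask`; leave all others untouched. */
unsigned int model_set_flags(unsigned int old, unsigned int flags,
			     unsigned int mask)
{
	assert((flags & ~mask) == 0);
	return (old & ~mask) | flags;
}

/* Returns the new flags of `to` after copying the masked bits of `from`. */
unsigned int model_copyflags(unsigned int from, unsigned int to)
{
	unsigned int mask = M_S_SYNC | M_S_IMMUTABLE | M_S_APPEND | M_S_NOATIME;

	return model_set_flags(to, from & mask, mask);
}
```

Flags outside the mask are never taken from the real inode, and flags
inside the mask are cleared on the overlay inode when the real inode does
not have them.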

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/overlayfs.h | 7 +++++++
 fs/overlayfs/util.c      | 1 +
 2 files changed, 8 insertions(+)

diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 271561fa7882..4e26778774c3 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -350,6 +350,13 @@ static inline void ovl_copyattr(struct inode *from, struct inode *to)
 	to->i_ctime = from->i_ctime;
 }
 
+static inline void ovl_copyflags(struct inode *from, struct inode *to)
+{
+	unsigned int mask = S_SYNC | S_IMMUTABLE | S_APPEND | S_NOATIME;
+
+	inode_set_flags(to, from->i_flags & mask, mask);
+}
+
 /* dir.c */
 extern const struct inode_operations ovl_dir_inode_operations;
 struct dentry *ovl_lookup_temp(struct dentry *workdir);
diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
index 30a05d1d679d..25d202b47326 100644
--- a/fs/overlayfs/util.c
+++ b/fs/overlayfs/util.c
@@ -310,6 +310,7 @@ void ovl_inode_init(struct inode *inode, struct dentry *upperdentry,
 		OVL_I(inode)->lower = igrab(d_inode(lowerdentry));
 
 	ovl_copyattr(realinode, inode);
+	ovl_copyflags(realinode, inode);
 	if (!inode->i_ino)
 		inode->i_ino = realinode->i_ino;
 }
-- 
2.14.3


* [PATCH v2 08/35] Revert "Revert "ovl: get_write_access() in truncate""
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (6 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 07/35] ovl: copy up inode flags Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 09/35] ovl: copy up file size as well Miklos Szeredi
                   ` (26 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

This reverts commit 31c3a7069593b072bd57192b63b62f9a7e994e9a.

Re-add functionality dealing with i_writecount on truncate to overlayfs.
This patch shouldn't have any observable effects, since we just re-assert
the writecount that vfs_truncate() already got for us.

This is in preparation for moving overlay functionality out of the VFS.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index d3700f0de165..f7b1910bb9d4 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -39,10 +39,27 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 	if (err)
 		goto out;
 
+	if (attr->ia_valid & ATTR_SIZE) {
+		struct inode *realinode = d_inode(ovl_dentry_real(dentry));
+
+		err = -ETXTBSY;
+		if (atomic_read(&realinode->i_writecount) < 0)
+			goto out_drop_write;
+	}
+
 	err = ovl_copy_up(dentry);
 	if (!err) {
+		struct inode *winode = NULL;
+
 		upperdentry = ovl_dentry_upper(dentry);
 
+		if (attr->ia_valid & ATTR_SIZE) {
+			winode = d_inode(upperdentry);
+			err = get_write_access(winode);
+			if (err)
+				goto out_drop_write;
+		}
+
 		if (attr->ia_valid & (ATTR_KILL_SUID|ATTR_KILL_SGID))
 			attr->ia_valid &= ~ATTR_MODE;
 
@@ -53,7 +70,11 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 		if (!err)
 			ovl_copyattr(upperdentry->d_inode, dentry->d_inode);
 		inode_unlock(upperdentry->d_inode);
+
+		if (winode)
+			put_write_access(winode);
 	}
+out_drop_write:
 	ovl_drop_write(dentry);
 out:
 	return err;
-- 
2.14.3


* [PATCH v2 09/35] ovl: copy up file size as well
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (7 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 08/35] Revert "Revert "ovl: get_write_access() in truncate"" Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 10/35] ovl: deal with overlay files in ovl_d_real() Miklos Szeredi
                   ` (25 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Copy i_size of the underlying inode to the overlay inode in ovl_copyattr().

This is in preparation for stacking I/O operations on overlay files.

This patch shouldn't have any observable effect.

Remove stale comment from ovl_setattr() [spotted by Vivek Goyal].

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c     | 9 ---------
 fs/overlayfs/overlayfs.h | 2 ++
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index f7b1910bb9d4..ba3f832cc39a 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -22,15 +22,6 @@ int ovl_setattr(struct dentry *dentry, struct iattr *attr)
 	struct dentry *upperdentry;
 	const struct cred *old_cred;
 
-	/*
-	 * Check for permissions before trying to copy-up.  This is redundant
-	 * since it will be rechecked later by ->setattr() on upper dentry.  But
-	 * without this, copy-up can be triggered by just about anybody.
-	 *
-	 * We don't initialize inode->size, which just means that
-	 * inode_newsize_ok() will always check against MAX_LFS_FILESIZE and not
-	 * check for a swapfile (which this won't be anyway).
-	 */
 	err = setattr_prepare(dentry, attr);
 	if (err)
 		return err;
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 4e26778774c3..b9f7c632ab9c 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -9,6 +9,7 @@
 
 #include <linux/kernel.h>
 #include <linux/uuid.h>
+#include <linux/fs.h>
 #include "ovl_entry.h"
 
 enum ovl_path_type {
@@ -348,6 +349,7 @@ static inline void ovl_copyattr(struct inode *from, struct inode *to)
 	to->i_atime = from->i_atime;
 	to->i_mtime = from->i_mtime;
 	to->i_ctime = from->i_ctime;
+	i_size_write(to, i_size_read(from));
 }
 
 static inline void ovl_copyflags(struct inode *from, struct inode *to)
-- 
2.14.3


* [PATCH v2 10/35] ovl: deal with overlay files in ovl_d_real()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (8 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 09/35] ovl: copy up file size as well Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07 13:17   ` Vivek Goyal
  2018-05-07  8:37 ` [PATCH v2 11/35] ovl: stack file ops Miklos Szeredi
                   ` (24 subsequent siblings)
  34 siblings, 1 reply; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/super.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index e8551c97de51..ad6a5baf226b 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -97,6 +97,10 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 	struct dentry *real;
 	int err;
 
+	/* It's an overlay file */
+	if (inode && d_inode(dentry) == inode)
+		return dentry;
+
 	if (flags & D_REAL_UPPER)
 		return ovl_dentry_upper(dentry);
 
-- 
2.14.3


* [PATCH v2 11/35] ovl: stack file ops
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (9 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 10/35] ovl: deal with overlay files in ovl_d_real() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 12/35] ovl: add helper to return real file Miklos Szeredi
                   ` (23 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Implement file operations on a regular overlay file.  The underlying file
is opened separately and cached in ->private_data.

It might be worth making an exception for such files when accounting in
nr_file to conform to userspace expectations.  We are only adding a small
overhead (248 bytes for the struct file) since the real inode and dentry are
pinned by overlayfs anyway.

This patch doesn't have any effect, since the vfs will use d_real() to find
the real underlying file to open.  The patch at the end of the series will
actually enable this functionality.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/Makefile    |  4 +--
 fs/overlayfs/file.c      | 76 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/overlayfs/inode.c     |  1 +
 fs/overlayfs/overlayfs.h |  3 ++
 4 files changed, 82 insertions(+), 2 deletions(-)
 create mode 100644 fs/overlayfs/file.c

diff --git a/fs/overlayfs/Makefile b/fs/overlayfs/Makefile
index 30802347a020..46e1ff8ac056 100644
--- a/fs/overlayfs/Makefile
+++ b/fs/overlayfs/Makefile
@@ -4,5 +4,5 @@
 
 obj-$(CONFIG_OVERLAY_FS) += overlay.o
 
-overlay-objs := super.o namei.o util.o inode.o dir.o readdir.o copy_up.o \
-		export.o
+overlay-objs := super.o namei.o util.o inode.o file.o dir.o readdir.o \
+		copy_up.o export.o
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
new file mode 100644
index 000000000000..a0b606885c41
--- /dev/null
+++ b/fs/overlayfs/file.c
@@ -0,0 +1,76 @@
+/*
+ * Copyright (C) 2017 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published by
+ * the Free Software Foundation.
+ */
+
+#include <linux/cred.h>
+#include <linux/file.h>
+#include <linux/xattr.h>
+#include "overlayfs.h"
+
+static struct file *ovl_open_realfile(const struct file *file)
+{
+	struct inode *inode = file_inode(file);
+	struct inode *upperinode = ovl_inode_upper(inode);
+	struct inode *realinode = upperinode ?: ovl_inode_lower(inode);
+	struct file *realfile;
+	const struct cred *old_cred;
+
+	old_cred = ovl_override_creds(inode->i_sb);
+	realfile = path_open(&file->f_path, file->f_flags | O_NOATIME,
+			     realinode, current_cred(), false);
+	revert_creds(old_cred);
+
+	pr_debug("open(%p[%pD2/%c], 0%o) -> (%p, 0%o)\n",
+		 file, file, upperinode ? 'u' : 'l', file->f_flags,
+		 realfile, IS_ERR(realfile) ? 0 : realfile->f_flags);
+
+	return realfile;
+}
+
+static int ovl_open(struct inode *inode, struct file *file)
+{
+	struct dentry *dentry = file_dentry(file);
+	struct file *realfile;
+	int err;
+
+	err = ovl_open_maybe_copy_up(dentry, file->f_flags);
+	if (err)
+		return err;
+
+	/* No longer need these flags, so don't pass them on to underlying fs */
+	file->f_flags &= ~(O_CREAT | O_EXCL | O_NOCTTY | O_TRUNC);
+
+	realfile = ovl_open_realfile(file);
+	if (IS_ERR(realfile))
+		return PTR_ERR(realfile);
+
+	file->private_data = realfile;
+
+	return 0;
+}
+
+static int ovl_release(struct inode *inode, struct file *file)
+{
+	fput(file->private_data);
+
+	return 0;
+}
+
+static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
+{
+	struct inode *realinode = ovl_inode_real(file_inode(file));
+
+	return generic_file_llseek_size(file, offset, whence,
+					realinode->i_sb->s_maxbytes,
+					i_size_read(realinode));
+}
+
+const struct file_operations ovl_file_operations = {
+	.open		= ovl_open,
+	.release	= ovl_release,
+	.llseek		= ovl_llseek,
+};
diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index ba3f832cc39a..a228eda66ad8 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -535,6 +535,7 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
 	switch (mode & S_IFMT) {
 	case S_IFREG:
 		inode->i_op = &ovl_file_inode_operations;
+		inode->i_fop = &ovl_file_operations;
 		break;
 
 	case S_IFDIR:
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index b9f7c632ab9c..edbf69c8f45d 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -374,6 +374,9 @@ int ovl_create_real(struct inode *dir, struct dentry *newdentry,
 		    struct dentry *hardlink, bool debug);
 int ovl_cleanup(struct inode *dir, struct dentry *dentry);
 
+/* file.c */
+extern const struct file_operations ovl_file_operations;
+
 /* copy_up.c */
 int ovl_copy_up(struct dentry *dentry);
 int ovl_copy_up_flags(struct dentry *dentry, int flags);
-- 
2.14.3


* [PATCH v2 12/35] ovl: add helper to return real file
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (10 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 11/35] ovl: stack file ops Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 13/35] ovl: add ovl_read_iter() Miklos Szeredi
                   ` (22 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

In the common case we can just use the real file cached in
file->private_data.  There are two exceptions:

1) File has been copied up since open: in this unlikely corner case just
use a throwaway real file for the operation.  If this ever becomes a
performance problem (very unlikely, since overlayfs has been doing mostly
fine without correctly handling this case at all), then we can deal with
that by updating the cached real file.

2) File's f_flags have changed since open: no need to reopen the cached
real file, we can just change the flags there as well.
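
The flag handling in case 2 can be modelled in userspace roughly as
follows.  This is a simplified sketch, not the code from this patch:
sketch_sync_flags() and SKETCH_SETFL_MASK are hypothetical names, and the
IS_APPEND(), direct_IO and ->check_flags() checks as well as the f_lock
locking are left out.

```c
#include <errno.h>
#include <fcntl.h>

/*
 * O_DIRECT and O_NOATIME are Linux extensions that glibc only exposes
 * with _GNU_SOURCE; fall back to 0 so the sketch still builds without it.
 */
#ifndef O_DIRECT
#define O_DIRECT 0
#endif
#ifndef O_NOATIME
#define O_NOATIME 0
#endif

/* Flags that may change after open, mirroring OVL_SETFL_MASK. */
#define SKETCH_SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT)

/*
 * Bring *real_flags in line with the flags wanted by the overlay file.
 * Returns 0 on success, -EIO if a flag outside the mask differs.
 */
static int sketch_sync_flags(unsigned int *real_flags, unsigned int wanted)
{
	/* The real file is always opened with O_NOATIME. */
	wanted |= O_NOATIME;

	/* A difference outside the settable mask means something is amiss. */
	if ((*real_flags ^ wanted) & ~SKETCH_SETFL_MASK)
		return -EIO;

	/* Replace only the settable bits, keep everything else. */
	*real_flags = (*real_flags & ~SKETCH_SETFL_MASK) |
		      (wanted & SKETCH_SETFL_MASK);

	return 0;
}
```

The point of the mask is that only the bits fcntl(F_SETFL) is allowed to
touch are ever copied to the real file; the access mode and the other open
flags stay as they were at open time.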

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 60 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 60 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index a0b606885c41..db8778e7c37a 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -31,6 +31,66 @@ static struct file *ovl_open_realfile(const struct file *file)
 	return realfile;
 }
 
+#define OVL_SETFL_MASK (O_APPEND | O_NONBLOCK | O_NDELAY | O_DIRECT)
+
+static int ovl_change_flags(struct file *file, unsigned int flags)
+{
+	struct inode *inode = file_inode(file);
+	int err;
+
+	/* No atime modification on underlying */
+	flags |= O_NOATIME;
+
+	/* If some flag changed that cannot be changed then something's amiss */
+	if (WARN_ON((file->f_flags ^ flags) & ~OVL_SETFL_MASK))
+		return -EIO;
+
+	flags &= OVL_SETFL_MASK;
+
+	if (((flags ^ file->f_flags) & O_APPEND) && IS_APPEND(inode))
+		return -EPERM;
+
+	if (flags & O_DIRECT) {
+		if (!file->f_mapping->a_ops ||
+		    !file->f_mapping->a_ops->direct_IO)
+			return -EINVAL;
+	}
+
+	if (file->f_op->check_flags) {
+		err = file->f_op->check_flags(flags);
+		if (err)
+			return err;
+	}
+
+	spin_lock(&file->f_lock);
+	file->f_flags = (file->f_flags & ~OVL_SETFL_MASK) | flags;
+	spin_unlock(&file->f_lock);
+
+	return 0;
+}
+
+static int ovl_real_fdget(const struct file *file, struct fd *real)
+{
+	struct inode *inode = file_inode(file);
+
+	real->flags = 0;
+	real->file = file->private_data;
+
+	/* Has it been copied up since we'd opened it? */
+	if (unlikely(file_inode(real->file) != ovl_inode_real(inode))) {
+		real->flags = FDPUT_FPUT;
+		real->file = ovl_open_realfile(file);
+
+		return PTR_ERR_OR_ZERO(real->file);
+	}
+
+	/* Did the flags change since open? */
+	if (unlikely((file->f_flags ^ real->file->f_flags) & ~O_NOATIME))
+		return ovl_change_flags(real->file, file->f_flags);
+
+	return 0;
+}
+
 static int ovl_open(struct inode *inode, struct file *file)
 {
 	struct dentry *dentry = file_dentry(file);
-- 
2.14.3


* [PATCH v2 13/35] ovl: add ovl_read_iter()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (11 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 12/35] ovl: add helper to return real file Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 14/35] ovl: add ovl_write_iter() Miklos Szeredi
                   ` (21 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Implement stacked reading.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 67 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 67 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index db8778e7c37a..bbc40a14acf8 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -9,6 +9,7 @@
 #include <linux/cred.h>
 #include <linux/file.h>
 #include <linux/xattr.h>
+#include <linux/uio.h>
 #include "overlayfs.h"
 
 static struct file *ovl_open_realfile(const struct file *file)
@@ -129,8 +130,74 @@ static loff_t ovl_llseek(struct file *file, loff_t offset, int whence)
 					i_size_read(realinode));
 }
 
+static void ovl_file_accessed(struct file *file)
+{
+	struct inode *inode, *upperinode;
+
+	if (file->f_flags & O_NOATIME)
+		return;
+
+	inode = file_inode(file);
+	upperinode = ovl_inode_upper(inode);
+
+	if (!upperinode)
+		return;
+
+	if ((!timespec_equal(&inode->i_mtime, &upperinode->i_mtime) ||
+	     !timespec_equal(&inode->i_ctime, &upperinode->i_ctime))) {
+		inode->i_mtime = upperinode->i_mtime;
+		inode->i_ctime = upperinode->i_ctime;
+	}
+
+	touch_atime(&file->f_path);
+}
+
+static rwf_t ovl_iocb_to_rwf(struct kiocb *iocb)
+{
+	int ifl = iocb->ki_flags;
+	rwf_t flags = 0;
+
+	if (ifl & IOCB_NOWAIT)
+		flags |= RWF_NOWAIT;
+	if (ifl & IOCB_HIPRI)
+		flags |= RWF_HIPRI;
+	if (ifl & IOCB_DSYNC)
+		flags |= RWF_DSYNC;
+	if (ifl & IOCB_SYNC)
+		flags |= RWF_SYNC;
+
+	return flags;
+}
+
+static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct file *file = iocb->ki_filp;
+	struct fd real;
+	const struct cred *old_cred;
+	ssize_t ret;
+
+	if (!iov_iter_count(iter))
+		return 0;
+
+	ret = ovl_real_fdget(file, &real);
+	if (ret)
+		return ret;
+
+	old_cred = ovl_override_creds(file_inode(file)->i_sb);
+	ret = vfs_iter_read(real.file, iter, &iocb->ki_pos,
+			    ovl_iocb_to_rwf(iocb));
+	revert_creds(old_cred);
+
+	ovl_file_accessed(file);
+
+	fdput(real);
+
+	return ret;
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
 	.llseek		= ovl_llseek,
+	.read_iter	= ovl_read_iter,
 };
-- 
2.14.3


* [PATCH v2 14/35] ovl: add ovl_write_iter()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (12 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 13/35] ovl: add ovl_read_iter() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 15/35] ovl: add ovl_fsync() Miklos Szeredi
                   ` (20 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Implement stacked writes.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 39 +++++++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index bbc40a14acf8..a7af56861aa5 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -195,9 +195,48 @@ static ssize_t ovl_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 	return ret;
 }
 
+static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
+{
+	struct file *file = iocb->ki_filp;
+	struct inode *inode = file_inode(file);
+	struct fd real;
+	const struct cred *old_cred;
+	ssize_t ret;
+
+	if (!iov_iter_count(iter))
+		return 0;
+
+	inode_lock(inode);
+	/* Update mode */
+	ovl_copyattr(ovl_inode_real(inode), inode);
+	ret = file_remove_privs(file);
+	if (ret)
+		goto out_unlock;
+
+	ret = ovl_real_fdget(file, &real);
+	if (ret)
+		goto out_unlock;
+
+	old_cred = ovl_override_creds(file_inode(file)->i_sb);
+	ret = vfs_iter_write(real.file, iter, &iocb->ki_pos,
+			     ovl_iocb_to_rwf(iocb));
+	revert_creds(old_cred);
+
+	/* Update size */
+	ovl_copyattr(ovl_inode_real(inode), inode);
+
+	fdput(real);
+
+out_unlock:
+	inode_unlock(inode);
+
+	return ret;
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
 	.llseek		= ovl_llseek,
 	.read_iter	= ovl_read_iter,
+	.write_iter	= ovl_write_iter,
 };
-- 
2.14.3


* [PATCH v2 15/35] ovl: add ovl_fsync()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (13 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 14/35] ovl: add ovl_write_iter() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-08  5:14   ` Amir Goldstein
  2018-05-07  8:37 ` [PATCH v2 16/35] ovl: add ovl_mmap() Miklos Szeredi
                   ` (19 subsequent siblings)
  34 siblings, 1 reply; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Implement stacked fsync().

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index a7af56861aa5..419aa3f9967b 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -233,10 +233,30 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
 	return ret;
 }
 
+static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
+{
+	struct fd real;
+	const struct cred *old_cred;
+	int ret;
+
+	ret = ovl_real_fdget(file, &real);
+	if (ret)
+		return ret;
+
+	old_cred = ovl_override_creds(file_inode(file)->i_sb);
+	ret = vfs_fsync_range(real.file, start, end, datasync);
+	revert_creds(old_cred);
+
+	fdput(real);
+
+	return ret;
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
 	.llseek		= ovl_llseek,
 	.read_iter	= ovl_read_iter,
 	.write_iter	= ovl_write_iter,
+	.fsync		= ovl_fsync,
 };
-- 
2.14.3


* [PATCH v2 16/35] ovl: add ovl_mmap()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (14 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 15/35] ovl: add ovl_fsync() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 17/35] ovl: add ovl_fallocate() Miklos Szeredi
                   ` (18 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Implement stacked mmap.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 419aa3f9967b..b75ee0a3655e 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -252,6 +252,33 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	return ret;
 }
 
+static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct fd real;
+	const struct cred *old_cred;
+	int ret;
+
+	ret = ovl_real_fdget(file, &real);
+	if (ret)
+		return ret;
+
+	/* transfer ref: */
+	fput(vma->vm_file);
+	vma->vm_file = get_file(real.file);
+	fdput(real);
+
+	if (!vma->vm_file->f_op->mmap)
+		return -ENODEV;
+
+	old_cred = ovl_override_creds(file_inode(file)->i_sb);
+	ret = call_mmap(vma->vm_file, vma);
+	revert_creds(old_cred);
+
+	ovl_file_accessed(file);
+
+	return ret;
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
@@ -259,4 +286,5 @@ const struct file_operations ovl_file_operations = {
 	.read_iter	= ovl_read_iter,
 	.write_iter	= ovl_write_iter,
 	.fsync		= ovl_fsync,
+	.mmap		= ovl_mmap,
 };
-- 
2.14.3


* [PATCH v2 17/35] ovl: add ovl_fallocate()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (15 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 16/35] ovl: add ovl_mmap() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 18/35] ovl: add lsattr/chattr support Miklos Szeredi
                   ` (17 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Implement stacked fallocate.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index b75ee0a3655e..3a85bbfa266a 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -279,6 +279,29 @@ static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
 	return ret;
 }
 
+static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
+{
+	struct inode *inode = file_inode(file);
+	struct fd real;
+	const struct cred *old_cred;
+	int ret;
+
+	ret = ovl_real_fdget(file, &real);
+	if (ret)
+		return ret;
+
+	old_cred = ovl_override_creds(file_inode(file)->i_sb);
+	ret = vfs_fallocate(real.file, mode, offset, len);
+	revert_creds(old_cred);
+
+	/* Update size */
+	ovl_copyattr(ovl_inode_real(inode), inode);
+
+	fdput(real);
+
+	return ret;
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
@@ -287,4 +310,5 @@ const struct file_operations ovl_file_operations = {
 	.write_iter	= ovl_write_iter,
 	.fsync		= ovl_fsync,
 	.mmap		= ovl_mmap,
+	.fallocate	= ovl_fallocate,
 };
-- 
2.14.3


* [PATCH v2 18/35] ovl: add lsattr/chattr support
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (16 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 17/35] ovl: add ovl_fallocate() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 19/35] ovl: add ovl_fiemap() Miklos Szeredi
                   ` (16 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Implement FS_IOC_GETFLAGS and FS_IOC_SETFLAGS.
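
For reference, this is roughly what lsattr does from userspace to reach the
FS_IOC_GETFLAGS path.  A minimal sketch only; sketch_get_inode_flags() is a
hypothetical helper name, and whether the ioctl succeeds depends on the
filesystem backing the file.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Read the inode flags of path, as lsattr does.
 * Returns 0 and stores the flags in *flags, or -1 with errno set.
 */
static int sketch_get_inode_flags(const char *path, int *flags)
{
	int fd = open(path, O_RDONLY);
	int ret, saved_errno;

	if (fd < 0)
		return -1;

	/* The argument is an int pointer, despite the "long" in the macro. */
	ret = ioctl(fd, FS_IOC_GETFLAGS, flags);

	/* Preserve the ioctl's errno across close(). */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return ret;
}
```

Note that the file is opened O_RDONLY here; chattr does the same before
issuing FS_IOC_SETFLAGS, which is why that case has to take write access
to the mount and copy up explicitly.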

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 79 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 3a85bbfa266a..2028f6886649 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -8,6 +8,7 @@
 
 #include <linux/cred.h>
 #include <linux/file.h>
+#include <linux/mount.h>
 #include <linux/xattr.h>
 #include <linux/uio.h>
 #include "overlayfs.h"
@@ -302,6 +303,82 @@ static long ovl_fallocate(struct file *file, int mode, loff_t offset, loff_t len
 	return ret;
 }
 
+static long ovl_real_ioctl(struct file *file, unsigned int cmd,
+			   unsigned long arg)
+{
+	struct fd real;
+	const struct cred *old_cred;
+	long ret;
+
+	ret = ovl_real_fdget(file, &real);
+	if (ret)
+		return ret;
+
+	old_cred = ovl_override_creds(file_inode(file)->i_sb);
+	ret = vfs_ioctl(real.file, cmd, arg);
+	revert_creds(old_cred);
+
+	fdput(real);
+
+	return ret;
+}
+
+static long ovl_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
+{
+	long ret;
+	struct inode *inode = file_inode(file);
+
+	switch (cmd) {
+	case FS_IOC_GETFLAGS:
+		ret = ovl_real_ioctl(file, cmd, arg);
+		break;
+
+	case FS_IOC_SETFLAGS:
+		if (!inode_owner_or_capable(inode))
+			return -EACCES;
+
+		ret = mnt_want_write_file(file);
+		if (ret)
+			return ret;
+
+		ret = ovl_copy_up(file_dentry(file));
+		if (!ret) {
+			ret = ovl_real_ioctl(file, cmd, arg);
+
+			inode_lock(inode);
+			ovl_copyflags(ovl_inode_real(inode), inode);
+			inode_unlock(inode);
+		}
+
+		mnt_drop_write_file(file);
+		break;
+
+	default:
+		ret = -ENOTTY;
+	}
+
+	return ret;
+}
+
+static long ovl_compat_ioctl(struct file *file, unsigned int cmd,
+			     unsigned long arg)
+{
+	switch (cmd) {
+	case FS_IOC32_GETFLAGS:
+		cmd = FS_IOC_GETFLAGS;
+		break;
+
+	case FS_IOC32_SETFLAGS:
+		cmd = FS_IOC_SETFLAGS;
+		break;
+
+	default:
+		return -ENOIOCTLCMD;
+	}
+
+	return ovl_ioctl(file, cmd, arg);
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
@@ -311,4 +388,6 @@ const struct file_operations ovl_file_operations = {
 	.fsync		= ovl_fsync,
 	.mmap		= ovl_mmap,
 	.fallocate	= ovl_fallocate,
+	.unlocked_ioctl	= ovl_ioctl,
+	.compat_ioctl	= ovl_compat_ioctl,
 };
-- 
2.14.3


* [PATCH v2 19/35] ovl: add ovl_fiemap()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (17 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 18/35] ovl: add lsattr/chattr support Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 20/35] ovl: add O_DIRECT support Miklos Szeredi
                   ` (15 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Implement stacked fiemap().

Need to split inode operations for regular files (which have fiemap) and
special files (which don't).

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/inode.c | 29 ++++++++++++++++++++++++++++-
 1 file changed, 28 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
index a228eda66ad8..7abcf96e94fc 100644
--- a/fs/overlayfs/inode.c
+++ b/fs/overlayfs/inode.c
@@ -448,6 +448,23 @@ int ovl_update_time(struct inode *inode, struct timespec *ts, int flags)
 	return 0;
 }
 
+static int ovl_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
+		      u64 start, u64 len)
+{
+	int err;
+	struct inode *realinode = ovl_inode_real(inode);
+	const struct cred *old_cred;
+
+	if (!realinode->i_op->fiemap)
+		return -EOPNOTSUPP;
+
+	old_cred = ovl_override_creds(inode->i_sb);
+	err = realinode->i_op->fiemap(realinode, fieinfo, start, len);
+	revert_creds(old_cred);
+
+	return err;
+}
+
 static const struct inode_operations ovl_file_inode_operations = {
 	.setattr	= ovl_setattr,
 	.permission	= ovl_permission,
@@ -455,6 +472,7 @@ static const struct inode_operations ovl_file_inode_operations = {
 	.listxattr	= ovl_listxattr,
 	.get_acl	= ovl_get_acl,
 	.update_time	= ovl_update_time,
+	.fiemap		= ovl_fiemap,
 };
 
 static const struct inode_operations ovl_symlink_inode_operations = {
@@ -465,6 +483,15 @@ static const struct inode_operations ovl_symlink_inode_operations = {
 	.update_time	= ovl_update_time,
 };
 
+static const struct inode_operations ovl_special_inode_operations = {
+	.setattr	= ovl_setattr,
+	.permission	= ovl_permission,
+	.getattr	= ovl_getattr,
+	.listxattr	= ovl_listxattr,
+	.get_acl	= ovl_get_acl,
+	.update_time	= ovl_update_time,
+};
+
 /*
  * It is possible to stack overlayfs instance on top of another
  * overlayfs instance as lower layer. We need to annonate the
@@ -548,7 +575,7 @@ static void ovl_fill_inode(struct inode *inode, umode_t mode, dev_t rdev,
 		break;
 
 	default:
-		inode->i_op = &ovl_file_inode_operations;
+		inode->i_op = &ovl_special_inode_operations;
 		init_special_inode(inode, mode, rdev);
 		break;
 	}
-- 
2.14.3


* [PATCH v2 20/35] ovl: add O_DIRECT support
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (18 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 19/35] ovl: add ovl_fiemap() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 21/35] ovl: add reflink/copyfile/dedup support Miklos Szeredi
                   ` (14 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 2028f6886649..ce871a15e185 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -110,6 +110,9 @@ static int ovl_open(struct inode *inode, struct file *file)
 	if (IS_ERR(realfile))
 		return PTR_ERR(realfile);
 
+	/* For O_DIRECT dentry_open() checks f_mapping->a_ops->direct_IO */
+	file->f_mapping = realfile->f_mapping;
+
 	file->private_data = realfile;
 
 	return 0;
-- 
2.14.3


* [PATCH v2 21/35] ovl: add reflink/copyfile/dedup support
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (19 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 20/35] ovl: add O_DIRECT support Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07 20:43   ` Darrick J. Wong
  2018-05-07  8:37 ` [PATCH v2 22/35] vfs: don't open real Miklos Szeredi
                   ` (13 subsequent siblings)
  34 siblings, 1 reply; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

Since the argument sets of the three operations are so similar, handle them
in a common helper.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/overlayfs/file.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)

diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index ce871a15e185..2ac95c95e8e6 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -382,6 +382,90 @@ static long ovl_compat_ioctl(struct file *file, unsigned int cmd,
 	return ovl_ioctl(file, cmd, arg);
 }
 
+enum ovl_copyop {
+	OVL_COPY,
+	OVL_CLONE,
+	OVL_DEDUPE,
+};
+
+static s64 ovl_copyfile(struct file *file_in, loff_t pos_in,
+			struct file *file_out, loff_t pos_out,
+			u64 len, unsigned int flags, enum ovl_copyop op)
+{
+	struct inode *inode_out = file_inode(file_out);
+	struct fd real_in, real_out;
+	const struct cred *old_cred;
+	s64 ret;
+
+	ret = ovl_real_fdget(file_out, &real_out);
+	if (ret)
+		return ret;
+
+	ret = ovl_real_fdget(file_in, &real_in);
+	if (ret) {
+		fdput(real_out);
+		return ret;
+	}
+
+	old_cred = ovl_override_creds(file_inode(file_out)->i_sb);
+	switch (op) {
+	case OVL_COPY:
+		ret = vfs_copy_file_range(real_in.file, pos_in,
+					  real_out.file, pos_out, len, flags);
+		break;
+
+	case OVL_CLONE:
+		ret = vfs_clone_file_range(real_in.file, pos_in,
+					   real_out.file, pos_out, len);
+		break;
+
+	case OVL_DEDUPE:
+		ret = vfs_dedupe_file_range_one(real_in.file, pos_in,
+						real_out.file, pos_out, len);
+		break;
+	}
+	revert_creds(old_cred);
+
+	/* Update size */
+	ovl_copyattr(ovl_inode_real(inode_out), inode_out);
+
+	fdput(real_in);
+	fdput(real_out);
+
+	return ret;
+}
+
+static ssize_t ovl_copy_file_range(struct file *file_in, loff_t pos_in,
+				   struct file *file_out, loff_t pos_out,
+				   size_t len, unsigned int flags)
+{
+	return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, flags,
+			    OVL_COPY);
+}
+
+static int ovl_clone_file_range(struct file *file_in, loff_t pos_in,
+				struct file *file_out, loff_t pos_out, u64 len)
+{
+	return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
+			    OVL_CLONE);
+}
+
+static s64 ovl_dedupe_file_range(struct file *file_in, loff_t pos_in,
+				 struct file *file_out, loff_t pos_out,
+				 u64 len)
+{
+	/*
+	 * Don't copy up because of a dedupe request, this wouldn't make sense
+	 * most of the time (data would be duplicated instead of deduplicated).
+	 */
+	if (!ovl_inode_upper(file_inode(file_in)) ||
+	    !ovl_inode_upper(file_inode(file_out)))
+		return -EPERM;
+
+	return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
+			    OVL_DEDUPE);
+}
+
 const struct file_operations ovl_file_operations = {
 	.open		= ovl_open,
 	.release	= ovl_release,
@@ -393,4 +477,8 @@ const struct file_operations ovl_file_operations = {
 	.fallocate	= ovl_fallocate,
 	.unlocked_ioctl	= ovl_ioctl,
 	.compat_ioctl	= ovl_compat_ioctl,
+
+	.copy_file_range	= ovl_copy_file_range,
+	.clone_file_range	= ovl_clone_file_range,
+	.dedupe_file_range	= ovl_dedupe_file_range,
 };
-- 
2.14.3


* [PATCH v2 22/35] vfs: don't open real
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (20 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 21/35] ovl: add reflink/copyfile/dedup support Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07 10:27   ` Amir Goldstein
  2018-05-11 18:54   ` Vivek Goyal
  2018-05-07  8:37 ` [PATCH v2 23/35] ovl: copy-up on MAP_SHARED Miklos Szeredi
                   ` (12 subsequent siblings)
  34 siblings, 2 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

Let overlayfs do its thing when opening a file.

This enables stacking and fixes the corner case when a file is opened for
read, modified through a writable open, and data is read from the read-only
file.  After this patch the read-only open will not return stale data even
in this case.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
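Illustration only, not part of the patch: a minimal userspace sketch of the
corner case described above.  The function name is hypothetical.  To observe
stale data on a kernel without this series, `path` would have to name a file
that still lives only on the lower layer of an overlay mount; on any other
filesystem the check is expected to return 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Open `path` read-only, modify it through a second, writable descriptor,
 * then read through the first descriptor again.
 *
 * Returns 0 if the read-only descriptor sees the new data, 1 if it sees
 * stale data, and -1 on error.
 */
static int check_ro_fd_sees_write(const char *path)
{
	static const char new_data[] = "new!";
	char buf[sizeof(new_data) - 1];
	int ro_fd, wr_fd, ret = -1;

	ro_fd = open(path, O_RDONLY);
	if (ro_fd < 0)
		return -1;

	/* On overlayfs this open copies a lower file up. */
	wr_fd = open(path, O_WRONLY);
	if (wr_fd < 0)
		goto out_ro;

	if (pwrite(wr_fd, new_data, sizeof(buf), 0) != (ssize_t)sizeof(buf))
		goto out_wr;

	if (pread(ro_fd, buf, sizeof(buf), 0) != (ssize_t)sizeof(buf))
		goto out_wr;

	ret = memcmp(buf, new_data, sizeof(buf)) ? 1 : 0;
out_wr:
	close(wr_fd);
out_ro:
	close(ro_fd);
	return ret;
}
```
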
 fs/open.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 6e52fd6fea7c..244cd2ecfefd 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -897,13 +897,8 @@ EXPORT_SYMBOL(file_path);
 int vfs_open(const struct path *path, struct file *file,
 	     const struct cred *cred)
 {
-	struct dentry *dentry = d_real(path->dentry, NULL, file->f_flags, 0);
-
-	if (IS_ERR(dentry))
-		return PTR_ERR(dentry);
-
 	file->f_path = *path;
-	return do_dentry_open(file, d_backing_inode(dentry), NULL, cred);
+	return do_dentry_open(file, d_backing_inode(path->dentry), NULL, cred);
 }
 
 /**
-- 
2.14.3


* [PATCH v2 23/35] ovl: copy-up on MAP_SHARED
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (21 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 22/35] vfs: don't open real Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07 19:28   ` Randy Dunlap
  2018-05-07  8:37 ` [PATCH v2 24/35] vfs: simplify dentry_open() Miklos Szeredi
                   ` (11 subsequent siblings)
  34 siblings, 1 reply; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

A corner case of a corner case is when

 - file opened for O_RDONLY
 - which is then memory mapped SHARED
 - file opened for O_WRONLY
 - contents modified
 - contents read back through the shared mapping

Unfortunately it looks very difficult to do anything about the established
shared map after the file is copied up.

Instead, when a read-only file is mapped shared, copy up the file before
actually doing the map.  This may result in unnecessary copy-ups (but so
may copy-up on open(O_RDWR) for example).

We can revisit this later if it turns out to be a performance problem in
real life.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
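Illustration only, not part of the patch: a minimal userspace sketch of the
sequence listed above (read-only open, shared mapping, writable open, write,
read back through the mapping).  The function name is hypothetical.  The
file must already hold at least four bytes, and stale data can only show up
when `path` names a lower-layer file on an overlay mount without
copy_up_shared; elsewhere the check is expected to return 0.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Returns 0 if a MAP_SHARED mapping of a read-only descriptor observes a
 * later write made through a separate writable descriptor, 1 if the mapping
 * still shows the old contents, and -1 on error.
 */
static int check_shared_map_sees_write(const char *path)
{
	static const char new_data[] = "new!";
	const size_t len = sizeof(new_data) - 1;
	int ro_fd, wr_fd, ret = -1;
	char *map;

	ro_fd = open(path, O_RDONLY);
	if (ro_fd < 0)
		return -1;

	/* The mapping is established before any writer exists. */
	map = mmap(NULL, len, PROT_READ, MAP_SHARED, ro_fd, 0);
	if (map == MAP_FAILED)
		goto out_ro;

	/* On overlayfs this open copies a lower file up. */
	wr_fd = open(path, O_WRONLY);
	if (wr_fd < 0)
		goto out_unmap;

	if (pwrite(wr_fd, new_data, len, 0) == (ssize_t)len)
		ret = memcmp(map, new_data, len) ? 1 : 0;

	close(wr_fd);
out_unmap:
	munmap(map, len);
out_ro:
	close(ro_fd);
	return ret;
}
```
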
 fs/overlayfs/Kconfig     | 21 +++++++++++++++++++++
 fs/overlayfs/file.c      | 22 ++++++++++++++++++++++
 fs/overlayfs/overlayfs.h |  7 +++++++
 fs/overlayfs/ovl_entry.h |  1 +
 fs/overlayfs/super.c     | 22 ++++++++++++++++++++++
 5 files changed, 73 insertions(+)

diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
index 17032631c5cf..991c0a5a0e00 100644
--- a/fs/overlayfs/Kconfig
+++ b/fs/overlayfs/Kconfig
@@ -103,3 +103,24 @@ config OVERLAY_FS_XINO_AUTO
 	  For more information, see Documentation/filesystems/overlayfs.txt
 
 	  If unsure, say N.
+
+config OVERLAY_FS_COPY_UP_SHARED
+       bool "Overlayfs: copy up when mapping a file shared"
+       default n
+       depends on OVERLAY_FS
+       help
+         If this option is enabled then on mapping a file with MAP_SHARED
+	 overlayfs copies up the file in anticipation of it being modified (just
+	 like we copy up the file on O_WRONLY and O_RDWR in anticipation of
+	 modification).  This does not interfere with shared library loading, as
+	 that uses MAP_PRIVATE.  But there might be use cases out there where
+	 this impacts performance and disk usage.
+
+	 This just selects the default, the feature can also be enabled or
+	 disabled in the running kernel or individually on each overlay mount.
+
+	 To get maximally standard compliant behavior, enable this option.
+
+	 To get a maximally backward compatible kernel, disable this option.
+
+	 If unsure, say N.
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 2ac95c95e8e6..a60734ec89ec 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -10,6 +10,7 @@
 #include <linux/file.h>
 #include <linux/mount.h>
 #include <linux/xattr.h>
+#include <linux/mman.h>
 #include <linux/uio.h>
 #include "overlayfs.h"
 
@@ -256,6 +257,26 @@ static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
 	return ret;
 }
 
+static int ovl_pre_mmap(struct file *file, unsigned long prot,
+			unsigned long flag)
+{
+	int err = 0;
+
+	/*
+	 * Take MAP_SHARED as hint about future writes to the file (through
+	 * another file descriptor).  Caller might not have had such an intent,
+	 * but we hope MAP_PRIVATE will be used in most such cases.
+	 *
+	 * If we don't copy up now and the file is modified, it becomes really
+	 * difficult to change the mapping to match that of the file's content
+	 * later.
+	 */
+	if ((flag & MAP_SHARED) && ovl_copy_up_shared(file_inode(file)->i_sb))
+		err = ovl_copy_up(file_dentry(file));
+
+	return err;
+}
+
 static int ovl_mmap(struct file *file, struct vm_area_struct *vma)
 {
 	struct fd real;
@@ -473,6 +494,7 @@ const struct file_operations ovl_file_operations = {
 	.read_iter	= ovl_read_iter,
 	.write_iter	= ovl_write_iter,
 	.fsync		= ovl_fsync,
+	.pre_mmap	= ovl_pre_mmap,
 	.mmap		= ovl_mmap,
 	.fallocate	= ovl_fallocate,
 	.unlocked_ioctl	= ovl_ioctl,
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index edbf69c8f45d..caaa47cea2aa 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -272,6 +272,13 @@ static inline unsigned int ovl_xino_bits(struct super_block *sb)
 	return ofs->xino_bits;
 }
 
+static inline bool ovl_copy_up_shared(struct super_block *sb)
+{
+	struct ovl_fs *ofs = sb->s_fs_info;
+
+	return !(sb->s_flags & SB_RDONLY) && ofs->config.copy_up_shared;
+}
+
 
 /* namei.c */
 int ovl_check_fh_len(struct ovl_fh *fh, int fh_len);
diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
index 41655a7d6894..3bea47c63fd9 100644
--- a/fs/overlayfs/ovl_entry.h
+++ b/fs/overlayfs/ovl_entry.h
@@ -18,6 +18,7 @@ struct ovl_config {
 	const char *redirect_mode;
 	bool index;
 	bool nfs_export;
+	bool copy_up_shared;
 	int xino;
 };
 
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index ad6a5baf226b..c3d8c7ea180f 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -56,6 +56,12 @@ module_param_named(xino_auto, ovl_xino_auto_def, bool, 0644);
 MODULE_PARM_DESC(ovl_xino_auto_def,
 		 "Auto enable xino feature");
 
+static bool ovl_copy_up_shared_def =
+	IS_ENABLED(CONFIG_OVERLAY_FS_COPY_UP_SHARED);
+module_param_named(copy_up_shared, ovl_copy_up_shared_def, bool, 0644);
+MODULE_PARM_DESC(ovl_copy_up_shared_def,
+		 "Copy up when mapping a file shared");
+
 static void ovl_entry_stack_free(struct ovl_entry *oe)
 {
 	unsigned int i;
@@ -380,6 +386,9 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
 						"on" : "off");
 	if (ofs->config.xino != ovl_xino_def())
 		seq_printf(m, ",xino=%s", ovl_xino_str[ofs->config.xino]);
+	if (ofs->config.copy_up_shared != ovl_copy_up_shared_def)
+		seq_printf(m, ",copy_up_shared=%s",
+			   ofs->config.copy_up_shared ? "on" : "off");
 	return 0;
 }
 
@@ -417,6 +426,8 @@ enum {
 	OPT_XINO_ON,
 	OPT_XINO_OFF,
 	OPT_XINO_AUTO,
+	OPT_COPY_UP_SHARED_ON,
+	OPT_COPY_UP_SHARED_OFF,
 	OPT_ERR,
 };
 
@@ -433,6 +444,8 @@ static const match_table_t ovl_tokens = {
 	{OPT_XINO_ON,			"xino=on"},
 	{OPT_XINO_OFF,			"xino=off"},
 	{OPT_XINO_AUTO,			"xino=auto"},
+	{OPT_COPY_UP_SHARED_ON,		"copy_up_shared=on"},
+	{OPT_COPY_UP_SHARED_OFF,	"copy_up_shared=off"},
 	{OPT_ERR,			NULL}
 };
 
@@ -559,6 +572,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
 			config->xino = OVL_XINO_AUTO;
 			break;
 
+		case OPT_COPY_UP_SHARED_ON:
+			config->copy_up_shared = true;
+			break;
+
+		case OPT_COPY_UP_SHARED_OFF:
+			config->copy_up_shared = false;
+			break;
+
 		default:
 			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
 			return -EINVAL;
@@ -1380,6 +1401,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	ofs->config.index = ovl_index_def;
 	ofs->config.nfs_export = ovl_nfs_export_def;
 	ofs->config.xino = ovl_xino_def();
+	ofs->config.copy_up_shared = ovl_copy_up_shared_def;
 	err = ovl_parse_opt((char *) data, &ofs->config);
 	if (err)
 		goto out_err;
-- 
2.14.3


* [PATCH v2 24/35] vfs: simplify dentry_open()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (22 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 23/35] ovl: copy-up on MAP_SHARED Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 25/35] Revert "ovl: fix may_write_real() for overlayfs directories" Miklos Szeredi
                   ` (10 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

dentry_open() can now just call path_open().

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/open.c | 15 +--------------
 1 file changed, 1 insertion(+), 14 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 244cd2ecfefd..1d4bc541c619 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -935,25 +935,12 @@ EXPORT_SYMBOL(path_open);
 struct file *dentry_open(const struct path *path, int flags,
 			 const struct cred *cred)
 {
-	int error;
-	struct file *f;
-
 	validate_creds(cred);
 
 	/* We must always pass in a valid mount pointer. */
 	BUG_ON(!path->mnt);
 
-	f = get_empty_filp();
-	if (IS_ERR(f))
-		return f;
-
-	f->f_flags = flags;
-	error = vfs_open(path, f, cred);
-	if (error) {
-		put_filp(f);
-		return ERR_PTR(error);
-	}
-	return f;
+	return path_open(path, flags, d_backing_inode(path->dentry), cred, true);
 }
 EXPORT_SYMBOL(dentry_open);
 
-- 
2.14.3


* [PATCH v2 25/35] Revert "ovl: fix may_write_real() for overlayfs directories"
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (23 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 24/35] vfs: simplify dentry_open() Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 26/35] Revert "ovl: don't allow writing ioctl on lower layer" Miklos Szeredi
                   ` (9 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This reverts commit 954c736f865d6c0c68ae4263a2f3502ee7c447a3.

Overlayfs no longer relies on the vfs for checking writability of files.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/namespace.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 5f75969adff1..c3f7152a8419 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -468,9 +468,7 @@ static inline int may_write_real(struct file *file)
 
 	/* File refers to upper, writable layer? */
 	upperdentry = d_real(dentry, NULL, 0, D_REAL_UPPER);
-	if (upperdentry &&
-	    (file_inode(file) == d_inode(upperdentry) ||
-	     file_inode(file) == d_inode(dentry)))
+	if (upperdentry && file_inode(file) == d_inode(upperdentry))
 		return 0;
 
 	/* Lower layer: can't write to real file, sorry... */
-- 
2.14.3


* [PATCH v2 26/35] Revert "ovl: don't allow writing ioctl on lower layer"
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (24 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 25/35] Revert "ovl: fix may_write_real() for overlayfs directories" Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:37 ` [PATCH v2 27/35] vfs: fix freeze protection in mnt_want_write_file() for overlayfs Miklos Szeredi
                   ` (8 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This reverts commit 7c6893e3c9abf6a9676e060a1e35e5caca673d57.

Overlayfs no longer relies on the vfs for checking writability of files.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/internal.h  |  2 --
 fs/namespace.c | 64 +++-------------------------------------------------------
 fs/open.c      |  4 ++--
 fs/xattr.c     |  9 ++++-----
 4 files changed, 9 insertions(+), 70 deletions(-)

diff --git a/fs/internal.h b/fs/internal.h
index 6821cf475fc6..29c9a2fab592 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -80,10 +80,8 @@ extern void __init mnt_init(void);
 
 extern int __mnt_want_write(struct vfsmount *);
 extern int __mnt_want_write_file(struct file *);
-extern int mnt_want_write_file_path(struct file *);
 extern void __mnt_drop_write(struct vfsmount *);
 extern void __mnt_drop_write_file(struct file *);
-extern void mnt_drop_write_file_path(struct file *);
 
 /*
  * fs_struct.c
diff --git a/fs/namespace.c b/fs/namespace.c
index c3f7152a8419..5286c5313e67 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -431,18 +431,13 @@ int __mnt_want_write_file(struct file *file)
 }
 
 /**
- * mnt_want_write_file_path - get write access to a file's mount
+ * mnt_want_write_file - get write access to a file's mount
  * @file: the file who's mount on which to take a write
  *
  * This is like mnt_want_write, but it takes a file and can
  * do some optimisations if the file is open for write already
- *
- * Called by the vfs for cases when we have an open file at hand, but will do an
- * inode operation on it (important distinction for files opened on overlayfs,
- * since the file operations will come from the real underlying file, while
- * inode operations come from the overlay).
  */
-int mnt_want_write_file_path(struct file *file)
+int mnt_want_write_file(struct file *file)
 {
 	int ret;
 
@@ -452,53 +447,6 @@ int mnt_want_write_file_path(struct file *file)
 		sb_end_write(file->f_path.mnt->mnt_sb);
 	return ret;
 }
-
-static inline int may_write_real(struct file *file)
-{
-	struct dentry *dentry = file->f_path.dentry;
-	struct dentry *upperdentry;
-
-	/* Writable file? */
-	if (file->f_mode & FMODE_WRITER)
-		return 0;
-
-	/* Not overlayfs? */
-	if (likely(!(dentry->d_flags & DCACHE_OP_REAL)))
-		return 0;
-
-	/* File refers to upper, writable layer? */
-	upperdentry = d_real(dentry, NULL, 0, D_REAL_UPPER);
-	if (upperdentry && file_inode(file) == d_inode(upperdentry))
-		return 0;
-
-	/* Lower layer: can't write to real file, sorry... */
-	return -EPERM;
-}
-
-/**
- * mnt_want_write_file - get write access to a file's mount
- * @file: the file who's mount on which to take a write
- *
- * This is like mnt_want_write, but it takes a file and can
- * do some optimisations if the file is open for write already
- *
- * Mostly called by filesystems from their ioctl operation before performing
- * modification.  On overlayfs this needs to check if the file is on a read-only
- * lower layer and deny access in that case.
- */
-int mnt_want_write_file(struct file *file)
-{
-	int ret;
-
-	ret = may_write_real(file);
-	if (!ret) {
-		sb_start_write(file_inode(file)->i_sb);
-		ret = __mnt_want_write_file(file);
-		if (ret)
-			sb_end_write(file_inode(file)->i_sb);
-	}
-	return ret;
-}
 EXPORT_SYMBOL_GPL(mnt_want_write_file);
 
 /**
@@ -536,15 +484,9 @@ void __mnt_drop_write_file(struct file *file)
 	__mnt_drop_write(file->f_path.mnt);
 }
 
-void mnt_drop_write_file_path(struct file *file)
-{
-	mnt_drop_write(file->f_path.mnt);
-}
-
 void mnt_drop_write_file(struct file *file)
 {
-	__mnt_drop_write(file->f_path.mnt);
-	sb_end_write(file_inode(file)->i_sb);
+	mnt_drop_write(file->f_path.mnt);
 }
 EXPORT_SYMBOL(mnt_drop_write_file);
 
diff --git a/fs/open.c b/fs/open.c
index 1d4bc541c619..2db39216c393 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -707,12 +707,12 @@ int ksys_fchown(unsigned int fd, uid_t user, gid_t group)
 	if (!f.file)
 		goto out;
 
-	error = mnt_want_write_file_path(f.file);
+	error = mnt_want_write_file(f.file);
 	if (error)
 		goto out_fput;
 	audit_file(f.file);
 	error = chown_common(&f.file->f_path, user, group);
-	mnt_drop_write_file_path(f.file);
+	mnt_drop_write_file(f.file);
 out_fput:
 	fdput(f);
 out:
diff --git a/fs/xattr.c b/fs/xattr.c
index 61cd28ba25f3..78eaffbdbee0 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -23,7 +23,6 @@
 #include <linux/posix_acl_xattr.h>
 
 #include <linux/uaccess.h>
-#include "internal.h"
 
 static const char *
 strcmp_prefix(const char *a, const char *a_prefix)
@@ -503,10 +502,10 @@ SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
 	if (!f.file)
 		return error;
 	audit_file(f.file);
-	error = mnt_want_write_file_path(f.file);
+	error = mnt_want_write_file(f.file);
 	if (!error) {
 		error = setxattr(f.file->f_path.dentry, name, value, size, flags);
-		mnt_drop_write_file_path(f.file);
+		mnt_drop_write_file(f.file);
 	}
 	fdput(f);
 	return error;
@@ -735,10 +734,10 @@ SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
 	if (!f.file)
 		return error;
 	audit_file(f.file);
-	error = mnt_want_write_file_path(f.file);
+	error = mnt_want_write_file(f.file);
 	if (!error) {
 		error = removexattr(f.file->f_path.dentry, name);
-		mnt_drop_write_file_path(f.file);
+		mnt_drop_write_file(f.file);
 	}
 	fdput(f);
 	return error;
-- 
2.14.3


* [PATCH v2 27/35] vfs: fix freeze protection in mnt_want_write_file() for overlayfs
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (25 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 26/35] Revert "ovl: don't allow writing ioctl on lower layer" Miklos Szeredi
@ 2018-05-07  8:37 ` Miklos Szeredi
  2018-05-07  8:38 ` [PATCH v2 28/35] Revert "ovl: fix relatime for directories" Miklos Szeredi
                   ` (7 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:37 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

The underlying real file used by overlayfs still contains the overlay path.
This results in mnt_want_write_file() calls by the filesystem getting
freeze protection on the wrong inode (the overlayfs one instead of the real
one).

Fix by using file_inode(file)->i_sb instead of file->f_path.mnt->mnt_sb.

Reported-by: Amir Goldstein <amir73il@gmail.com>
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/namespace.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 5286c5313e67..0d9023a9af4f 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -441,10 +441,10 @@ int mnt_want_write_file(struct file *file)
 {
 	int ret;
 
-	sb_start_write(file->f_path.mnt->mnt_sb);
+	sb_start_write(file_inode(file)->i_sb);
 	ret = __mnt_want_write_file(file);
 	if (ret)
-		sb_end_write(file->f_path.mnt->mnt_sb);
+		sb_end_write(file_inode(file)->i_sb);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(mnt_want_write_file);
@@ -486,7 +486,8 @@ void __mnt_drop_write_file(struct file *file)
 
 void mnt_drop_write_file(struct file *file)
 {
-	mnt_drop_write(file->f_path.mnt);
+	__mnt_drop_write_file(file);
+	sb_end_write(file_inode(file)->i_sb);
 }
 EXPORT_SYMBOL(mnt_drop_write_file);
 
-- 
2.14.3


* [PATCH v2 28/35] Revert "ovl: fix relatime for directories"
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (26 preceding siblings ...)
  2018-05-07  8:37 ` [PATCH v2 27/35] vfs: fix freeze protection in mnt_want_write_file() for overlayfs Miklos Szeredi
@ 2018-05-07  8:38 ` Miklos Szeredi
  2018-05-07  8:38 ` [PATCH v2 29/35] Revert "vfs: update ovl inode before relatime check" Miklos Szeredi
                   ` (6 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:38 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This reverts commit cd91304e7190b4c4802f8e413ab2214b233e0260.

Overlayfs no longer relies on the vfs for correct atime handling.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/inode.c             | 21 ++++-----------------
 fs/overlayfs/super.c   |  3 ---
 include/linux/dcache.h |  3 ---
 3 files changed, 4 insertions(+), 23 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index 13ceb98c3bd3..e97d0193221d 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1573,24 +1573,11 @@ EXPORT_SYMBOL(bmap);
 static void update_ovl_inode_times(struct dentry *dentry, struct inode *inode,
 			       bool rcu)
 {
-	struct dentry *upperdentry;
+	if (!rcu) {
+		struct inode *realinode = d_real_inode(dentry);
 
-	/*
-	 * Nothing to do if in rcu or if non-overlayfs
-	 */
-	if (rcu || likely(!(dentry->d_flags & DCACHE_OP_REAL)))
-		return;
-
-	upperdentry = d_real(dentry, NULL, 0, D_REAL_UPPER);
-
-	/*
-	 * If file is on lower then we can't update atime, so no worries about
-	 * stale mtime/ctime.
-	 */
-	if (upperdentry) {
-		struct inode *realinode = d_inode(upperdentry);
-
-		if ((!timespec_equal(&inode->i_mtime, &realinode->i_mtime) ||
+		if (unlikely(inode != realinode) &&
+		    (!timespec_equal(&inode->i_mtime, &realinode->i_mtime) ||
 		     !timespec_equal(&inode->i_ctime, &realinode->i_ctime))) {
 			inode->i_mtime = realinode->i_mtime;
 			inode->i_ctime = realinode->i_ctime;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index c3d8c7ea180f..006dc70d7425 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -107,9 +107,6 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 	if (inode && d_inode(dentry) == inode)
 		return dentry;
 
-	if (flags & D_REAL_UPPER)
-		return ovl_dentry_upper(dentry);
-
 	if (!d_is_reg(dentry)) {
 		if (!inode || inode == d_inode(dentry))
 			return dentry;
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 94acbde17bb1..b2b829bcb7c0 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -563,9 +563,6 @@ static inline struct dentry *d_backing_dentry(struct dentry *upper)
 	return upper;
 }
 
-/* d_real() flags */
-#define D_REAL_UPPER	0x2	/* return upper dentry or NULL if non-upper */
-
 /**
  * d_real - Return the real dentry
  * @dentry: the dentry to query
-- 
2.14.3


* [PATCH v2 29/35] Revert "vfs: update ovl inode before relatime check"
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (27 preceding siblings ...)
  2018-05-07  8:38 ` [PATCH v2 28/35] Revert "ovl: fix relatime for directories" Miklos Szeredi
@ 2018-05-07  8:38 ` Miklos Szeredi
  2018-05-07  8:38 ` [PATCH v2 30/35] Revert "vfs: add flags to d_real()" Miklos Szeredi
                   ` (5 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:38 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This reverts commit 598e3c8f72f5b77c84d2cb26cfd936ffb3cfdbaa.

Overlayfs no longer relies on the vfs for correct atime handling.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
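Illustration only, not part of the patch: a minimal userspace sketch of the
relatime rule quoted in the comment above relatime_need_update() in the diff
below (update atime if it is older than mtime or ctime, or if at least a day
has passed).  The toy_* and ts_cmp names are hypothetical, and treating
equal timestamps as "needs update" is an assumption of this sketch.

```c
#include <stdbool.h>
#include <time.h>

/* Three-way comparison of two timestamps: <0, 0 or >0. */
static int ts_cmp(const struct timespec *a, const struct timespec *b)
{
	if (a->tv_sec != b->tv_sec)
		return a->tv_sec < b->tv_sec ? -1 : 1;
	if (a->tv_nsec != b->tv_nsec)
		return a->tv_nsec < b->tv_nsec ? -1 : 1;
	return 0;
}

/* Decide whether a relatime mount would refresh atime at time `now`. */
static bool toy_relatime_need_update(const struct timespec *atime,
				     const struct timespec *mtime,
				     const struct timespec *ctime,
				     const struct timespec *now)
{
	/* Is mtime younger than (or equal to) atime? */
	if (ts_cmp(mtime, atime) >= 0)
		return true;

	/* Is ctime younger than (or equal to) atime? */
	if (ts_cmp(ctime, atime) >= 0)
		return true;

	/* Is the recorded atime at least a day old? */
	if (now->tv_sec - atime->tv_sec >= 24 * 60 * 60)
		return true;

	return false;
}
```

The sketch takes plain timestamps, so it says nothing about which inode
(overlay or real) supplies them; that question is what the reverted code
dealt with.
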
 fs/inode.c         | 33 ++++++---------------------------
 fs/internal.h      |  7 -------
 fs/namei.c         |  2 +-
 include/linux/fs.h |  1 +
 4 files changed, 8 insertions(+), 35 deletions(-)

diff --git a/fs/inode.c b/fs/inode.c
index e97d0193221d..9818c0f48cfa 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -1567,37 +1567,17 @@ sector_t bmap(struct inode *inode, sector_t block)
 }
 EXPORT_SYMBOL(bmap);
 
-/*
- * Update times in overlayed inode from underlying real inode
- */
-static void update_ovl_inode_times(struct dentry *dentry, struct inode *inode,
-			       bool rcu)
-{
-	if (!rcu) {
-		struct inode *realinode = d_real_inode(dentry);
-
-		if (unlikely(inode != realinode) &&
-		    (!timespec_equal(&inode->i_mtime, &realinode->i_mtime) ||
-		     !timespec_equal(&inode->i_ctime, &realinode->i_ctime))) {
-			inode->i_mtime = realinode->i_mtime;
-			inode->i_ctime = realinode->i_ctime;
-		}
-	}
-}
-
 /*
  * With relative atime, only update atime if the previous atime is
  * earlier than either the ctime or mtime or if at least a day has
  * passed since the last atime update.
  */
-static int relatime_need_update(const struct path *path, struct inode *inode,
-				struct timespec now, bool rcu)
+static int relatime_need_update(struct vfsmount *mnt, struct inode *inode,
+			     struct timespec now)
 {
 
-	if (!(path->mnt->mnt_flags & MNT_RELATIME))
+	if (!(mnt->mnt_flags & MNT_RELATIME))
 		return 1;
-
-	update_ovl_inode_times(path->dentry, inode, rcu);
 	/*
 	 * Is mtime younger than atime? If yes, update atime:
 	 */
@@ -1668,8 +1648,7 @@ static int update_time(struct inode *inode, struct timespec *time, int flags)
  *	This function automatically handles read only file systems and media,
  *	as well as the "noatime" flag and inode specific "noatime" markers.
  */
-bool __atime_needs_update(const struct path *path, struct inode *inode,
-			  bool rcu)
+bool atime_needs_update(const struct path *path, struct inode *inode)
 {
 	struct vfsmount *mnt = path->mnt;
 	struct timespec now;
@@ -1695,7 +1674,7 @@ bool __atime_needs_update(const struct path *path, struct inode *inode,
 
 	now = current_time(inode);
 
-	if (!relatime_need_update(path, inode, now, rcu))
+	if (!relatime_need_update(mnt, inode, now))
 		return false;
 
 	if (timespec_equal(&inode->i_atime, &now))
@@ -1710,7 +1689,7 @@ void touch_atime(const struct path *path)
 	struct inode *inode = d_inode(path->dentry);
 	struct timespec now;
 
-	if (!__atime_needs_update(path, inode, false))
+	if (!atime_needs_update(path, inode))
 		return;
 
 	if (!sb_start_write_trylock(inode->i_sb))
diff --git a/fs/internal.h b/fs/internal.h
index 29c9a2fab592..6ada1f356da6 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -138,13 +138,6 @@ extern long prune_icache_sb(struct super_block *sb, struct shrink_control *sc);
 extern void inode_add_lru(struct inode *inode);
 extern int dentry_needs_remove_privs(struct dentry *dentry);
 
-extern bool __atime_needs_update(const struct path *, struct inode *, bool);
-static inline bool atime_needs_update_rcu(const struct path *path,
-					  struct inode *inode)
-{
-	return __atime_needs_update(path, inode, true);
-}
-
 /*
  * fs-writeback.c
  */
diff --git a/fs/namei.c b/fs/namei.c
index 186bd2464fd5..54ab8ccb8d6d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -1013,7 +1013,7 @@ const char *get_link(struct nameidata *nd)
 	if (!(nd->flags & LOOKUP_RCU)) {
 		touch_atime(&last->link);
 		cond_resched();
-	} else if (atime_needs_update_rcu(&last->link, inode)) {
+	} else if (atime_needs_update(&last->link, inode)) {
 		if (unlikely(unlazy_walk(nd)))
 			return ERR_PTR(-ECHILD);
 		touch_atime(&last->link);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c85a8059f038..ebe311541873 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2066,6 +2066,7 @@ enum file_time_flags {
 	S_VERSION = 8,
 };
 
+extern bool atime_needs_update(const struct path *, struct inode *);
 extern void touch_atime(const struct path *);
 static inline void file_accessed(struct file *file)
 {
-- 
2.14.3


* [PATCH v2 30/35] Revert "vfs: add flags to d_real()"
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (28 preceding siblings ...)
  2018-05-07  8:38 ` [PATCH v2 29/35] Revert "vfs: update ovl inode before relatime check" Miklos Szeredi
@ 2018-05-07  8:38 ` Miklos Szeredi
  2018-05-07  8:38 ` [PATCH v2 31/35] Revert "vfs: do get_write_access() on upper layer of overlayfs" Miklos Szeredi
                   ` (4 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:38 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This reverts commit 495e642939114478a5237a7d91661ba93b76f15a.

No users of the "flags" argument of d_real() remain.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 Documentation/filesystems/Locking |  2 +-
 Documentation/filesystems/vfs.txt |  2 +-
 fs/open.c                         |  2 +-
 fs/overlayfs/super.c              |  4 ++--
 include/linux/dcache.h            | 11 +++++------
 include/linux/fs.h                |  2 +-
 6 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index 60e76060baff..a4afe96f0112 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -22,7 +22,7 @@ prototypes:
 	struct vfsmount *(*d_automount)(struct path *path);
 	int (*d_manage)(const struct path *, bool);
 	struct dentry *(*d_real)(struct dentry *, const struct inode *,
-				 unsigned int, unsigned int);
+				 unsigned int);
 
 locking rules:
 		rename_lock	->d_lock	may block	rcu-walk
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index 2bc77ea8aef4..af54d3651ff8 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -991,7 +991,7 @@ struct dentry_operations {
 	struct vfsmount *(*d_automount)(struct path *);
 	int (*d_manage)(const struct path *, bool);
 	struct dentry *(*d_real)(struct dentry *, const struct inode *,
-				 unsigned int, unsigned int);
+				 unsigned int);
 };
 
   d_revalidate: called when the VFS needs to revalidate a dentry. This
diff --git a/fs/open.c b/fs/open.c
index 2db39216c393..127b49819afb 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -96,7 +96,7 @@ long vfs_truncate(const struct path *path, loff_t length)
 	 * write access on the upper inode, not on the overlay inode.  For
 	 * non-overlay filesystems d_real() is an identity function.
 	 */
-	upperdentry = d_real(path->dentry, NULL, O_WRONLY, 0);
+	upperdentry = d_real(path->dentry, NULL, O_WRONLY);
 	error = PTR_ERR(upperdentry);
 	if (IS_ERR(upperdentry))
 		goto mnt_drop_write_and_out;
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 006dc70d7425..7779fc610767 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -98,7 +98,7 @@ static int ovl_check_append_only(struct inode *inode, int flag)
 
 static struct dentry *ovl_d_real(struct dentry *dentry,
 				 const struct inode *inode,
-				 unsigned int open_flags, unsigned int flags)
+				 unsigned int open_flags)
 {
 	struct dentry *real;
 	int err;
@@ -134,7 +134,7 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 		goto bug;
 
 	/* Handle recursion */
-	real = d_real(real, inode, open_flags, 0);
+	real = d_real(real, inode, open_flags);
 
 	if (!inode || inode == d_inode(real))
 		return real;
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index b2b829bcb7c0..58fcc66ddccd 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -146,7 +146,7 @@ struct dentry_operations {
 	struct vfsmount *(*d_automount)(struct path *);
 	int (*d_manage)(const struct path *, bool);
 	struct dentry *(*d_real)(struct dentry *, const struct inode *,
-				 unsigned int, unsigned int);
+				 unsigned int);
 } ____cacheline_aligned;
 
 /*
@@ -567,8 +567,7 @@ static inline struct dentry *d_backing_dentry(struct dentry *upper)
  * d_real - Return the real dentry
  * @dentry: the dentry to query
  * @inode: inode to select the dentry from multiple layers (can be NULL)
- * @open_flags: open flags to control copy-up behavior
- * @flags: flags to control what is returned by this function
+ * @flags: open flags to control copy-up behavior
  *
  * If dentry is on a union/overlay, then return the underlying, real dentry.
  * Otherwise return the dentry itself.
@@ -577,10 +576,10 @@ static inline struct dentry *d_backing_dentry(struct dentry *upper)
  */
 static inline struct dentry *d_real(struct dentry *dentry,
 				    const struct inode *inode,
-				    unsigned int open_flags, unsigned int flags)
+				    unsigned int flags)
 {
 	if (unlikely(dentry->d_flags & DCACHE_OP_REAL))
-		return dentry->d_op->d_real(dentry, inode, open_flags, flags);
+		return dentry->d_op->d_real(dentry, inode, flags);
 	else
 		return dentry;
 }
@@ -595,7 +594,7 @@ static inline struct dentry *d_real(struct dentry *dentry,
 static inline struct inode *d_real_inode(const struct dentry *dentry)
 {
 	/* This usage of d_real() results in const dentry */
-	return d_backing_inode(d_real((struct dentry *) dentry, NULL, 0, 0));
+	return d_backing_inode(d_real((struct dentry *) dentry, NULL, 0));
 }
 
 struct name_snapshot {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ebe311541873..ba3078693d4a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1244,7 +1244,7 @@ static inline struct inode *file_inode(const struct file *f)
 
 static inline struct dentry *file_dentry(const struct file *file)
 {
-	return d_real(file->f_path.dentry, file_inode(file), 0, 0);
+	return d_real(file->f_path.dentry, file_inode(file), 0);
 }
 
 static inline int locks_lock_file_wait(struct file *filp, struct file_lock *fl)
-- 
2.14.3


* [PATCH v2 31/35] Revert "vfs: do get_write_access() on upper layer of overlayfs"
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (29 preceding siblings ...)
  2018-05-07  8:38 ` [PATCH v2 30/35] Revert "vfs: add flags to d_real()" Miklos Szeredi
@ 2018-05-07  8:38 ` Miklos Szeredi
  2018-05-07  8:38 ` [PATCH v2 32/35] Partially revert "locks: fix file locking on overlayfs" Miklos Szeredi
                   ` (3 subsequent siblings)
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:38 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This reverts commit 4d0c5ba2ff79ef9f5188998b29fd28fcb05f3667.

We now get write access on both overlay and underlying layers so this patch
is no longer needed for correct operation.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/locks.c |  3 +--
 fs/open.c  | 15 ++-------------
 2 files changed, 3 insertions(+), 15 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 62bbe8b31f26..9c0e5f3da66c 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1654,8 +1654,7 @@ check_conflicting_open(const struct dentry *dentry, const long arg, int flags)
 	if (flags & FL_LAYOUT)
 		return 0;
 
-	if ((arg == F_RDLCK) &&
-	    (atomic_read(&d_real_inode(dentry)->i_writecount) > 0))
+	if ((arg == F_RDLCK) && (atomic_read(&inode->i_writecount) > 0))
 		return -EAGAIN;
 
 	if ((arg == F_WRLCK) && ((d_count(dentry) > 1) ||
diff --git a/fs/open.c b/fs/open.c
index 127b49819afb..0d63b57c7f89 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -68,7 +68,6 @@ int do_truncate(struct dentry *dentry, loff_t length, unsigned int time_attrs,
 long vfs_truncate(const struct path *path, loff_t length)
 {
 	struct inode *inode;
-	struct dentry *upperdentry;
 	long error;
 
 	inode = path->dentry->d_inode;
@@ -91,17 +90,7 @@ long vfs_truncate(const struct path *path, loff_t length)
 	if (IS_APPEND(inode))
 		goto mnt_drop_write_and_out;
 
-	/*
-	 * If this is an overlayfs then do as if opening the file so we get
-	 * write access on the upper inode, not on the overlay inode.  For
-	 * non-overlay filesystems d_real() is an identity function.
-	 */
-	upperdentry = d_real(path->dentry, NULL, O_WRONLY);
-	error = PTR_ERR(upperdentry);
-	if (IS_ERR(upperdentry))
-		goto mnt_drop_write_and_out;
-
-	error = get_write_access(upperdentry->d_inode);
+	error = get_write_access(inode);
 	if (error)
 		goto mnt_drop_write_and_out;
 
@@ -120,7 +109,7 @@ long vfs_truncate(const struct path *path, loff_t length)
 		error = do_truncate(path->dentry, length, 0, NULL);
 
 put_write_and_out:
-	put_write_access(upperdentry->d_inode);
+	put_write_access(inode);
 mnt_drop_write_and_out:
 	mnt_drop_write(path->mnt);
 out:
-- 
2.14.3
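
Editorial sketch: the i_writecount bookkeeping that this revert goes back
to can be modelled in user space.  This is only a minimal model, assuming
a plain int instead of the kernel's atomic counter; struct model_inode and
the model_*() helpers are hypothetical names, not kernel API.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for the i_writecount field of struct inode. */
struct model_inode {
	int i_writecount;	/* > 0: open writers, < 0: writes denied */
};

/* Modelled on get_write_access(): refuse while writes are denied. */
static int model_get_write_access(struct model_inode *inode)
{
	if (inode->i_writecount < 0)
		return -ETXTBSY;
	inode->i_writecount++;
	return 0;
}

/* Modelled on put_write_access(): drop one writer reference. */
static void model_put_write_access(struct model_inode *inode)
{
	assert(inode->i_writecount > 0);
	inode->i_writecount--;
}

/* The F_RDLCK test from check_conflicting_open() in the hunk above. */
static int model_check_read_lease(const struct model_inode *inode)
{
	return inode->i_writecount > 0 ? -EAGAIN : 0;
}
```

In this model, as in the patched vfs_truncate(), the get and put calls
operate on the same inode, so the count seen by the lease check is the one
the writer actually raised.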


* [PATCH v2 32/35] Partially revert "locks: fix file locking on overlayfs"
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (30 preceding siblings ...)
  2018-05-07  8:38 ` [PATCH v2 31/35] Revert "vfs: do get_write_access() on upper layer of overlayfs" Miklos Szeredi
@ 2018-05-07  8:38 ` Miklos Szeredi
  2018-05-08 15:15   ` Jeff Layton
  2018-05-07  8:38 ` [PATCH v2 33/35] Revert "fsnotify: support overlayfs" Miklos Szeredi
                   ` (2 subsequent siblings)
  34 siblings, 1 reply; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:38 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This partially reverts commit c568d68341be7030f5647def68851e469b21ca11.

Overlayfs files will now automatically get the correct locks, so there is
no need to hack overlay support into the VFS.

It is a partial revert, because it leaves the locks_inode() calls in place
and defines locks_inode() as file_inode().  We could revert those as well,
but it would be unnecessary code churn and it makes sense to document that
we are getting the inode for locking purposes.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/locks.c              | 17 ++++++-----------
 fs/overlayfs/super.c    |  2 +-
 include/linux/fs.h      | 13 +------------
 include/uapi/linux/fs.h |  1 -
 4 files changed, 8 insertions(+), 25 deletions(-)

diff --git a/fs/locks.c b/fs/locks.c
index 9c0e5f3da66c..40bcbaaa3f52 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -139,11 +139,6 @@
 #define IS_OFDLCK(fl)	(fl->fl_flags & FL_OFDLCK)
 #define IS_REMOTELCK(fl)	(fl->fl_pid <= 0)
 
-static inline bool is_remote_lock(struct file *filp)
-{
-	return likely(!(filp->f_path.dentry->d_sb->s_flags & SB_NOREMOTELOCK));
-}
-
 static bool lease_breaking(struct file_lock *fl)
 {
 	return fl->fl_flags & (FL_UNLOCK_PENDING | FL_DOWNGRADE_PENDING);
@@ -1875,7 +1870,7 @@ EXPORT_SYMBOL(generic_setlease);
 int
 vfs_setlease(struct file *filp, long arg, struct file_lock **lease, void **priv)
 {
-	if (filp->f_op->setlease && is_remote_lock(filp))
+	if (filp->f_op->setlease)
 		return filp->f_op->setlease(filp, arg, lease, priv);
 	else
 		return generic_setlease(filp, arg, lease, priv);
@@ -2022,7 +2017,7 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
 	if (error)
 		goto out_free;
 
-	if (f.file->f_op->flock && is_remote_lock(f.file))
+	if (f.file->f_op->flock)
 		error = f.file->f_op->flock(f.file,
 					  (can_sleep) ? F_SETLKW : F_SETLK,
 					  lock);
@@ -2048,7 +2043,7 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
  */
 int vfs_test_lock(struct file *filp, struct file_lock *fl)
 {
-	if (filp->f_op->lock && is_remote_lock(filp))
+	if (filp->f_op->lock)
 		return filp->f_op->lock(filp, F_GETLK, fl);
 	posix_test_lock(filp, fl);
 	return 0;
@@ -2191,7 +2186,7 @@ int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock)
  */
 int vfs_lock_file(struct file *filp, unsigned int cmd, struct file_lock *fl, struct file_lock *conf)
 {
-	if (filp->f_op->lock && is_remote_lock(filp))
+	if (filp->f_op->lock)
 		return filp->f_op->lock(filp, cmd, fl);
 	else
 		return posix_lock_file(filp, fl, conf);
@@ -2513,7 +2508,7 @@ locks_remove_flock(struct file *filp, struct file_lock_context *flctx)
 	if (list_empty(&flctx->flc_flock))
 		return;
 
-	if (filp->f_op->flock && is_remote_lock(filp))
+	if (filp->f_op->flock)
 		filp->f_op->flock(filp, F_SETLKW, &fl);
 	else
 		flock_lock_inode(inode, &fl);
@@ -2600,7 +2595,7 @@ EXPORT_SYMBOL(posix_unblock_lock);
  */
 int vfs_cancel_lock(struct file *filp, struct file_lock *fl)
 {
-	if (filp->f_op->lock && is_remote_lock(filp))
+	if (filp->f_op->lock)
 		return filp->f_op->lock(filp, F_CANCELLK, fl);
 	return 0;
 }
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index 7779fc610767..f2a83fabe2eb 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -1479,7 +1479,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
 	sb->s_op = &ovl_super_operations;
 	sb->s_xattr = ovl_xattr_handlers;
 	sb->s_fs_info = ofs;
-	sb->s_flags |= SB_POSIXACL | SB_NOREMOTELOCK;
+	sb->s_flags |= SB_POSIXACL;
 
 	err = -ENOMEM;
 	root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
diff --git a/include/linux/fs.h b/include/linux/fs.h
index ba3078693d4a..417e692a606a 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1050,17 +1050,7 @@ struct file_lock_context {
 
 extern void send_sigio(struct fown_struct *fown, int fd, int band);
 
-/*
- * Return the inode to use for locking
- *
- * For overlayfs this should be the overlay inode, not the real inode returned
- * by file_inode().  For any other fs file_inode(filp) and locks_inode(filp) are
- * equal.
- */
-static inline struct inode *locks_inode(const struct file *f)
-{
-	return f->f_path.dentry->d_inode;
-}
+#define locks_inode(f) file_inode(f)
 
 #ifdef CONFIG_FILE_LOCKING
 extern int fcntl_getlk(struct file *, unsigned int, struct flock *);
@@ -1300,7 +1290,6 @@ extern int send_sigurg(struct fown_struct *fown);
 
 /* These sb flags are internal to the kernel */
 #define SB_SUBMOUNT     (1<<26)
-#define SB_NOREMOTELOCK	(1<<27)
 #define SB_NOSEC	(1<<28)
 #define SB_BORN		(1<<29)
 #define SB_ACTIVE	(1<<30)
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index d2a8313fabd7..2840ddcece73 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -134,7 +134,6 @@ struct inodes_stat_t {
 
 /* These sb flags are internal to the kernel */
 #define MS_SUBMOUNT     (1<<26)
-#define MS_NOREMOTELOCK	(1<<27)
 #define MS_NOSEC	(1<<28)
 #define MS_BORN		(1<<29)
 #define MS_ACTIVE	(1<<30)
-- 
2.14.3
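
Editorial sketch: with is_remote_lock() gone, the dispatch in
vfs_lock_file() and friends reduces to "use the ->lock method if the
filesystem provides one, otherwise fall back to the generic code".  A
minimal user-space model of that shape, assuming made-up types; every
model_* name and the MODEL_* return values are hypothetical and exist only
to make the chosen path observable.

```c
#include <assert.h>
#include <stddef.h>

struct model_file;

/* Hypothetical subset of struct file_operations. */
struct model_file_operations {
	int (*lock)(struct model_file *filp, int cmd);
};

struct model_file {
	const struct model_file_operations *f_op;
};

/* Return values that tell the caller which implementation ran. */
enum { MODEL_GENERIC = 1, MODEL_FS_SPECIFIC = 2 };

/* Stand-in for posix_lock_file(), the generic implementation. */
static int model_posix_lock_file(struct model_file *filp, int cmd)
{
	(void)filp;
	(void)cmd;
	return MODEL_GENERIC;
}

/* Example ->lock method, as a filesystem with its own locking might have. */
static int model_fs_lock(struct model_file *filp, int cmd)
{
	(void)filp;
	(void)cmd;
	return MODEL_FS_SPECIFIC;
}

/* Same shape as vfs_lock_file() after this patch. */
static int model_vfs_lock_file(struct model_file *filp, int cmd)
{
	assert(filp && filp->f_op);
	if (filp->f_op->lock)
		return filp->f_op->lock(filp, cmd);
	return model_posix_lock_file(filp, cmd);
}
```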


* [PATCH v2 33/35] Revert "fsnotify: support overlayfs"
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (31 preceding siblings ...)
  2018-05-07  8:38 ` [PATCH v2 32/35] Partially revert "locks: fix file locking on overlayfs" Miklos Szeredi
@ 2018-05-07  8:38 ` Miklos Szeredi
  2018-05-07  8:38 ` [PATCH v2 34/35] vfs: remove open_flags from d_real() Miklos Szeredi
  2018-05-07  8:38 ` [PATCH v2 35/35] ovl: fix documentation of non-standard behavior Miklos Szeredi
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:38 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

This reverts commit f3fbbb079263bd29ae592478de6808db7e708267.

Overlayfs now works correctly without adding hacks to fsnotify.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 include/linux/fsnotify.h | 14 +++++---------
 1 file changed, 5 insertions(+), 9 deletions(-)

diff --git a/include/linux/fsnotify.h b/include/linux/fsnotify.h
index bdaf22582f6e..fd1ce10553bf 100644
--- a/include/linux/fsnotify.h
+++ b/include/linux/fsnotify.h
@@ -30,11 +30,7 @@ static inline int fsnotify_parent(const struct path *path, struct dentry *dentry
 static inline int fsnotify_perm(struct file *file, int mask)
 {
 	const struct path *path = &file->f_path;
-	/*
-	 * Do not use file_inode() here or anywhere in this file to get the
-	 * inode.  That would break *notity on overlayfs.
-	 */
-	struct inode *inode = path->dentry->d_inode;
+	struct inode *inode = file_inode(file);
 	__u32 fsnotify_mask = 0;
 	int ret;
 
@@ -178,7 +174,7 @@ static inline void fsnotify_mkdir(struct inode *inode, struct dentry *dentry)
 static inline void fsnotify_access(struct file *file)
 {
 	const struct path *path = &file->f_path;
-	struct inode *inode = path->dentry->d_inode;
+	struct inode *inode = file_inode(file);
 	__u32 mask = FS_ACCESS;
 
 	if (S_ISDIR(inode->i_mode))
@@ -196,7 +192,7 @@ static inline void fsnotify_access(struct file *file)
 static inline void fsnotify_modify(struct file *file)
 {
 	const struct path *path = &file->f_path;
-	struct inode *inode = path->dentry->d_inode;
+	struct inode *inode = file_inode(file);
 	__u32 mask = FS_MODIFY;
 
 	if (S_ISDIR(inode->i_mode))
@@ -214,7 +210,7 @@ static inline void fsnotify_modify(struct file *file)
 static inline void fsnotify_open(struct file *file)
 {
 	const struct path *path = &file->f_path;
-	struct inode *inode = path->dentry->d_inode;
+	struct inode *inode = file_inode(file);
 	__u32 mask = FS_OPEN;
 
 	if (S_ISDIR(inode->i_mode))
@@ -230,7 +226,7 @@ static inline void fsnotify_open(struct file *file)
 static inline void fsnotify_close(struct file *file)
 {
 	const struct path *path = &file->f_path;
-	struct inode *inode = path->dentry->d_inode;
+	struct inode *inode = file_inode(file);
 	fmode_t mode = file->f_mode;
 	__u32 mask = (mode & FMODE_WRITE) ? FS_CLOSE_WRITE : FS_CLOSE_NOWRITE;
 
-- 
2.14.3


* [PATCH v2 34/35] vfs: remove open_flags from d_real()
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (32 preceding siblings ...)
  2018-05-07  8:38 ` [PATCH v2 33/35] Revert "fsnotify: support overlayfs" Miklos Szeredi
@ 2018-05-07  8:38 ` Miklos Szeredi
  2018-05-07  8:38 ` [PATCH v2 35/35] ovl: fix documentation of non-standard behavior Miklos Szeredi
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:38 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

Opening regular files on overlayfs is now handled via ovl_open().  Remove
the now unused "open_flags" argument from d_op->d_real() and the d_real()
helper.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 Documentation/filesystems/Locking |  3 +--
 Documentation/filesystems/vfs.txt | 16 ++++------------
 fs/overlayfs/super.c              | 36 +++---------------------------------
 include/linux/dcache.h            | 11 ++++-------
 include/linux/fs.h                |  2 +-
 5 files changed, 13 insertions(+), 55 deletions(-)

diff --git a/Documentation/filesystems/Locking b/Documentation/filesystems/Locking
index a4afe96f0112..e1d7e43d302c 100644
--- a/Documentation/filesystems/Locking
+++ b/Documentation/filesystems/Locking
@@ -21,8 +21,7 @@ prototypes:
 	char *(*d_dname)((struct dentry *dentry, char *buffer, int buflen);
 	struct vfsmount *(*d_automount)(struct path *path);
 	int (*d_manage)(const struct path *, bool);
-	struct dentry *(*d_real)(struct dentry *, const struct inode *,
-				 unsigned int);
+	struct dentry *(*d_real)(struct dentry *, const struct inode *);
 
 locking rules:
 		rename_lock	->d_lock	may block	rcu-walk
diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index af54d3651ff8..8b03c5e675bf 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -990,8 +990,7 @@ struct dentry_operations {
 	char *(*d_dname)(struct dentry *, char *, int);
 	struct vfsmount *(*d_automount)(struct path *);
 	int (*d_manage)(const struct path *, bool);
-	struct dentry *(*d_real)(struct dentry *, const struct inode *,
-				 unsigned int);
+	struct dentry *(*d_real)(struct dentry *, const struct inode *);
 };
 
   d_revalidate: called when the VFS needs to revalidate a dentry. This
@@ -1125,22 +1124,15 @@ struct dentry_operations {
 	dentry being transited from.
 
   d_real: overlay/union type filesystems implement this method to return one of
-	the underlying dentries hidden by the overlay.  It is used in three
+	the underlying dentries hidden by the overlay.  It is used in two
 	different modes:
 
-	Called from open it may need to copy-up the file depending on the
-	supplied open flags.  This mode is selected with a non-zero flags
-	argument.  In this mode the d_real method can return an error.
-
 	Called from file_dentry() it returns the real dentry matching the inode
 	argument.  The real dentry may be from a lower layer already copied up,
 	but still referenced from the file.  This mode is selected with a
-	non-NULL inode argument.  This will always succeed.
-
-	With NULL inode and zero flags the topmost real underlying dentry is
-	returned.  This will always succeed.
+	non-NULL inode argument.
 
-	This method is never called with both non-NULL inode and non-zero flags.
+	With NULL inode the topmost real underlying dentry is returned.
 
 Each dentry has a pointer to its parent dentry, as well as a hash list
 of child dentries. Child dentries are basically like files in a
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
index f2a83fabe2eb..492d534058ae 100644
--- a/fs/overlayfs/super.c
+++ b/fs/overlayfs/super.c
@@ -80,28 +80,10 @@ static void ovl_dentry_release(struct dentry *dentry)
 	}
 }
 
-static int ovl_check_append_only(struct inode *inode, int flag)
-{
-	/*
-	 * This test was moot in vfs may_open() because overlay inode does
-	 * not have the S_APPEND flag, so re-check on real upper inode
-	 */
-	if (IS_APPEND(inode)) {
-		if  ((flag & O_ACCMODE) != O_RDONLY && !(flag & O_APPEND))
-			return -EPERM;
-		if (flag & O_TRUNC)
-			return -EPERM;
-	}
-
-	return 0;
-}
-
 static struct dentry *ovl_d_real(struct dentry *dentry,
-				 const struct inode *inode,
-				 unsigned int open_flags)
+				 const struct inode *inode)
 {
 	struct dentry *real;
-	int err;
 
 	/* It's an overlay file */
 	if (inode && d_inode(dentry) == inode)
@@ -113,28 +95,16 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
 		goto bug;
 	}
 
-	if (open_flags) {
-		err = ovl_open_maybe_copy_up(dentry, open_flags);
-		if (err)
-			return ERR_PTR(err);
-	}
-
 	real = ovl_dentry_upper(dentry);
-	if (real && (!inode || inode == d_inode(real))) {
-		if (!inode) {
-			err = ovl_check_append_only(d_inode(real), open_flags);
-			if (err)
-				return ERR_PTR(err);
-		}
+	if (real && (!inode || inode == d_inode(real)))
 		return real;
-	}
 
 	real = ovl_dentry_lower(dentry);
 	if (!real)
 		goto bug;
 
 	/* Handle recursion */
-	real = d_real(real, inode, open_flags);
+	real = d_real(real, inode);
 
 	if (!inode || inode == d_inode(real))
 		return real;
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index 58fcc66ddccd..2458b18b9356 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -145,8 +145,7 @@ struct dentry_operations {
 	char *(*d_dname)(struct dentry *, char *, int);
 	struct vfsmount *(*d_automount)(struct path *);
 	int (*d_manage)(const struct path *, bool);
-	struct dentry *(*d_real)(struct dentry *, const struct inode *,
-				 unsigned int);
+	struct dentry *(*d_real)(struct dentry *, const struct inode *);
 } ____cacheline_aligned;
 
 /*
@@ -567,7 +566,6 @@ static inline struct dentry *d_backing_dentry(struct dentry *upper)
  * d_real - Return the real dentry
  * @dentry: the dentry to query
  * @inode: inode to select the dentry from multiple layers (can be NULL)
- * @flags: open flags to control copy-up behavior
  *
  * If dentry is on a union/overlay, then return the underlying, real dentry.
  * Otherwise return the dentry itself.
@@ -575,11 +573,10 @@ static inline struct dentry *d_backing_dentry(struct dentry *upper)
  * See also: Documentation/filesystems/vfs.txt
  */
 static inline struct dentry *d_real(struct dentry *dentry,
-				    const struct inode *inode,
-				    unsigned int flags)
+				    const struct inode *inode)
 {
 	if (unlikely(dentry->d_flags & DCACHE_OP_REAL))
-		return dentry->d_op->d_real(dentry, inode, flags);
+		return dentry->d_op->d_real(dentry, inode);
 	else
 		return dentry;
 }
@@ -594,7 +591,7 @@ static inline struct dentry *d_real(struct dentry *dentry,
 static inline struct inode *d_real_inode(const struct dentry *dentry)
 {
 	/* This usage of d_real() results in const dentry */
-	return d_backing_inode(d_real((struct dentry *) dentry, NULL, 0));
+	return d_backing_inode(d_real((struct dentry *) dentry, NULL));
 }
 
 struct name_snapshot {
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 417e692a606a..8b8d793b3774 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1234,7 +1234,7 @@ static inline struct inode *file_inode(const struct file *f)
 
 static inline struct dentry *file_dentry(const struct file *file)
 {
-	return d_real(file->f_path.dentry, file_inode(file), 0);
+	return d_real(file->f_path.dentry, file_inode(file));
 }
 
 static inline int locks_lock_file_wait(struct file *filp, struct file_lock *fl)
-- 
2.14.3
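
Editorial sketch: the two remaining d_real() modes described in the
vfs.txt hunk above (non-NULL inode selects the matching real dentry, NULL
inode returns the topmost one) can be modelled in user space.  This is a
simplified model under stated assumptions: regular files only, an
is_overlay field standing in for DCACHE_OP_REAL, and the "bug" path
reduced to returning the overlay dentry.  All model_* names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct model_inode {
	int ino;
};

/* Hypothetical dentry; an overlay dentry has upper and/or lower set. */
struct model_dentry {
	struct model_inode *d_inode;
	struct model_dentry *upper;	/* real dentry on upper layer, or NULL */
	struct model_dentry *lower;	/* real dentry on lower layer, or NULL */
	int is_overlay;			/* stands in for DCACHE_OP_REAL */
};

static struct model_dentry *model_d_real(struct model_dentry *dentry,
					 const struct model_inode *inode);

/* Simplified ovl_d_real() with the two-argument signature. */
static struct model_dentry *model_ovl_d_real(struct model_dentry *dentry,
					     const struct model_inode *inode)
{
	struct model_dentry *real;

	/* It's an overlay file */
	if (inode && dentry->d_inode == inode)
		return dentry;

	real = dentry->upper;
	if (real && (!inode || inode == real->d_inode))
		return real;

	real = dentry->lower;
	if (!real)
		return dentry;	/* the kernel takes its "bug" path here */

	/* Handle recursion: the lower dentry may itself be an overlay */
	real = model_d_real(real, inode);
	if (!inode || inode == real->d_inode)
		return real;

	return dentry;		/* "bug" path again in the kernel */
}

/* Same shape as the d_real() helper: identity unless it is an overlay. */
static struct model_dentry *model_d_real(struct model_dentry *dentry,
					 const struct model_inode *inode)
{
	assert(dentry != NULL);
	if (dentry->is_overlay)
		return model_ovl_d_real(dentry, inode);
	return dentry;
}
```

The interesting case is a file that was opened on the lower layer and
copied up afterwards: passing the lower inode still finds the lower
dentry, while passing NULL returns the upper one.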


* [PATCH v2 35/35] ovl: fix documentation of non-standard behavior
  2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
                   ` (33 preceding siblings ...)
  2018-05-07  8:38 ` [PATCH v2 34/35] vfs: remove open_flags from d_real() Miklos Szeredi
@ 2018-05-07  8:38 ` Miklos Szeredi
  34 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07  8:38 UTC (permalink / raw)
  To: linux-unionfs; +Cc: linux-fsdevel, linux-kernel

We can now drop the description of the ro/rw inconsistency from the
documentation.

Also clarify that fully standards-compliant behavior can now be enabled
with kernel/module/mount options.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 Documentation/filesystems/overlayfs.txt | 60 +++++++++++++++++++++------------
 1 file changed, 39 insertions(+), 21 deletions(-)

diff --git a/Documentation/filesystems/overlayfs.txt b/Documentation/filesystems/overlayfs.txt
index 961b287ef323..97eae826adf9 100644
--- a/Documentation/filesystems/overlayfs.txt
+++ b/Documentation/filesystems/overlayfs.txt
@@ -10,10 +10,6 @@ union-filesystems).  An overlay-filesystem tries to present a
 filesystem which is the result over overlaying one filesystem on top
 of the other.
 
-The result will inevitably fail to look exactly like a normal
-filesystem for various technical reasons.  The expectation is that
-many use cases will be able to ignore these differences.
-
 
 Overlay objects
 ---------------
@@ -306,27 +302,49 @@ the copied layers will fail the verification of the lower root file handle.
 Non-standard behavior
 ---------------------
 
-The copy_up operation essentially creates a new, identical file and
-moves it over to the old name.  Any open files referring to this inode
-will access the old data.
+Overlayfs can now act as a POSIX compliant filesystem with the following
+features turned on:
+
+1) "redirect_dir"
+
+Enabled with the mount option or module option: "redirect_dir=on" or with
+the kernel config option CONFIG_OVERLAY_FS_REDIRECT_DIR=y.
+
+If this feature is disabled, then rename(2) on a lower or merged directory
+will fail with EXDEV ("Invalid cross-device link").
+
+2) "inode index"
+
+Enabled with the mount option or module option "index=on" or with the
+kernel config option CONFIG_OVERLAY_FS_INDEX=y.
+
+If this feature is disabled and a file with multiple hard links is copied
+up, then this will "break" the link.  Changes will not be propagated to
+other names referring to the same inode.
+
+3) "xino"
+
+Enabled with the mount option "xino=auto" or "xino=on", with the module
+option "xino_auto=on" or with the kernel config option
+CONFIG_OVERLAY_FS_XINO_AUTO=y.  Also implicitly enabled by using the same
+underlying filesystem for all layers making up the overlay.
 
-The new file may be on a different filesystem, so both st_dev and st_ino
-of the real file may change.  The values of st_dev and st_ino returned by
-stat(2) on an overlay object are often not the same as the real file
-stat(2) values to prevent the values from changing on copy_up.
+If this feature is disabled or the underlying filesystem doesn't have
+enough free bits in the inode number, then overlayfs will not be able to
+guarantee that the values of st_ino and st_dev returned by stat(2) and the
+value of d_ino returned by readdir(3) will act like on a normal filesystem.
+E.g. the value of st_dev may be different for two objects in the same
+overlay filesystem and the value of st_ino for directory objects may not be
+persistent and could change even while the overlay filesystem is mounted.
 
-Unless "xino" feature is enabled, when overlay layers are not all on the
-same underlying filesystem, the value of st_dev may be different for two
-non-directory objects in the same overlay filesystem and the value of
-st_ino for directory objects may be non persistent and could change even
-while the overlay filesystem is still mounted.
+4) "copy_up_shared"
 
-Unless "inode index" feature is enabled, if a file with multiple hard
-links is copied up, then this will "break" the link.  Changes will not be
-propagated to other names referring to the same inode.
+Enabled with the mount option or module option "copy_up_shared=on" or with
+the kernel config option CONFIG_OVERLAY_FS_COPY_UP_SHARED=y.
 
-Unless "redirect_dir" feature is enabled, rename(2) on a lower or merged
-directory will fail with EXDEV.
+If this feature is disabled, then a memory mapping created with MAP_SHARED
+might contain stale data if the file has been copied up and modified in the
+meantime.
 
 
 Changes to underlying filesystems
-- 
2.14.3
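
Editorial sketch: the four features listed in the documentation hunk above
can be requested together through mount options.  The helper below only
builds the option string; ovl_posix_options() is a hypothetical name, the
directory arguments are placeholders supplied by the caller, and
"copy_up_shared" is the option proposed in this series, so it is assumed
to exist only on a kernel with these patches applied.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build an overlayfs option string with the four documented features
 * switched on.  Returns 0 on success, -1 if buf is too small.
 */
static int ovl_posix_options(char *buf, size_t len, const char *lower,
			     const char *upper, const char *work)
{
	int n;

	assert(buf != NULL && len > 0);
	n = snprintf(buf, len,
		     "lowerdir=%s,upperdir=%s,workdir=%s,"
		     "redirect_dir=on,index=on,xino=on,copy_up_shared=on",
		     lower, upper, work);

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

The resulting string would be passed as the data argument of mount(2),
with "overlay" as the filesystem type.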


* Re: [PATCH v2 22/35] vfs: don't open real
  2018-05-07  8:37 ` [PATCH v2 22/35] vfs: don't open real Miklos Szeredi
@ 2018-05-07 10:27   ` Amir Goldstein
  2018-05-07 10:29     ` Miklos Szeredi
  2018-05-11 18:54   ` Vivek Goyal
  1 sibling, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2018-05-07 10:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs, linux-fsdevel, linux-kernel, Al Viro

On Mon, May 7, 2018 at 11:37 AM, Miklos Szeredi <mszeredi@redhat.com> wrote:
> Let overlayfs do its thing when opening a file.
>
> This enables stacking and fixes the corner case when a file is opened for
> read, modified through a writable open, and data is read from the read-only
> file.  After this patch the read-only open will not return stale data even
> in this case.
>

So now you can get rid of ovl_do_check_copy_up() and the check_copy_up
module param ;-)

Thanks,
Amir.


* Re: [PATCH v2 22/35] vfs: don't open real
  2018-05-07 10:27   ` Amir Goldstein
@ 2018-05-07 10:29     ` Miklos Szeredi
  0 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-07 10:29 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: overlayfs, linux-fsdevel, linux-kernel, Al Viro

On Mon, May 7, 2018 at 12:27 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, May 7, 2018 at 11:37 AM, Miklos Szeredi <mszeredi@redhat.com> wrote:
>> Let overlayfs do its thing when opening a file.
>>
>> This enables stacking and fixes the corner case when a file is opened for
>> read, modified through a writable open, and data is read from the read-only
>> file.  After this patch the read-only open will not return stale data even
>> in this case.
>>
>
> So now you can get rid of ovl_do_check_copy_up() and the check_copy_up
> module param ;-)

Ah, forgot about that one.   Indeed.

Thanks,
Miklos


* Re: [PATCH v2 10/35] ovl: deal with overlay files in ovl_d_real()
  2018-05-07  8:37 ` [PATCH v2 10/35] ovl: deal with overlay files in ovl_d_real() Miklos Szeredi
@ 2018-05-07 13:17   ` Vivek Goyal
  0 siblings, 0 replies; 53+ messages in thread
From: Vivek Goyal @ 2018-05-07 13:17 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs, linux-fsdevel, linux-kernel

On Mon, May 07, 2018 at 10:37:42AM +0200, Miklos Szeredi wrote:
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/overlayfs/super.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index e8551c97de51..ad6a5baf226b 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -97,6 +97,10 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
>  	struct dentry *real;
>  	int err;
>  
> +	/* It's an overlay file */
> +	if (inode && d_inode(dentry) == inode)
> +		return dentry;
> +

Hi Miklos,

The inode == d_inode(dentry) check is done again in the following code.  We
can probably get rid of that second check now.

        if (!d_is_reg(dentry)) {
                if (!inode || inode == d_inode(dentry))
                        return dentry;
                goto bug;
        }

Vivek

>  	if (flags & D_REAL_UPPER)
>  		return ovl_dentry_upper(dentry);
>  
> -- 
> 2.14.3
> 


* Re: [PATCH v2 23/35] ovl: copy-up on MAP_SHARED
  2018-05-07  8:37 ` [PATCH v2 23/35] ovl: copy-up on MAP_SHARED Miklos Szeredi
@ 2018-05-07 19:28   ` Randy Dunlap
  2018-05-08 15:03     ` Miklos Szeredi
  0 siblings, 1 reply; 53+ messages in thread
From: Randy Dunlap @ 2018-05-07 19:28 UTC (permalink / raw)
  To: Miklos Szeredi, linux-unionfs; +Cc: linux-fsdevel, linux-kernel

On 05/07/2018 01:37 AM, Miklos Szeredi wrote:
> A corner case of a corner case is when
> 
>  - file opened for O_RDONLY
>  - which is then memory mapped SHARED
>  - file opened for O_WRONLY
>  - contents modified
>  - contents read back though the shared mapping
> 
> Unfortunately it looks very difficult to do anything about the established
> shared map after the file is copied up.
> 
> Instead, when a read-only file is mapped shared, copy up the file before
> actually doing the map.  This may result in unnecessary copy-ups (but so
> may copy-up on open(O_RDWR) for exampe).

                              for example).

> 
> We can revisit this later if it turns out to be a performance problem in
> real life.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/overlayfs/Kconfig     | 21 +++++++++++++++++++++
>  fs/overlayfs/file.c      | 22 ++++++++++++++++++++++
>  fs/overlayfs/overlayfs.h |  7 +++++++
>  fs/overlayfs/ovl_entry.h |  1 +
>  fs/overlayfs/super.c     | 22 ++++++++++++++++++++++
>  5 files changed, 73 insertions(+)
> 
> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> index 17032631c5cf..991c0a5a0e00 100644
> --- a/fs/overlayfs/Kconfig
> +++ b/fs/overlayfs/Kconfig
> @@ -103,3 +103,24 @@ config OVERLAY_FS_XINO_AUTO
>  	  For more information, see Documentation/filesystems/overlayfs.txt
>  
>  	  If unsure, say N.
> +
> +config OVERLAY_FS_COPY_UP_SHARED
> +       bool "Overlayfs: copy up when mapping a file shared"

	                                        a shared file" ??

> +       default n
> +       depends on OVERLAY_FS
> +       help
> +         If this option is enabled then on mapping a file with MAP_SHARED
> +	 overlayfs copies up the file in anticipation of it being modified (just
> +	 like we copy up the file on O_WRONLY and O_RDWR in anticipation of
> +	 modification).  This does not interfere with shared library loading, as
> +	 that uses MAP_PRIVATE.  But there might be use cases out there where
> +	 this impacts performance and disk usage.
> +
> +	 This just selects the default, the feature can also be enabled or
> +	 disabled in the running kernel or individually on each overlay mount.
> +
> +	 To get maximally standard compliant behavior, enable this option.
> +
> +	 To get a maximally backward compatible kernel, disable this option.
> +
> +	 If unsure, say N.

For Kconfig (coding-style.rst):
Lines under a ``config`` definition are indented with one tab, while help text
is indented an additional two spaces.


-- 
~Randy

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 21/35] ovl: add reflink/copyfile/dedup support
  2018-05-07  8:37 ` [PATCH v2 21/35] ovl: add reflink/copyfile/dedup support Miklos Szeredi
@ 2018-05-07 20:43   ` Darrick J. Wong
  2018-05-08 14:13     ` Miklos Szeredi
  0 siblings, 1 reply; 53+ messages in thread
From: Darrick J. Wong @ 2018-05-07 20:43 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs, linux-fsdevel, linux-kernel

On Mon, May 07, 2018 at 10:37:53AM +0200, Miklos Szeredi wrote:
> Since set of arguments are so similar, handle in a common helper.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/overlayfs/file.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 88 insertions(+)
> 
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index ce871a15e185..2ac95c95e8e6 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -382,6 +382,90 @@ static long ovl_compat_ioctl(struct file *file, unsigned int cmd,
>  	return ovl_ioctl(file, cmd, arg);
>  }
>  
> +enum ovl_copyop {
> +	OVL_COPY,
> +	OVL_CLONE,
> +	OVL_DEDUPE,
> +};
> +
> +static s64 ovl_copyfile(struct file *file_in, loff_t pos_in,
> +			struct file *file_out, loff_t pos_out,
> +			u64 len, unsigned int flags, enum ovl_copyop op)
> +{
> +	struct inode *inode_out = file_inode(file_out);
> +	struct fd real_in, real_out;
> +	const struct cred *old_cred;
> +	s64 ret;
> +
> +	ret = ovl_real_fdget(file_out, &real_out);
> +	if (ret)
> +		return ret;
> +
> +	ret = ovl_real_fdget(file_in, &real_in);
> +	if (ret) {
> +		fdput(real_out);
> +		return ret;
> +	}
> +
> +	old_cred = ovl_override_creds(file_inode(file_out)->i_sb);
> +	switch (op) {
> +	case OVL_COPY:
> +		ret = vfs_copy_file_range(real_in.file, pos_in,
> +					  real_out.file, pos_out, len, flags);
> +		break;
> +
> +	case OVL_CLONE:
> +		ret = vfs_clone_file_range(real_in.file, pos_in,
> +					   real_out.file, pos_out, len);
> +		break;
> +
> +	case OVL_DEDUPE:
> +		ret = vfs_dedupe_file_range_one(real_in.file, pos_in,
> +						real_out.file, pos_out, len);
> +		break;
> +	}
> +	revert_creds(old_cred);
> +
> +	/* Update size */
> +	ovl_copyattr(ovl_inode_real(inode_out), inode_out);
> +
> +	fdput(real_in);
> +	fdput(real_out);
> +
> +	return ret;
> +}
> +
> +static ssize_t ovl_copy_file_range(struct file *file_in, loff_t pos_in,
> +				   struct file *file_out, loff_t pos_out,
> +				   size_t len, unsigned int flags)
> +{
> +	return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, flags,
> +			    OVL_COPY);
> +}
> +
> +static int ovl_clone_file_range(struct file *file_in, loff_t pos_in,
> +				struct file *file_out, loff_t pos_out, u64 len)
> +{
> +	return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
> +			    OVL_CLONE);
> +}
> +
> +static s64 ovl_dedupe_file_range(struct file *file_in, loff_t pos_in,
> +				 struct file *file_out, loff_t pos_out,
> +				 u64 len)
> +{
> +	/*
> +	 * Don't copy up because of a dedupe request, this wouldn't make sense
> +	 * most of the time (data would be duplicated instead of deduplicated).
> +	 */
> +	if (!ovl_inode_upper(file_inode(file_in)) ||
> +	    !ovl_inode_upper(file_inode(file_out)))
> +		return -EPERM;

/me wonders, why not EOPNOTSUPP?  That's what we've been using (in xfs
anyway) for "filesystem doesn't want to let you do this".

(Or I guess EXDEV, but "cross-device link not supported" might not be
quite what you want users to see...)

--D

> +
> +	return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
> +			    OVL_DEDUPE);
> +}
> +
>  const struct file_operations ovl_file_operations = {
>  	.open		= ovl_open,
>  	.release	= ovl_release,
> @@ -393,4 +477,8 @@ const struct file_operations ovl_file_operations = {
>  	.fallocate	= ovl_fallocate,
>  	.unlocked_ioctl	= ovl_ioctl,
>  	.compat_ioctl	= ovl_compat_ioctl,
> +
> +	.copy_file_range	= ovl_copy_file_range,
> +	.clone_file_range	= ovl_clone_file_range,
> +	.dedupe_file_range	= ovl_dedupe_file_range,
>  };
> -- 
> 2.14.3
> 

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 15/35] ovl: add ovl_fsync()
  2018-05-07  8:37 ` [PATCH v2 15/35] ovl: add ovl_fsync() Miklos Szeredi
@ 2018-05-08  5:14   ` Amir Goldstein
  2018-05-08 14:57     ` Miklos Szeredi
  0 siblings, 1 reply; 53+ messages in thread
From: Amir Goldstein @ 2018-05-08  5:14 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: overlayfs, linux-fsdevel, linux-kernel

On Mon, May 7, 2018 at 11:37 AM, Miklos Szeredi <mszeredi@redhat.com> wrote:
> Implement stacked fsync().
>
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/overlayfs/file.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
>
> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> index a7af56861aa5..419aa3f9967b 100644
> --- a/fs/overlayfs/file.c
> +++ b/fs/overlayfs/file.c
> @@ -233,10 +233,30 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
>         return ret;
>  }
>
> +static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
> +{
> +       struct fd real;
> +       const struct cred *old_cred;
> +       int ret;
> +

Don't sync non-upper. same as ovl_dir_fsync()

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 21/35] ovl: add reflink/copyfile/dedup support
  2018-05-07 20:43   ` Darrick J. Wong
@ 2018-05-08 14:13     ` Miklos Szeredi
  2018-05-08 14:38       ` Darrick J. Wong
  0 siblings, 1 reply; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-08 14:13 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Mon, May 7, 2018 at 10:43 PM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> On Mon, May 07, 2018 at 10:37:53AM +0200, Miklos Szeredi wrote:
>> Since set of arguments are so similar, handle in a common helper.
>>
>> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>> ---
>>  fs/overlayfs/file.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
>>  1 file changed, 88 insertions(+)
>>
>> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
>> index ce871a15e185..2ac95c95e8e6 100644
>> --- a/fs/overlayfs/file.c
>> +++ b/fs/overlayfs/file.c
>> @@ -382,6 +382,90 @@ static long ovl_compat_ioctl(struct file *file, unsigned int cmd,
>>       return ovl_ioctl(file, cmd, arg);
>>  }
>>
>> +enum ovl_copyop {
>> +     OVL_COPY,
>> +     OVL_CLONE,
>> +     OVL_DEDUPE,
>> +};
>> +
>> +static s64 ovl_copyfile(struct file *file_in, loff_t pos_in,
>> +                     struct file *file_out, loff_t pos_out,
>> +                     u64 len, unsigned int flags, enum ovl_copyop op)
>> +{
>> +     struct inode *inode_out = file_inode(file_out);
>> +     struct fd real_in, real_out;
>> +     const struct cred *old_cred;
>> +     s64 ret;
>> +
>> +     ret = ovl_real_fdget(file_out, &real_out);
>> +     if (ret)
>> +             return ret;
>> +
>> +     ret = ovl_real_fdget(file_in, &real_in);
>> +     if (ret) {
>> +             fdput(real_out);
>> +             return ret;
>> +     }
>> +
>> +     old_cred = ovl_override_creds(file_inode(file_out)->i_sb);
>> +     switch (op) {
>> +     case OVL_COPY:
>> +             ret = vfs_copy_file_range(real_in.file, pos_in,
>> +                                       real_out.file, pos_out, len, flags);
>> +             break;
>> +
>> +     case OVL_CLONE:
>> +             ret = vfs_clone_file_range(real_in.file, pos_in,
>> +                                        real_out.file, pos_out, len);
>> +             break;
>> +
>> +     case OVL_DEDUPE:
>> +             ret = vfs_dedupe_file_range_one(real_in.file, pos_in,
>> +                                             real_out.file, pos_out, len);
>> +             break;
>> +     }
>> +     revert_creds(old_cred);
>> +
>> +     /* Update size */
>> +     ovl_copyattr(ovl_inode_real(inode_out), inode_out);
>> +
>> +     fdput(real_in);
>> +     fdput(real_out);
>> +
>> +     return ret;
>> +}
>> +
>> +static ssize_t ovl_copy_file_range(struct file *file_in, loff_t pos_in,
>> +                                struct file *file_out, loff_t pos_out,
>> +                                size_t len, unsigned int flags)
>> +{
>> +     return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, flags,
>> +                         OVL_COPY);
>> +}
>> +
>> +static int ovl_clone_file_range(struct file *file_in, loff_t pos_in,
>> +                             struct file *file_out, loff_t pos_out, u64 len)
>> +{
>> +     return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
>> +                         OVL_CLONE);
>> +}
>> +
>> +static s64 ovl_dedupe_file_range(struct file *file_in, loff_t pos_in,
>> +                              struct file *file_out, loff_t pos_out,
>> +                              u64 len)
>> +{
>> +     /*
>> +      * Don't copy up because of a dedupe request, this wouldn't make sense
>> +      * most of the time (data would be duplicated instead of deduplicated).
>> +      */
>> +     if (!ovl_inode_upper(file_inode(file_in)) ||
>> +         !ovl_inode_upper(file_inode(file_out)))
>> +             return -EPERM;
>
> /me wonders, why not EOPNOTSUPP?  That's what we've been using (in xfs
> anyway) for "filesystem doesn't want to let you do this".

EOPNOTSUPP might be interpreted as "this filesystem doesn't support
dedupe", even though here it's just "these two particular files don't
support dedupe".

>
> (Or I guess EXDEV, but "cross-device link not supported" might not be
> quite what you want users to see...)

Hmm, I like EPERM better.  EPERM means something like "you can't do
this for some unspecified reason".  This is exactly the case here.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 21/35] ovl: add reflink/copyfile/dedup support
  2018-05-08 14:13     ` Miklos Szeredi
@ 2018-05-08 14:38       ` Darrick J. Wong
  0 siblings, 0 replies; 53+ messages in thread
From: Darrick J. Wong @ 2018-05-08 14:38 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Tue, May 08, 2018 at 04:13:01PM +0200, Miklos Szeredi wrote:
> On Mon, May 7, 2018 at 10:43 PM, Darrick J. Wong
> <darrick.wong@oracle.com> wrote:
> > On Mon, May 07, 2018 at 10:37:53AM +0200, Miklos Szeredi wrote:
> >> Since set of arguments are so similar, handle in a common helper.
> >>
> >> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> >> ---
> >>  fs/overlayfs/file.c | 88 +++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>  1 file changed, 88 insertions(+)
> >>
> >> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
> >> index ce871a15e185..2ac95c95e8e6 100644
> >> --- a/fs/overlayfs/file.c
> >> +++ b/fs/overlayfs/file.c
> >> @@ -382,6 +382,90 @@ static long ovl_compat_ioctl(struct file *file, unsigned int cmd,
> >>       return ovl_ioctl(file, cmd, arg);
> >>  }
> >>
> >> +enum ovl_copyop {
> >> +     OVL_COPY,
> >> +     OVL_CLONE,
> >> +     OVL_DEDUPE,
> >> +};
> >> +
> >> +static s64 ovl_copyfile(struct file *file_in, loff_t pos_in,
> >> +                     struct file *file_out, loff_t pos_out,
> >> +                     u64 len, unsigned int flags, enum ovl_copyop op)
> >> +{
> >> +     struct inode *inode_out = file_inode(file_out);
> >> +     struct fd real_in, real_out;
> >> +     const struct cred *old_cred;
> >> +     s64 ret;
> >> +
> >> +     ret = ovl_real_fdget(file_out, &real_out);
> >> +     if (ret)
> >> +             return ret;
> >> +
> >> +     ret = ovl_real_fdget(file_in, &real_in);
> >> +     if (ret) {
> >> +             fdput(real_out);
> >> +             return ret;
> >> +     }
> >> +
> >> +     old_cred = ovl_override_creds(file_inode(file_out)->i_sb);
> >> +     switch (op) {
> >> +     case OVL_COPY:
> >> +             ret = vfs_copy_file_range(real_in.file, pos_in,
> >> +                                       real_out.file, pos_out, len, flags);
> >> +             break;
> >> +
> >> +     case OVL_CLONE:
> >> +             ret = vfs_clone_file_range(real_in.file, pos_in,
> >> +                                        real_out.file, pos_out, len);
> >> +             break;
> >> +
> >> +     case OVL_DEDUPE:
> >> +             ret = vfs_dedupe_file_range_one(real_in.file, pos_in,
> >> +                                             real_out.file, pos_out, len);
> >> +             break;
> >> +     }
> >> +     revert_creds(old_cred);
> >> +
> >> +     /* Update size */
> >> +     ovl_copyattr(ovl_inode_real(inode_out), inode_out);
> >> +
> >> +     fdput(real_in);
> >> +     fdput(real_out);
> >> +
> >> +     return ret;
> >> +}
> >> +
> >> +static ssize_t ovl_copy_file_range(struct file *file_in, loff_t pos_in,
> >> +                                struct file *file_out, loff_t pos_out,
> >> +                                size_t len, unsigned int flags)
> >> +{
> >> +     return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, flags,
> >> +                         OVL_COPY);
> >> +}
> >> +
> >> +static int ovl_clone_file_range(struct file *file_in, loff_t pos_in,
> >> +                             struct file *file_out, loff_t pos_out, u64 len)
> >> +{
> >> +     return ovl_copyfile(file_in, pos_in, file_out, pos_out, len, 0,
> >> +                         OVL_CLONE);
> >> +}
> >> +
> >> +static s64 ovl_dedupe_file_range(struct file *file_in, loff_t pos_in,
> >> +                              struct file *file_out, loff_t pos_out,
> >> +                              u64 len)
> >> +{
> >> +     /*
> >> +      * Don't copy up because of a dedupe request, this wouldn't make sense
> >> +      * most of the time (data would be duplicated instead of deduplicated).
> >> +      */
> >> +     if (!ovl_inode_upper(file_inode(file_in)) ||
> >> +         !ovl_inode_upper(file_inode(file_out)))
> >> +             return -EPERM;
> >
> > /me wonders, why not EOPNOTSUPP?  That's what we've been using (in xfs
> > anyway) for "filesystem doesn't want to let you do this".
> 
> EOPNOTSUPP might be interpreted as "this filesystem doesn't support
> dedupe", even though here it's just "these two particular files don't
> support dedupe".

ocfs2 already uses EOPNOTSUPP for 'these two particular files don't
support dedupe/reflink'.  Granted, the manpage would seem to leave open
the possibility of using EOPNOTSUPP or EINVAL for the "can't do it to
these two files" case.
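
Since the choice is still open, a userspace dedupe tool would probably want
to treat all of the candidates the same way.  A minimal sketch, assuming the
caller has a positive errno value in hand; dedupe_pair_unsupported() is a
hypothetical helper name, not an existing API:

```c
#include <errno.h>

/*
 * Returns 1 if err is one of the errno values discussed in this thread
 * for "this pair of files cannot be deduped", 0 for anything else.
 * Treating all four alike is an assumption about caller policy.
 */
int dedupe_pair_unsupported(int err)
{
	switch (err) {
	case EPERM:		/* what the patch currently returns */
	case EOPNOTSUPP:	/* what xfs and ocfs2 use */
	case EINVAL:		/* left open by the manpage */
	case EXDEV:		/* the cross-device alternative */
		return 1;
	default:
		return 0;
	}
}
```

A caller written this way would skip the pair and move on regardless of
which value overlayfs ends up returning.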

--D

> >
> > (Or I guess EXDEV, but "cross-device link not supported" might not be
> > quite what you want users to see...)
> 
> Hmm, I like EPERM better.   EPERM means something like  "you can't do
> this for some unspecified reason".  This is exactly the case here.
> 
> Thanks,
> Miklos

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 15/35] ovl: add ovl_fsync()
  2018-05-08  5:14   ` Amir Goldstein
@ 2018-05-08 14:57     ` Miklos Szeredi
  2018-05-08 15:02       ` Amir Goldstein
  0 siblings, 1 reply; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-08 14:57 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Tue, May 8, 2018 at 7:14 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Mon, May 7, 2018 at 11:37 AM, Miklos Szeredi <mszeredi@redhat.com> wrote:
>> Implement stacked fsync().
>>
>> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>> ---
>>  fs/overlayfs/file.c | 20 ++++++++++++++++++++
>>  1 file changed, 20 insertions(+)
>>
>> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
>> index a7af56861aa5..419aa3f9967b 100644
>> --- a/fs/overlayfs/file.c
>> +++ b/fs/overlayfs/file.c
>> @@ -233,10 +233,30 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
>>         return ret;
>>  }
>>
>> +static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
>> +{
>> +       struct fd real;
>> +       const struct cred *old_cred;
>> +       int ret;
>> +
>
> Don't sync non-upper. same as ovl_dir_fsync()

Ah, that was about EROFS returned by lower fsync, right?

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 15/35] ovl: add ovl_fsync()
  2018-05-08 14:57     ` Miklos Szeredi
@ 2018-05-08 15:02       ` Amir Goldstein
  0 siblings, 0 replies; 53+ messages in thread
From: Amir Goldstein @ 2018-05-08 15:02 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Tue, May 8, 2018 at 5:57 PM, Miklos Szeredi <miklos@szeredi.hu> wrote:
> On Tue, May 8, 2018 at 7:14 AM, Amir Goldstein <amir73il@gmail.com> wrote:
>> On Mon, May 7, 2018 at 11:37 AM, Miklos Szeredi <mszeredi@redhat.com> wrote:
>>> Implement stacked fsync().
>>>
>>> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>>> ---
>>>  fs/overlayfs/file.c | 20 ++++++++++++++++++++
>>>  1 file changed, 20 insertions(+)
>>>
>>> diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
>>> index a7af56861aa5..419aa3f9967b 100644
>>> --- a/fs/overlayfs/file.c
>>> +++ b/fs/overlayfs/file.c
>>> @@ -233,10 +233,30 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
>>>         return ret;
>>>  }
>>>
>>> +static int ovl_fsync(struct file *file, loff_t start, loff_t end, int datasync)
>>> +{
>>> +       struct fd real;
>>> +       const struct cred *old_cred;
>>> +       int ret;
>>> +
>>
>> Don't sync non-upper. same as ovl_dir_fsync()
>
> Ah, that was about EROFS returned by lower fsync, right?
>

Yep, no reason to try to sync a lower file.
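
Roughly, the behaviour being asked for is sketched below as a userspace
model.  struct model_ovl_file and its fields are hypothetical, and the
callback stands in for whatever the real fsync on the underlying file
would return; this is not the actual ovl_fsync() body.

```c
#include <errno.h>

/* Hypothetical model of an overlay file and its real file (assumption). */
struct model_ovl_file {
	int is_upper;			/* real file lives on the upper layer */
	int (*sync_real)(void);		/* fsync of the real file */
};

/* A read-only lower filesystem may refuse to sync. */
int lower_sync_erofs(void)
{
	return -EROFS;
}

int upper_sync_ok(void)
{
	return 0;
}

/* Skip the sync for non-upper files, so a lower EROFS never surfaces. */
int model_ovl_fsync(const struct model_ovl_file *f)
{
	if (!f->is_upper)
		return 0;
	return f->sync_real();
}
```

A lower file is never written through the overlay, so skipping the sync
loses nothing in this model.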

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 23/35] ovl: copy-up on MAP_SHARED
  2018-05-07 19:28   ` Randy Dunlap
@ 2018-05-08 15:03     ` Miklos Szeredi
  0 siblings, 0 replies; 53+ messages in thread
From: Miklos Szeredi @ 2018-05-08 15:03 UTC (permalink / raw)
  To: Randy Dunlap; +Cc: Miklos Szeredi, overlayfs, linux-fsdevel, linux-kernel

On Mon, May 7, 2018 at 9:28 PM, Randy Dunlap <rdunlap@infradead.org> wrote:
> On 05/07/2018 01:37 AM, Miklos Szeredi wrote:
>> A corner case of a corner case is when
>>
>>  - file opened for O_RDONLY
>>  - which is then memory mapped SHARED
>>  - file opened for O_WRONLY
>>  - contents modified
>>  - contents read back though the shared mapping
>>
>> Unfortunately it looks very difficult to do anything about the established
>> shared map after the file is copied up.
>>
>> Instead, when a read-only file is mapped shared, copy up the file before
>> actually doing the map.  This may result in unnecessary copy-ups (but so
>> may copy-up on open(O_RDWR) for exampe).
>
>                               for example).
>
>>
>> We can revisit this later if it turns out to be a performance problem in
>> real life.
>>
>> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
>> ---
>>  fs/overlayfs/Kconfig     | 21 +++++++++++++++++++++
>>  fs/overlayfs/file.c      | 22 ++++++++++++++++++++++
>>  fs/overlayfs/overlayfs.h |  7 +++++++
>>  fs/overlayfs/ovl_entry.h |  1 +
>>  fs/overlayfs/super.c     | 22 ++++++++++++++++++++++
>>  5 files changed, 73 insertions(+)
>>
>> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
>> index 17032631c5cf..991c0a5a0e00 100644
>> --- a/fs/overlayfs/Kconfig
>> +++ b/fs/overlayfs/Kconfig
>> @@ -103,3 +103,24 @@ config OVERLAY_FS_XINO_AUTO
>>         For more information, see Documentation/filesystems/overlayfs.txt
>>
>>         If unsure, say N.
>> +
>> +config OVERLAY_FS_COPY_UP_SHARED
>> +       bool "Overlayfs: copy up when mapping a file shared"
>
>                                                 a shared file" ??

This is referring to MAP_SHARED flag of mmap().  So it's the mapping
that is shared, not the file.
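
For reference, the sequence from the commit message can be reproduced from
userspace along these lines.  This is a sketch only: the helper names and
the four-byte test contents are made up for illustration, and the file has
to be placed on the filesystem under test by the caller.

```c
#define _POSIX_C_SOURCE 200809L

#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Create a file from a mkstemp() template holding the bytes "old!". */
int make_test_file(char *path_template)
{
	int fd = mkstemp(path_template);

	if (fd < 0)
		return -1;
	if (write(fd, "old!", 4) != 4) {
		close(fd);
		return -1;
	}
	close(fd);
	return 0;
}

/*
 * Open O_RDONLY, map MAP_SHARED, modify through a separate O_WRONLY fd,
 * then read back through the mapping.
 * Returns 1 if the mapping shows the new data, 0 if it is stale,
 * -1 if any setup step failed.
 */
int shared_map_sees_write(const char *path)
{
	int result = -1;
	int rfd, wfd;
	char *map;

	rfd = open(path, O_RDONLY);
	if (rfd < 0)
		return -1;

	map = mmap(NULL, 4, PROT_READ, MAP_SHARED, rfd, 0);
	if (map == MAP_FAILED) {
		close(rfd);
		return -1;
	}

	wfd = open(path, O_WRONLY);
	if (wfd >= 0) {
		if (pwrite(wfd, "new!", 4, 0) == 4)
			result = memcmp(map, "new!", 4) == 0;
		close(wfd);
	}

	munmap(map, 4);
	close(rfd);
	return result;
}
```

To exercise the corner case itself, the file would have to exist only on a
lower layer of an overlay mount before the first open().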

>
>> +       default n
>> +       depends on OVERLAY_FS
>> +       help
>> +         If this option is enabled then on mapping a file with MAP_SHARED
>> +      overlayfs copies up the file in anticipation of it being modified (just
>> +      like we copy up the file on O_WRONLY and O_RDWR in anticipation of
>> +      modification).  This does not interfere with shared library loading, as
>> +      that uses MAP_PRIVATE.  But there might be use cases out there where
>> +      this impacts performance and disk usage.
>> +
>> +      This just selects the default, the feature can also be enabled or
>> +      disabled in the running kernel or individually on each overlay mount.
>> +
>> +      To get maximally standard compliant behavior, enable this option.
>> +
>> +      To get a maximally backward compatible kernel, disable this option.
>> +
>> +      If unsure, say N.
>
> For Kconfig (coding-style.rst):
> Lines under a ``config`` definition are indented with one tab, while help text
> is indented an additional two spaces.

Not sure what went wrong there.  Fixed now.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 32/35] Partially revert "locks: fix file locking on overlayfs"
  2018-05-07  8:38 ` [PATCH v2 32/35] Partially revert "locks: fix file locking on overlayfs" Miklos Szeredi
@ 2018-05-08 15:15   ` Jeff Layton
  0 siblings, 0 replies; 53+ messages in thread
From: Jeff Layton @ 2018-05-08 15:15 UTC (permalink / raw)
  To: Miklos Szeredi, linux-unionfs; +Cc: linux-fsdevel, linux-kernel, Al Viro

On Mon, 2018-05-07 at 10:38 +0200, Miklos Szeredi wrote:
> This partially reverts commit c568d68341be7030f5647def68851e469b21ca11.
> 
> Overlayfs files will now automatically get the correct locks, no need to
> hack overlay support in VFS.
> 
> It is a partial revert, because it leaves the locks_inode() calls in place
> and defines locks_inode() to file_inode().  We could revert those as well,
> but it would be unnecessary code churn and it makes sense to document that
> we are getting the inode for locking purposes.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/locks.c              | 17 ++++++-----------
>  fs/overlayfs/super.c    |  2 +-
>  include/linux/fs.h      | 13 +------------
>  include/uapi/linux/fs.h |  1 -
>  4 files changed, 8 insertions(+), 25 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 9c0e5f3da66c..40bcbaaa3f52 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -139,11 +139,6 @@
>  #define IS_OFDLCK(fl)	(fl->fl_flags & FL_OFDLCK)
>  #define IS_REMOTELCK(fl)	(fl->fl_pid <= 0)
>  
> -static inline bool is_remote_lock(struct file *filp)
> -{
> -	return likely(!(filp->f_path.dentry->d_sb->s_flags & SB_NOREMOTELOCK));
> -}
> -
>  static bool lease_breaking(struct file_lock *fl)
>  {
>  	return fl->fl_flags & (FL_UNLOCK_PENDING | FL_DOWNGRADE_PENDING);
> @@ -1875,7 +1870,7 @@ EXPORT_SYMBOL(generic_setlease);
>  int
>  vfs_setlease(struct file *filp, long arg, struct file_lock **lease, void **priv)
>  {
> -	if (filp->f_op->setlease && is_remote_lock(filp))
> +	if (filp->f_op->setlease)
>  		return filp->f_op->setlease(filp, arg, lease, priv);
>  	else
>  		return generic_setlease(filp, arg, lease, priv);
> @@ -2022,7 +2017,7 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
>  	if (error)
>  		goto out_free;
>  
> -	if (f.file->f_op->flock && is_remote_lock(f.file))
> +	if (f.file->f_op->flock)
>  		error = f.file->f_op->flock(f.file,
>  					  (can_sleep) ? F_SETLKW : F_SETLK,
>  					  lock);
> @@ -2048,7 +2043,7 @@ SYSCALL_DEFINE2(flock, unsigned int, fd, unsigned int, cmd)
>   */
>  int vfs_test_lock(struct file *filp, struct file_lock *fl)
>  {
> -	if (filp->f_op->lock && is_remote_lock(filp))
> +	if (filp->f_op->lock)
>  		return filp->f_op->lock(filp, F_GETLK, fl);
>  	posix_test_lock(filp, fl);
>  	return 0;
> @@ -2191,7 +2186,7 @@ int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock *flock)
>   */
>  int vfs_lock_file(struct file *filp, unsigned int cmd, struct file_lock *fl, struct file_lock *conf)
>  {
> -	if (filp->f_op->lock && is_remote_lock(filp))
> +	if (filp->f_op->lock)
>  		return filp->f_op->lock(filp, cmd, fl);
>  	else
>  		return posix_lock_file(filp, fl, conf);
> @@ -2513,7 +2508,7 @@ locks_remove_flock(struct file *filp, struct file_lock_context *flctx)
>  	if (list_empty(&flctx->flc_flock))
>  		return;
>  
> -	if (filp->f_op->flock && is_remote_lock(filp))
> +	if (filp->f_op->flock)
>  		filp->f_op->flock(filp, F_SETLKW, &fl);
>  	else
>  		flock_lock_inode(inode, &fl);
> @@ -2600,7 +2595,7 @@ EXPORT_SYMBOL(posix_unblock_lock);
>   */
>  int vfs_cancel_lock(struct file *filp, struct file_lock *fl)
>  {
> -	if (filp->f_op->lock && is_remote_lock(filp))
> +	if (filp->f_op->lock)
>  		return filp->f_op->lock(filp, F_CANCELLK, fl);
>  	return 0;
>  }
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 7779fc610767..f2a83fabe2eb 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -1479,7 +1479,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	sb->s_op = &ovl_super_operations;
>  	sb->s_xattr = ovl_xattr_handlers;
>  	sb->s_fs_info = ofs;
> -	sb->s_flags |= SB_POSIXACL | SB_NOREMOTELOCK;
> +	sb->s_flags |= SB_POSIXACL;
>  
>  	err = -ENOMEM;
>  	root_dentry = d_make_root(ovl_new_inode(sb, S_IFDIR, 0));
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index ba3078693d4a..417e692a606a 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1050,17 +1050,7 @@ struct file_lock_context {
>  
>  extern void send_sigio(struct fown_struct *fown, int fd, int band);
>  
> -/*
> - * Return the inode to use for locking
> - *
> - * For overlayfs this should be the overlay inode, not the real inode returned
> - * by file_inode().  For any other fs file_inode(filp) and locks_inode(filp) are
> - * equal.
> - */
> -static inline struct inode *locks_inode(const struct file *f)
> -{
> -	return f->f_path.dentry->d_inode;
> -}
> +#define locks_inode(f) file_inode(f)
>  
>  #ifdef CONFIG_FILE_LOCKING
>  extern int fcntl_getlk(struct file *, unsigned int, struct flock *);
> @@ -1300,7 +1290,6 @@ extern int send_sigurg(struct fown_struct *fown);
>  
>  /* These sb flags are internal to the kernel */
>  #define SB_SUBMOUNT     (1<<26)
> -#define SB_NOREMOTELOCK	(1<<27)
>  #define SB_NOSEC	(1<<28)
>  #define SB_BORN		(1<<29)
>  #define SB_ACTIVE	(1<<30)
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index d2a8313fabd7..2840ddcece73 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -134,7 +134,6 @@ struct inodes_stat_t {
>  
>  /* These sb flags are internal to the kernel */
>  #define MS_SUBMOUNT     (1<<26)
> -#define MS_NOREMOTELOCK	(1<<27)
>  #define MS_NOSEC	(1<<28)
>  #define MS_BORN		(1<<29)
>  #define MS_ACTIVE	(1<<30)

Acked-by: Jeff Layton <jlayton@kernel.org>

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [PATCH v2 22/35] vfs: don't open real
  2018-05-07  8:37 ` [PATCH v2 22/35] vfs: don't open real Miklos Szeredi
  2018-05-07 10:27   ` Amir Goldstein
@ 2018-05-11 18:54   ` Vivek Goyal
  2018-05-11 19:42     ` Vivek Goyal
  1 sibling, 1 reply; 53+ messages in thread
From: Vivek Goyal @ 2018-05-11 18:54 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Al Viro,
	linux-security-module, Daniel J Walsh, Paul Moore,
	Stephen Smalley

On Mon, May 07, 2018 at 10:37:54AM +0200, Miklos Szeredi wrote:
> Let overlayfs do its thing when opening a file.
> 
> This enables stacking and fixes the corner case when a file is opened for
> read, modified through a writable open, and data is read from the read-only
> file.  After this patch the read-only open will not return stale data even
> in this case.

[CC Dan, Steven, Paul, linux-security-module list]

Hi Miklos,

I was running selinux-testsuite and one of the tests fails. I think this
is a side effect of installing the overlay inode in file->f_inode instead
of the real underlying inode.

The following test is failing.

sub test_90_1 {
    print "Attempting to enter domain with bad entrypoint, should fail.\n";
    $result = system(
"runcon -t test_overlay_client_t -l s0:c10,c20 $basedir/container1/merged/badentrypoint >/dev/null 2>&1"
    );
    ok($result);
    return;
}

Basically, this test has an executable named "badentrypoint" with the SELinux
label "unconfined_u:object_r:test_overlay_files_ro_t:s0", and we mount the
overlay with context=unconfined_u:object_r:test_overlay_files_rwx_t:s0:c10,c20

So effectively the overlay inode of "badentrypoint" now gets the label
specified by "context=".

I think the intent of the test is that this file's real label is "...ro_t".
That means this file is not supposed to be executed, and any attempt to
execute it should be denied.

Currently the test works, and execution fails with the following AVC.

AVC avc:  denied  { entrypoint } for  pid=1425 comm="runcon" path="/root/git/selinux-testsuite/tests/overlay/container1/merged/badentrypoint" dev="dm-0" ino=34515261 scontext=unconfined_u:unconfined_r:test_overlay_client_t:s0:c10,c20 tcontext=unconfined_u:object_r:test_overlay_files_ro_t:s0 tclass=file permissive=0

But with the new patches, the execution starts succeeding, so the test
fails.

I think currently selinux_bprm_set_creds() returns an error. It does its
checks on the inode returned by file_inode(); as of now that is the real
inode, which has the real label "...ro_t", so permission to execute the
file is denied.

But after the patches file_inode() returns the overlay inode, which has
the label specified by the context= mount option, "...rwx_t". That
label allows executing the file, so execution is not blocked by
SELinux.

I feel that even now the code is working accidentally. Ideally our theme
was that the task's credentials are checked against the overlay inode and
the mounter's creds are checked against the underlying inode to determine
if a certain permission is allowed. So ideally the mounter should not have
been allowed to execute a file of type "...ro_t". But we don't have that
workflow; VFS calls into SELinux, and SELinux checks the underlying file's
label against the task.

It worked so far, but the moment we install the overlay inode in the file,
SELinux checks the task against the overlay inode label and allows
permission to execute, and the mounter is never checked against the real
inode.

I am not sure what the right solution is. So far selinux is not aware of
two levels of checks, and if two levels of checks are to be performed, that
somehow needs to be enforced by overlay by calling the same hook on both levels.

I thought of at least starting a conversation on this.

Thanks
Vivek


> 
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/open.c | 7 +------
>  1 file changed, 1 insertion(+), 6 deletions(-)
> 
> diff --git a/fs/open.c b/fs/open.c
> index 6e52fd6fea7c..244cd2ecfefd 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -897,13 +897,8 @@ EXPORT_SYMBOL(file_path);
>  int vfs_open(const struct path *path, struct file *file,
>  	     const struct cred *cred)
>  {
> -	struct dentry *dentry = d_real(path->dentry, NULL, file->f_flags, 0);
> -
> -	if (IS_ERR(dentry))
> -		return PTR_ERR(dentry);
> -
>  	file->f_path = *path;
> -	return do_dentry_open(file, d_backing_inode(dentry), NULL, cred);
> +	return do_dentry_open(file, d_backing_inode(path->dentry), NULL, cred);
>  }
>  
>  /**
> -- 
> 2.14.3
> 


* Re: [PATCH v2 22/35] vfs: don't open real
  2018-05-11 18:54   ` Vivek Goyal
@ 2018-05-11 19:42     ` Vivek Goyal
  2018-05-14 13:58       ` Vivek Goyal
  2018-05-14 14:03       ` Daniel Walsh
  0 siblings, 2 replies; 53+ messages in thread
From: Vivek Goyal @ 2018-05-11 19:42 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Al Viro,
	linux-security-module, Daniel J Walsh, Paul Moore,
	Stephen Smalley

On Fri, May 11, 2018 at 02:54:30PM -0400, Vivek Goyal wrote:
> On Mon, May 07, 2018 at 10:37:54AM +0200, Miklos Szeredi wrote:
> > Let overlayfs do its thing when opening a file.
> > 
> > This enables stacking and fixes the corner case when a file is opened for
> > read, modified through a writable open, and data is read from the read-only
> > file.  After this patch the read-only open will not return stale data even
> > in this case.
> 
> [CC Dan, Steven, Paul, linux-security-module list]
> 
> Hi Miklos,
> 
> I was running selinux-testsuite and one of the tests seems to fail. I
> think this is side effect of installing overlay inode in file->f_inode
> instead of real underlying inode.
> 
> Following test is failing.
> 
> sub test_90_1 {
>     print "Attempting to enter domain with bad entrypoint, should fail.\n";
>     $result = system(
> "runcon -t test_overlay_client_t -l s0:c10,c20 $basedir/container1/merged/badentrypoint >/dev/null 2>&1"
>     );
>     ok($result);
>     return;
> }

I am wondering, shouldn't do_open_execat() have failed? It should have called
into inode_permission(MAY_EXEC). And then ovl_inode_permission()
will in turn call inode_permission(realinode, MAY_EXEC) with the mounter's
creds. Shouldn't selinux_inode_permission() have returned that the mounter
does not have MAY_EXEC permission on the inode?

Dan, I am wondering if this is a selinux policy issue? In my testing
on an upstream kernel, do_open_execat() succeeds and it fails much later.
I am wondering why that's the case. Is it expected?

Thanks
Vivek




* Re: [PATCH v2 22/35] vfs: don't open real
  2018-05-11 19:42     ` Vivek Goyal
@ 2018-05-14 13:58       ` Vivek Goyal
  2018-05-15 20:42         ` Vivek Goyal
  2018-05-14 14:03       ` Daniel Walsh
  1 sibling, 1 reply; 53+ messages in thread
From: Vivek Goyal @ 2018-05-14 13:58 UTC (permalink / raw)
  To: Miklos Szeredi, Daniel J Walsh
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Al Viro,
	linux-security-module, Paul Moore, Stephen Smalley

On Fri, May 11, 2018 at 03:42:48PM -0400, Vivek Goyal wrote:
> On Fri, May 11, 2018 at 02:54:30PM -0400, Vivek Goyal wrote:
> > On Mon, May 07, 2018 at 10:37:54AM +0200, Miklos Szeredi wrote:
> > > Let overlayfs do its thing when opening a file.
> > > 
> > > This enables stacking and fixes the corner case when a file is opened for
> > > read, modified through a writable open, and data is read from the read-only
> > > file.  After this patch the read-only open will not return stale data even
> > > in this case.
> > 
> > [CC Dan, Steven, Paul, linux-security-module list]
> > 
> > Hi Miklos,
> > 
> > I was running selinux-testsuite and one of the tests seems to fail. I
> > think this is side effect of installing overlay inode in file->f_inode
> > instead of real underlying inode.
> > 
> > Following test is failing.
> > 
> > sub test_90_1 {
> >     print "Attempting to enter domain with bad entrypoint, should fail.\n";
> >     $result = system(
> > "runcon -t test_overlay_client_t -l s0:c10,c20 $basedir/container1/merged/badentrypoint >/dev/null 2>&1"
> >     );
> >     ok($result);
> >     return;
> > }
> 
> I am wondering, shouldn't do_open_execat() have failed. It should have called
> into inode_permission(MAY_EXEC). And then ovl_inode_permission()
> will in turn call inode_permission(realinode, MAY_EXEC) with mounter's
> creds. Shouldn't selinux_inode_permission() have returned that mounter
> does not have MAY_EXEC permission on inode.

Ok, I noticed that the current policy in the tests gives exec permission to
the mounter for the ro_t file and that's why inode_permission(MAY_EXEC) does
not fail.

can_exec(test_overlay_mounter_t, test_overlay_files_ro_t)
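
As a rough illustration of why the MAY_EXEC check does not fail, here is a
small userspace sketch of the two checks discussed in this thread (task
against the overlay label, mounter against the real label). It is a toy
model, not kernel code: all toy_* names are hypothetical, the mounter rule
mirrors the can_exec() line above, and the client rule on rwx_t is an
assumption made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy policy: which domain may execute which file type. */
struct toy_rule {
    const char *domain;
    const char *type;
};

static const struct toy_rule toy_exec_rules[] = {
    /* mirrors can_exec(test_overlay_mounter_t, test_overlay_files_ro_t) */
    { "test_overlay_mounter_t", "test_overlay_files_ro_t" },
    /* assumed for this sketch: the client may execute rwx_t files */
    { "test_overlay_client_t", "test_overlay_files_rwx_t" },
};

static bool toy_may_exec(const char *domain, const char *type)
{
    size_t n = sizeof(toy_exec_rules) / sizeof(toy_exec_rules[0]);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(toy_exec_rules[i].domain, domain) == 0 &&
            strcmp(toy_exec_rules[i].type, type) == 0)
            return true;
    }
    return false;
}

/* Both levels must pass: task vs. overlay label, mounter vs. real label. */
static bool toy_ovl_may_exec(const char *task, const char *mounter,
                             const char *overlay_type, const char *real_type)
{
    if (!toy_may_exec(task, overlay_type))
        return false;
    return toy_may_exec(mounter, real_type);
}
```

With the mounter rule present both levels pass, which matches
do_open_execat() succeeding; without it the second level would deny.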

I talked to Dan and he mentioned that he was trying to test entrypoint
failure (and not exec failure) and that's why he might have allowed exec
to the mounter.

I think that the current entrypoint test's expectations are wrong.
The user process sees the overlay inode label, which is rwx_t, and that means
the overlay layer will allow entrypoint into that executable. This is the
same behavior as on a normal file system, where the underlying file's label
is completely overridden by context=.

So in my opinion, we should modify the testsuite and not run this test with
context= mounts.

The only thing left to argue is whether we should also check that the mounter
has permission for this entrypoint. Currently SELinux checks are not
two-level checks, so this can be implemented once SELinux is made
aware of multiple levels (if we ever do that).

Thanks
Vivek



* Re: [PATCH v2 22/35] vfs: don't open real
  2018-05-11 19:42     ` Vivek Goyal
  2018-05-14 13:58       ` Vivek Goyal
@ 2018-05-14 14:03       ` Daniel Walsh
  1 sibling, 0 replies; 53+ messages in thread
From: Daniel Walsh @ 2018-05-14 14:03 UTC (permalink / raw)
  To: Vivek Goyal, Miklos Szeredi
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Al Viro,
	linux-security-module, Paul Moore, Stephen Smalley

On 05/11/2018 03:42 PM, Vivek Goyal wrote:
> On Fri, May 11, 2018 at 02:54:30PM -0400, Vivek Goyal wrote:
>> On Mon, May 07, 2018 at 10:37:54AM +0200, Miklos Szeredi wrote:
>>> Let overlayfs do its thing when opening a file.
>>>
>>> This enables stacking and fixes the corner case when a file is opened for
>>> read, modified through a writable open, and data is read from the read-only
>>> file.  After this patch the read-only open will not return stale data even
>>> in this case.
>> [CC Dan, Steven, Paul, linux-security-module list]
>>
>> Hi Miklos,
>>
>> I was running selinux-testsuite and one of the tests seems to fail. I
>> think this is side effect of installing overlay inode in file->f_inode
>> instead of real underlying inode.
>>
>> Following test is failing.
>>
>> sub test_90_1 {
>>      print "Attempting to enter domain with bad entrypoint, should fail.\n";
>>      $result = system(
>> "runcon -t test_overlay_client_t -l s0:c10,c20 $basedir/container1/merged/badentrypoint >/dev/null 2>&1"
>>      );
>>      ok($result);
>>      return;
>> }
> I am wondering, shouldn't do_open_execat() have failed. It should have called
> into inode_permission(MAY_EXEC). And then ovl_inode_permission()
> will in turn call inode_permission(realinode, MAY_EXEC) with mounter's
> creds. Shouldn't selinux_inode_permission() have returned that mounter
> does not have MAY_EXEC permission on inode.
>
> Dan, I am wondering if this is a selinux policy issue? In my testing
> on upstream kernel, do_open_execat() succeeds and it fails much later.
> I am wondering why that's the case. Is it expected.

Vivek and I talked, and I believe the SELinux check on entrypoint is
wrong.  We should be checking the overlay context, not the lower
level label, for entrypoint.


A little background.  The entrypoint check looks at whether the target
domain can be entered via the executable.


For example we might have labels like apache_t and apache_exec_t, and we
would write rules like:


allow apache_t apache_exec_t:file entrypoint;
allow user_t apache_t:process transition;
allow user_t apache_exec_t:file execute;
allow user_t bin_t:file execute;


These rules say a process running as user_t can execute files labeled
apache_exec_t and bin_t.  They also say that the user_t type can
transition to, or start a process as, apache_t, BUT since we have the
entrypoint rule, the only file type through which user_t can transition
to apache_t is the apache_exec_t type.

This would prevent user_t from executing something like

runcon -t apache_t /bin/sh
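
The reasoning above can be sketched as a small userspace model. This is a
simplification with hypothetical toy_* names, not how the SELinux security
server is implemented, and real SELinux evaluates more than these three
permissions. Following the prose, it assumes user_t may execute
apache_exec_t and bin_t files.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical toy "allow" rule: source target:class permission. */
struct toy_allow {
    const char *source;
    const char *target;
    const char *tclass;
    const char *perm;
};

static const struct toy_allow toy_policy[] = {
    { "apache_t", "apache_exec_t", "file",    "entrypoint" },
    { "user_t",   "apache_t",      "process", "transition" },
    { "user_t",   "apache_exec_t", "file",    "execute"    },
    { "user_t",   "bin_t",         "file",    "execute"    },
};

static bool toy_allowed(const char *source, const char *target,
                        const char *tclass, const char *perm)
{
    size_t n = sizeof(toy_policy) / sizeof(toy_policy[0]);

    for (size_t i = 0; i < n; i++) {
        const struct toy_allow *r = &toy_policy[i];

        if (strcmp(r->source, source) == 0 &&
            strcmp(r->target, target) == 0 &&
            strcmp(r->tclass, tclass) == 0 &&
            strcmp(r->perm, perm) == 0)
            return true;
    }
    return false;
}

/* A domain transition needs execute, transition and entrypoint together. */
static bool toy_can_transition(const char *caller, const char *new_domain,
                               const char *file_type)
{
    return toy_allowed(caller, file_type, "file", "execute") &&
           toy_allowed(caller, new_domain, "process", "transition") &&
           toy_allowed(new_domain, file_type, "file", "entrypoint");
}
```

In this model user_t can enter apache_t through an apache_exec_t file, but
not through /bin/sh (assumed here to be labeled bin_t), because apache_t has
no entrypoint rule for bin_t.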


In the case of these tests, SELinux is currently verifying that the
mounter is able to mount a directory with a different label, rwx_t, and
then provide the user with content via this label. So the entrypoint
check should happen on the new context label, not on the lower label.
We need to fix the SELinux test suite to reflect the new behaviour.  I
think the current test and the current code are actually buggy.




* Re: [PATCH v2 22/35] vfs: don't open real
  2018-05-14 13:58       ` Vivek Goyal
@ 2018-05-15 20:42         ` Vivek Goyal
  0 siblings, 0 replies; 53+ messages in thread
From: Vivek Goyal @ 2018-05-15 20:42 UTC (permalink / raw)
  To: Miklos Szeredi, Daniel J Walsh
  Cc: linux-unionfs, linux-fsdevel, linux-kernel, Al Viro,
	linux-security-module, Paul Moore, Stephen Smalley

On Mon, May 14, 2018 at 09:58:03AM -0400, Vivek Goyal wrote:

[..]
> Talked to Dan and he mentioned that he was trying to test entrypoint
> failure (and not exec failure) and that's why he might have allowed exec
> to mounter.
> 
> I think that current entrypoint test's expectations are wrong.
> User process sees overlay inode label which is rwx_t and that means
> overlay layer will allow entrypoint into that executable. This will be the
> behavior on a normal file system where underlying file's label will be
> completely overridden by context=.
> 
> So in my opinion, we should modify testsuite and not run this test with
> context= mounts.

Miklos, a fix has now been merged to the tests so that the test passes both
with current kernels and with the proposed changes.

https://github.com/SELinuxProject/selinux-testsuite/pull/36

Thanks Dan Walsh, Stephen Smalley and Paul Moore.

Vivek


end of thread

Thread overview: 53+ messages
2018-05-07  8:37 [PATCH v2 00/35] overlayfs: stack file operations Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 01/35] vfs: add path_open() Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 02/35] vfs: optionally don't account file in nr_files Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 03/35] vfs: add f_op->pre_mmap() Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 04/35] vfs: export vfs_ioctl() to modules Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 05/35] vfs: export vfs_dedupe_file_range_one() " Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 06/35] ovl: copy up times Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 07/35] ovl: copy up inode flags Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 08/35] Revert "Revert "ovl: get_write_access() in truncate"" Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 09/35] ovl: copy up file size as well Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 10/35] ovl: deal with overlay files in ovl_d_real() Miklos Szeredi
2018-05-07 13:17   ` Vivek Goyal
2018-05-07  8:37 ` [PATCH v2 11/35] ovl: stack file ops Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 12/35] ovl: add helper to return real file Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 13/35] ovl: add ovl_read_iter() Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 14/35] ovl: add ovl_write_iter() Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 15/35] ovl: add ovl_fsync() Miklos Szeredi
2018-05-08  5:14   ` Amir Goldstein
2018-05-08 14:57     ` Miklos Szeredi
2018-05-08 15:02       ` Amir Goldstein
2018-05-07  8:37 ` [PATCH v2 16/35] ovl: add ovl_mmap() Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 17/35] ovl: add ovl_fallocate() Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 18/35] ovl: add lsattr/chattr support Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 19/35] ovl: add ovl_fiemap() Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 20/35] ovl: add O_DIRECT support Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 21/35] ovl: add reflink/copyfile/dedup support Miklos Szeredi
2018-05-07 20:43   ` Darrick J. Wong
2018-05-08 14:13     ` Miklos Szeredi
2018-05-08 14:38       ` Darrick J. Wong
2018-05-07  8:37 ` [PATCH v2 22/35] vfs: don't open real Miklos Szeredi
2018-05-07 10:27   ` Amir Goldstein
2018-05-07 10:29     ` Miklos Szeredi
2018-05-11 18:54   ` Vivek Goyal
2018-05-11 19:42     ` Vivek Goyal
2018-05-14 13:58       ` Vivek Goyal
2018-05-15 20:42         ` Vivek Goyal
2018-05-14 14:03       ` Daniel Walsh
2018-05-07  8:37 ` [PATCH v2 23/35] ovl: copy-up on MAP_SHARED Miklos Szeredi
2018-05-07 19:28   ` Randy Dunlap
2018-05-08 15:03     ` Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 24/35] vfs: simplify dentry_open() Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 25/35] Revert "ovl: fix may_write_real() for overlayfs directories" Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 26/35] Revert "ovl: don't allow writing ioctl on lower layer" Miklos Szeredi
2018-05-07  8:37 ` [PATCH v2 27/35] vfs: fix freeze protection in mnt_want_write_file() for overlayfs Miklos Szeredi
2018-05-07  8:38 ` [PATCH v2 28/35] Revert "ovl: fix relatime for directories" Miklos Szeredi
2018-05-07  8:38 ` [PATCH v2 29/35] Revert "vfs: update ovl inode before relatime check" Miklos Szeredi
2018-05-07  8:38 ` [PATCH v2 30/35] Revert "vfs: add flags to d_real()" Miklos Szeredi
2018-05-07  8:38 ` [PATCH v2 31/35] Revert "vfs: do get_write_access() on upper layer of overlayfs" Miklos Szeredi
2018-05-07  8:38 ` [PATCH v2 32/35] Partially revert "locks: fix file locking on overlayfs" Miklos Szeredi
2018-05-08 15:15   ` Jeff Layton
2018-05-07  8:38 ` [PATCH v2 33/35] Revert "fsnotify: support overlayfs" Miklos Szeredi
2018-05-07  8:38 ` [PATCH v2 34/35] vfs: remove open_flags from d_real() Miklos Szeredi
2018-05-07  8:38 ` [PATCH v2 35/35] ovl: fix documentation of non-standard behavior Miklos Szeredi
