linux-api.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/3] MAP_DIRECT and block-map sealed files
@ 2017-08-15  6:12 Dan Williams
  2017-08-15  6:12 ` [PATCH v4 1/3] fs, xfs: introduce S_IOMAP_SEALED Dan Williams
                   ` (3 more replies)
  0 siblings, 4 replies; 22+ messages in thread
From: Dan Williams @ 2017-08-15  6:12 UTC (permalink / raw)
  To: darrick.wong-QHcLZuEGTsvQT0dZR+AlfA
  Cc: Jan Kara, Arnd Bergmann, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw,
	linux-api-u79uwXL29TY76Z2rM5mHXA, Dave Chinner,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Alexander Viro,
	luto-DgEjT+Ai2ygdnm+yROfE0A,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton,
	Christoph Hellwig

Changes since v3 [1]:
* Move from an fallocate(2) interface to a new mmap(2) flag and rename
  'immutable' to 'sealed'.

* Do not record the sealed state in permanent metadata it is now purely
  a temporary state for as long as a MAP_DIRECT vma is referencing the
  inode (Christoph)

* Drop the CAP_IMMUTABLE requirement, but do require a PROT_WRITE
  mapping.

[1]: https://lwn.net/Articles/730570/

---

This is the next revision of a patch series that aims to enable
applications that otherwise need to resort to DAX mapping a raw device
file to instead move to a filesystem.

In the course of reviewing a previous posting, Christoph said:

    That being said I think we absolutely should support RDMA memory
    registrations for DAX mappings.  I'm just not sure how S_IOMAP_IMMUTABLE
    helps with that.  We'll want a MAP_SYNC | MAP_POPULATE to make sure all
    the blocks are populated and all ptes are set up.  Second we need to
    make sure get_user_page works, which for now means we'll need a struct
    page mapping for the region (which will be really annoying for PCIe
    mappings, like the upcoming NVMe persistent memory region), and we need
    to guarantee that the extent mapping won't change while the
    get_user_pages holds the pages inside it.  I think that is true due to
    side effects even with the current DAX code, but we'll need to make it
    explicit.  And maybe that's where we need to converge - "sealing" the
    extent map makes sense as such a temporary measure that is not persisted
    on disk, which automatically gets released when the holding process
    exits, because we sort of already do this implicitly.  It might also
    make sense to have explicitly breakable seals similar to what I do for
    the pNFS blocks kernel server, as any userspace RDMA file server would
    also need those semantics.

So, this is an attempt to converge on the idea that we need an explicit
and process-lifetime-temporary mechanism for a process to be able to
make assumptions about the mapping to physical page to dax-file-offset
relationship. The "explicitly breakable seals" aspect is not addressed
in these patches, but I wonder if it might be a voluntary mechanism that
can implemented via userfaultfd.

These pass a basic smoke test and are meant to just gauge 'right track'
/ 'wrong track'. The main question it seems is whether the pinning done
in this patchset is too early (applies before get_user_pages()) and too
coarse (applies to the whole file). Perhaps this is where I discarded
too easily Jan's suggestion to look at Peter Z's mm_mpin() syscall [2]? On
the other hand, the coarseness and simple lifetime rules of MAP_DIRECT
make it an easy mechanism to implement and explain.

Another reason I kept the scope of S_IOMAP_SEALED coarsely defined was
to support Dave's desired use case of sealing for operating on reflinked
files [3].

Suggested mmap(2) man page edits are included in the changelog of patch
3.

[2]: https://lwn.net/Articles/600502/
[3]: https://www.mail-archive.com/linux-kernel-u79uwXL29TY76Z2rM5mHXA@public.gmane.org/msg1467677.html

---

Dan Williams (3):
      fs, xfs: introduce S_IOMAP_SEALED
      mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
      fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges


 fs/attr.c                              |   10 +++
 fs/dax.c                               |    2 +
 fs/open.c                              |    6 ++
 fs/read_write.c                        |    3 +
 fs/xfs/libxfs/xfs_bmap.c               |    5 +
 fs/xfs/xfs_bmap_util.c                 |    3 +
 fs/xfs/xfs_file.c                      |  107 ++++++++++++++++++++++++++++++++
 fs/xfs/xfs_inode.h                     |    1 
 fs/xfs/xfs_ioctl.c                     |    6 ++
 fs/xfs/xfs_super.c                     |    1 
 include/linux/fs.h                     |    9 +++
 include/linux/mm.h                     |    2 -
 include/linux/mm_types.h               |    1 
 include/linux/mman.h                   |    3 +
 include/uapi/asm-generic/mman-common.h |    2 +
 mm/filemap.c                           |    5 +
 mm/mmap.c                              |   22 ++++++-
 17 files changed, 183 insertions(+), 5 deletions(-)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH v4 1/3] fs, xfs: introduce S_IOMAP_SEALED
  2017-08-15  6:12 [PATCH v4 0/3] MAP_DIRECT and block-map sealed files Dan Williams
@ 2017-08-15  6:12 ` Dan Williams
       [not found] ` <150277752553.23945.13932394738552748440.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2017-08-15  6:12 UTC (permalink / raw)
  To: darrick.wong
  Cc: Jan Kara, linux-nvdimm, linux-api, Dave Chinner, linux-kernel,
	linux-xfs, linux-mm, Jeff Moyer, Alexander Viro, luto,
	linux-fsdevel, Ross Zwisler, Christoph Hellwig

When a filesystem sees this flag set it will not allow changes to the
file-offset to physical-block-offset relationship of any extent in the
file. The extent of the extents covered by the global S_IOMAP_SEALED is
filesystem specific. In other words it is similar to the inode-wide
XFS_DIFLAG2_REFLINK flag where we make the distinction apply globally to
the inode even though we could theoretically limit that effect to a
sub-range of the file.

The interface that sets this flag (mmap(..., MAP_DIRECT, ...)) will be
careful to document that it is implementation specific whether the
'sealed' restrictions apply to a sub-range or the whole file.
Applications should be prepared for unrelated ranges in the file to be
effected.

The term 'sealed' is used instead of 'immutable' to better indicate that
this is a file property that is temporary and can be undone.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/attr.c                |   10 ++++++++++
 fs/open.c                |    6 ++++++
 fs/read_write.c          |    3 +++
 fs/xfs/libxfs/xfs_bmap.c |    5 +++++
 fs/xfs/xfs_bmap_util.c   |    3 +++
 fs/xfs/xfs_ioctl.c       |    6 ++++++
 include/linux/fs.h       |    2 ++
 mm/filemap.c             |    5 +++++
 8 files changed, 40 insertions(+)

diff --git a/fs/attr.c b/fs/attr.c
index 135304146120..d940386e0ca9 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -112,6 +112,16 @@ EXPORT_SYMBOL(setattr_prepare);
  */
 int inode_newsize_ok(const struct inode *inode, loff_t offset)
 {
+	if (IS_IOMAP_SEALED(inode)) {
+		/*
+		 * Any size change is disallowed. Size increases may
+		 * dirty metadata that an application is not prepared to
+		 * sync, and a size decrease may expose free blocks to
+		 * in-flight DMA.
+		 */
+		return -ETXTBSY;
+	}
+
 	if (inode->i_size < offset) {
 		unsigned long limit;
 
diff --git a/fs/open.c b/fs/open.c
index 35bb784763a4..92d89ec2d6b3 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -292,6 +292,12 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 		return -ETXTBSY;
 
 	/*
+	 * We cannot allow any allocation changes on an iomap sealed file
+	 */
+	if (IS_IOMAP_SEALED(inode))
+		return -ETXTBSY;
+
+	/*
 	 * Revalidate the write permissions, in case security policy has
 	 * changed since the files were opened.
 	 */
diff --git a/fs/read_write.c b/fs/read_write.c
index 0cc7033aa413..55700ca85f7e 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1706,6 +1706,9 @@ int vfs_clone_file_prep_inodes(struct inode *inode_in, loff_t pos_in,
 	if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out))
 		return -ETXTBSY;
 
+	if (IS_IOMAP_SEALED(inode_in) || IS_IOMAP_SEALED(inode_out))
+		return -ETXTBSY;
+
 	/* Don't reflink dirs, pipes, sockets... */
 	if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode))
 		return -EISDIR;
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index a2d64666cdd4..84d8ee9f414c 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -4481,6 +4481,11 @@ xfs_bmapi_write(
 	if (XFS_FORCED_SHUTDOWN(mp))
 		return -EIO;
 
+	/* fail any attempts to mutate data extents */
+	if (IS_IOMAP_SEALED(VFS_I(ip))
+			&& !(flags & (XFS_BMAPI_METADATA | XFS_BMAPI_ATTRFORK)))
+		return -ETXTBSY;
+
 	ifp = XFS_IFORK_PTR(ip, whichfork);
 
 	XFS_STATS_INC(mp, xs_blk_mapw);
diff --git a/fs/xfs/xfs_bmap_util.c b/fs/xfs/xfs_bmap_util.c
index 93e955262d07..ef4c4e8b0f58 100644
--- a/fs/xfs/xfs_bmap_util.c
+++ b/fs/xfs/xfs_bmap_util.c
@@ -1294,6 +1294,9 @@ xfs_free_file_space(
 
 	trace_xfs_free_file_space(ip);
 
+	if (IS_IOMAP_SEALED(VFS_I(ip)))
+		return -ETXTBSY;
+
 	error = xfs_qm_dqattach(ip, 0);
 	if (error)
 		return error;
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index e75c40a47b7d..b716d184ae9a 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1755,6 +1755,12 @@ xfs_ioc_swapext(
 		goto out_put_tmp_file;
 	}
 
+	if (IS_IOMAP_SEALED(file_inode(f.file)) ||
+	    IS_IOMAP_SEALED(file_inode(tmp.file))) {
+		error = -EINVAL;
+		goto out_put_tmp_file;
+	}
+
 	/*
 	 * We need to ensure that the fds passed in point to XFS inodes
 	 * before we cast and access them as XFS structures as we have no
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 6e1fd5d21248..1104e5df39ef 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1829,6 +1829,7 @@ struct super_operations {
 #else
 #define S_DAX		0	/* Make all the DAX code disappear */
 #endif
+#define S_IOMAP_SEALED 16384 /* logical-to-physical extent map is fixed */
 
 /*
  * Note that nosuid etc flags are inode-specific: setting some file-system
@@ -1867,6 +1868,7 @@ struct super_operations {
 #define IS_AUTOMOUNT(inode)	((inode)->i_flags & S_AUTOMOUNT)
 #define IS_NOSEC(inode)		((inode)->i_flags & S_NOSEC)
 #define IS_DAX(inode)		((inode)->i_flags & S_DAX)
+#define IS_IOMAP_SEALED(inode) ((inode)->i_flags & S_IOMAP_SEALED)
 
 #define IS_WHITEOUT(inode)	(S_ISCHR(inode->i_mode) && \
 				 (inode)->i_rdev == WHITEOUT_DEV)
diff --git a/mm/filemap.c b/mm/filemap.c
index a49702445ce0..4c929961b538 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2806,6 +2806,11 @@ inline ssize_t generic_write_checks(struct kiocb *iocb, struct iov_iter *from)
 	if (unlikely(pos >= inode->i_sb->s_maxbytes))
 		return -EFBIG;
 
+	/* Are we about to mutate the block map on a sealed file? */
+	if (IS_IOMAP_SEALED(inode)
+			&& (pos + iov_iter_count(from) > i_size_read(inode)))
+		return -ETXTBSY;
+
 	iov_iter_truncate(from, inode->i_sb->s_maxbytes - pos);
 	return iov_iter_count(from);
 }

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
       [not found] ` <150277752553.23945.13932394738552748440.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-08-15  6:12   ` Dan Williams
  2017-08-15 12:27     ` Jan Kara
                       ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Dan Williams @ 2017-08-15  6:12 UTC (permalink / raw)
  To: darrick.wong-QHcLZuEGTsvQT0dZR+AlfA
  Cc: Jan Kara, Arnd Bergmann, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, luto-DgEjT+Ai2ygdnm+yROfE0A,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton,
	Christoph Hellwig

The mmap syscall suffers from the ABI anti-pattern of not validating
unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
mechanism to define new behavior that is known to fail on older kernels
without the feature. Use the fact that specifying MAP_SHARED and
MAP_PRIVATE at the same time is invalid as a cute hack to allow a new
set of validated flags to be introduced.

This also introduces the ->fmmap() file operation that is ->mmap() plus
flags. Each ->fmmap() implementation must fail requests when a locally
unsupported flag is specified.

Cc: Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org>
Cc: Arnd Bergmann <arnd-r2nGTMty4D4@public.gmane.org>
Cc: Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org>
Suggested-by: Christoph Hellwig <hch-jcswGhMUV9g@public.gmane.org>
Signed-off-by: Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
---
 include/linux/fs.h                     |    7 +++++++
 include/linux/mm.h                     |    2 +-
 include/linux/mman.h                   |    3 +++
 include/uapi/asm-generic/mman-common.h |    1 +
 mm/mmap.c                              |   20 +++++++++++++++++---
 5 files changed, 29 insertions(+), 4 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 1104e5df39ef..bbe755d0caee 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1674,6 +1674,7 @@ struct file_operations {
 	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
 	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
 	int (*mmap) (struct file *, struct vm_area_struct *);
+	int (*fmmap) (struct file *, struct vm_area_struct *, unsigned long);
 	int (*open) (struct inode *, struct file *);
 	int (*flush) (struct file *, fl_owner_t id);
 	int (*release) (struct inode *, struct file *);
@@ -1748,6 +1749,12 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
 	return file->f_op->mmap(file, vma);
 }
 
+static inline int call_fmmap(struct file *file, struct vm_area_struct *vma,
+		unsigned long flags)
+{
+	return file->f_op->fmmap(file, vma, flags);
+}
+
 ssize_t rw_copy_check_uvector(int type, const struct iovec __user * uvector,
 			      unsigned long nr_segs, unsigned long fast_segs,
 			      struct iovec *fast_pointer,
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 46b9ac5e8569..49eef48da4b7 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2090,7 +2090,7 @@ extern unsigned long get_unmapped_area(struct file *, unsigned long, unsigned lo
 
 extern unsigned long mmap_region(struct file *file, unsigned long addr,
 	unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-	struct list_head *uf);
+	struct list_head *uf, unsigned long flags);
 extern unsigned long do_mmap(struct file *file, unsigned long addr,
 	unsigned long len, unsigned long prot, unsigned long flags,
 	vm_flags_t vm_flags, unsigned long pgoff, unsigned long *populate,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index c8367041fafd..73d4ac7e7136 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -7,6 +7,9 @@
 #include <linux/atomic.h>
 #include <uapi/linux/mman.h>
 
+/* the MAP_VALIDATE set of supported flags */
+#define	MAP_SUPPORTED_MASK (0)
+
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
 extern unsigned long sysctl_overcommit_kbytes;
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 8c27db0c5c08..8bf8c7828275 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -24,6 +24,7 @@
 #else
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
+#define MAP_VALIDATE (MAP_SHARED|MAP_PRIVATE) /* mechanism to define new shared semantics */
 
 /*
  * Flags for mlock
diff --git a/mm/mmap.c b/mm/mmap.c
index f19efcf75418..d2919a9e25bf 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1388,6 +1388,12 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 		struct inode *inode = file_inode(file);
 
 		switch (flags & MAP_TYPE) {
+		case MAP_VALIDATE:
+			if (flags & ~(MAP_SUPPORTED_MASK | MAP_VALIDATE))
+				return -EINVAL;
+			if (!file->f_op->fmmap)
+				return -EOPNOTSUPP;
+			/* fall through */
 		case MAP_SHARED:
 			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
 				return -EACCES;
@@ -1464,7 +1470,12 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
+	if ((flags & MAP_VALIDATE) == MAP_VALIDATE)
+		flags &= MAP_SUPPORTED_MASK;
+	else
+		flags = 0;
+
+	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
@@ -1601,7 +1612,7 @@ static inline int accountable_mapping(struct file *file, vm_flags_t vm_flags)
 
 unsigned long mmap_region(struct file *file, unsigned long addr,
 		unsigned long len, vm_flags_t vm_flags, unsigned long pgoff,
-		struct list_head *uf)
+		struct list_head *uf, unsigned long flags)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
@@ -1686,7 +1697,10 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		 * new file must not have been exposed to user-space, yet.
 		 */
 		vma->vm_file = get_file(file);
-		error = call_mmap(file, vma);
+		if (flags)
+			error = call_fmmap(file, vma, flags);
+		else
+			error = call_mmap(file, vma);
 		if (error)
 			goto unmap_and_free_vma;

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges
  2017-08-15  6:12 [PATCH v4 0/3] MAP_DIRECT and block-map sealed files Dan Williams
  2017-08-15  6:12 ` [PATCH v4 1/3] fs, xfs: introduce S_IOMAP_SEALED Dan Williams
       [not found] ` <150277752553.23945.13932394738552748440.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-08-15  6:12 ` Dan Williams
  2017-08-15  9:18   ` Kirill A. Shutemov
       [not found]   ` <150277754211.23945.458876600578531019.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2017-08-15  9:01 ` [PATCH v4 0/3] MAP_DIRECT and block-map sealed files Dave Chinner
  3 siblings, 2 replies; 22+ messages in thread
From: Dan Williams @ 2017-08-15  6:12 UTC (permalink / raw)
  To: darrick.wong
  Cc: Jan Kara, linux-nvdimm, linux-api, Dave Chinner, linux-kernel,
	linux-xfs, linux-mm, Jeff Moyer, Alexander Viro, luto,
	linux-fsdevel, Ross Zwisler, Christoph Hellwig

MAP_DIRECT is an mmap(2) flag with the following semantics:

  MAP_DIRECT
  In addition to this mapping having MAP_SHARED semantics, successful
  faults in this range may assume that the block map (logical-file-offset
  to physical memory address) is pinned for the lifetime of the mapping.
  Successful MAP_DIRECT faults establish mappings that bypass any kernel
  indirections like the page-cache. All updates are carried directly
  through to the underlying file physical blocks (modulo cpu cache
  effects).

  ETXTBSY is returned on attempts to change the block map (allocate blocks
  / convert unwritten extents / break shared extents) in the mapped range.
  Some filesystems may extend these same restrictions outside the mapped
  range and return ETXTBSY to any file operations that might mutate the
  block map. MAP_DIRECT faults may fail with a SIGSEGV if the filesystem
  needs to write the block map to satisfy the fault. For example, if the
  mapping was established over a hole in a sparse file.

  The kernel ignores attempts to mark a MAP_DIRECT mapping MAP_PRIVATE and
  will silently fall back to MAP_SHARED semantics.

  ERRORS
  EACCES A MAP_DIRECT mapping was requested and PROT_WRITE was not set.

  EINVAL MAP_ANONYMOUS was specified with MAP_DIRECT.

  EOPNOTSUPP The filesystem explicitly does not support the flag

  SIGSEGV Attempted to write a MAP_DIRECT mapping at a file offset that
          might require block-map updates.

Cc: Jan Kara <jack@suse.cz>
Cc: Jeff Moyer <jmoyer@redhat.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Dave Chinner <david@fromorbit.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
---
 fs/dax.c                               |    2 +
 fs/xfs/xfs_file.c                      |  109 ++++++++++++++++++++++++++++++++
 fs/xfs/xfs_inode.h                     |    1 
 fs/xfs/xfs_super.c                     |    1 
 include/linux/mm_types.h               |    1 
 include/linux/mman.h                   |    2 -
 include/uapi/asm-generic/mman-common.h |    1 
 mm/mmap.c                              |    2 +
 8 files changed, 117 insertions(+), 2 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 306c2b603fb8..a654b2dd9016 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1121,6 +1121,8 @@ static int dax_fault_return(int error)
 		return VM_FAULT_NOPAGE;
 	if (error == -ENOMEM)
 		return VM_FAULT_OOM;
+	if (error == -ETXTBSY)
+		return VM_FAULT_SIGSEGV;
 	return VM_FAULT_SIGBUS;
 }
 
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index c4893e226fd8..fcdf6d5768aa 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -40,6 +40,7 @@
 #include "xfs_iomap.h"
 #include "xfs_reflink.h"
 
+#include <linux/mman.h>
 #include <linux/dcache.h>
 #include <linux/falloc.h>
 #include <linux/pagevec.h>
@@ -1001,6 +1002,23 @@ xfs_file_llseek(
 	return vfs_setpos(file, offset, inode->i_sb->s_maxbytes);
 }
 
+STATIC int
+xfs_vma_checks(
+	struct vm_area_struct	*vma,
+	struct inode		*inode)
+{
+	if ((vma->fs_flags & MAP_DIRECT) != MAP_DIRECT)
+		return 0;
+
+	if (xfs_is_reflink_inode(XFS_I(inode)))
+		return VM_FAULT_SIGSEGV;
+
+	if (!IS_DAX(inode))
+		return VM_FAULT_SIGSEGV;
+
+	return 0;
+}
+
 /*
  * Locking for serialisation of IO during page faults. This results in a lock
  * ordering of:
@@ -1031,6 +1049,10 @@ xfs_filemap_page_mkwrite(
 	file_update_time(vmf->vma->vm_file);
 	xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
 
+	ret = xfs_vma_checks(vmf->vma, inode);
+	if (ret)
+		goto out_unlock;
+
 	if (IS_DAX(inode)) {
 		ret = dax_iomap_fault(vmf, PE_SIZE_PTE, &xfs_iomap_ops);
 	} else {
@@ -1038,6 +1060,7 @@ xfs_filemap_page_mkwrite(
 		ret = block_page_mkwrite_return(ret);
 	}
 
+out_unlock:
 	xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
 	sb_end_pagefault(inode->i_sb);
 
@@ -1058,10 +1081,15 @@ xfs_filemap_fault(
 		return xfs_filemap_page_mkwrite(vmf);
 
 	xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
+	ret = xfs_vma_checks(vmf->vma, inode);
+	if (ret)
+		goto out_unlock;
+
 	if (IS_DAX(inode))
 		ret = dax_iomap_fault(vmf, PE_SIZE_PTE, &xfs_iomap_ops);
 	else
 		ret = filemap_fault(vmf);
+out_unlock:
 	xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
 
 	return ret;
@@ -1094,7 +1122,9 @@ xfs_filemap_huge_fault(
 	}
 
 	xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
-	ret = dax_iomap_fault(vmf, pe_size, &xfs_iomap_ops);
+	ret = xfs_vma_checks(vmf->vma, inode);
+	if (ret == 0)
+		ret = dax_iomap_fault(vmf, pe_size, &xfs_iomap_ops);
 	xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED);
 
 	if (vmf->flags & FAULT_FLAG_WRITE)
@@ -1137,12 +1167,63 @@ xfs_filemap_pfn_mkwrite(
 
 }
 
+STATIC void
+xfs_filemap_open(
+	struct vm_area_struct	*vma)
+{
+	struct file		*filp = vma->vm_file;
+	struct inode		*inode = file_inode(filp);
+	struct xfs_inode	*ip = XFS_I(inode);
+
+	if ((vma->fs_flags & MAP_DIRECT) != MAP_DIRECT)
+		return;
+	atomic_inc(&ip->i_mapdcount);
+}
+
+STATIC int
+atomic_dec_and_xfs_ilock(
+	atomic_t		*atomic,
+	struct xfs_inode	*ip,
+	uint			lock_flags)
+{
+	/* Subtract 1 from counter unless that drops it to 0 (ie. it was 1) */
+	if (atomic_add_unless(atomic, -1, 1))
+		return 0;
+
+	/* Otherwise do it the slow way */
+	xfs_ilock(ip, lock_flags);
+	if (atomic_dec_and_test(atomic))
+		return 1;
+	xfs_iunlock(ip, lock_flags);
+	return 0;
+}
+
+STATIC void
+xfs_filemap_close(
+	struct vm_area_struct	*vma)
+{
+	struct file		*filp = vma->vm_file;
+	struct inode		*inode = file_inode(filp);
+	struct xfs_inode	*ip = XFS_I(inode);
+
+	if ((vma->fs_flags & MAP_DIRECT) != MAP_DIRECT)
+		return;
+
+	if (!atomic_dec_and_xfs_ilock(&ip->i_mapdcount, ip,
+				XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL))
+		return;
+	inode->i_flags &= ~S_IOMAP_SEALED;
+	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL | XFS_IOLOCK_EXCL);
+}
+
 static const struct vm_operations_struct xfs_file_vm_ops = {
 	.fault		= xfs_filemap_fault,
 	.huge_fault	= xfs_filemap_huge_fault,
 	.map_pages	= filemap_map_pages,
 	.page_mkwrite	= xfs_filemap_page_mkwrite,
 	.pfn_mkwrite	= xfs_filemap_pfn_mkwrite,
+	.open		= xfs_filemap_open,
+	.close		= xfs_filemap_close,
 };
 
 STATIC int
@@ -1157,6 +1238,31 @@ xfs_file_mmap(
 	return 0;
 }
 
+#define	XFS_MAP_SUPPORTED (MAP_DIRECT)
+
+STATIC int
+xfs_file_fmmap(
+	struct file		*filp,
+	struct vm_area_struct	*vma,
+	unsigned long		flags)
+{
+	struct inode		*inode = file_inode(filp);
+	struct xfs_inode	*ip = XFS_I(inode);
+
+	if (flags & ~(XFS_MAP_SUPPORTED))
+		return -EOPNOTSUPP;
+
+	xfs_ilock(ip, XFS_MMAPLOCK_EXCL|XFS_IOLOCK_EXCL);
+	if ((flags & MAP_DIRECT) == MAP_DIRECT) {
+		vma->fs_flags |= MAP_DIRECT;
+		inode->i_flags |= S_IOMAP_SEALED;
+		atomic_inc(&ip->i_mapdcount);
+	}
+	xfs_iunlock(ip, XFS_MMAPLOCK_EXCL|XFS_IOLOCK_EXCL);
+
+	return xfs_file_mmap(filp, vma);
+}
+
 const struct file_operations xfs_file_operations = {
 	.llseek		= xfs_file_llseek,
 	.read_iter	= xfs_file_read_iter,
@@ -1168,6 +1274,7 @@ const struct file_operations xfs_file_operations = {
 	.compat_ioctl	= xfs_file_compat_ioctl,
 #endif
 	.mmap		= xfs_file_mmap,
+	.fmmap		= xfs_file_fmmap,
 	.open		= xfs_file_open,
 	.release	= xfs_file_release,
 	.fsync		= xfs_file_fsync,
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 0ee453de239a..50d3e1bca1a9 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -58,6 +58,7 @@ typedef struct xfs_inode {
 	mrlock_t		i_lock;		/* inode lock */
 	mrlock_t		i_mmaplock;	/* inode mmap IO lock */
 	atomic_t		i_pincount;	/* inode pin count */
+	atomic_t		i_mapdcount;	/* inode MAP_DIRECT count */
 	spinlock_t		i_flags_lock;	/* inode i_flags lock */
 	/* Miscellaneous state. */
 	unsigned long		i_flags;	/* see defined flags below */
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 664db709cd1a..2604568354db 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -1011,6 +1011,7 @@ xfs_fs_inode_init_once(
 
 	/* xfs inode */
 	atomic_set(&ip->i_pincount, 0);
+	atomic_set(&ip->i_mapdcount, 0);
 	spin_lock_init(&ip->i_flags_lock);
 
 	mrlock_init(&ip->i_mmaplock, MRLOCK_ALLOW_EQUAL_PRI|MRLOCK_BARRIER,
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index ff151814a02d..73fdc0ada9ee 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -306,6 +306,7 @@ struct vm_area_struct {
 	struct mm_struct *vm_mm;	/* The address space we belong to. */
 	pgprot_t vm_page_prot;		/* Access permissions of this VMA. */
 	unsigned long vm_flags;		/* Flags, see mm.h. */
+	unsigned long fs_flags;		/* fs flags, see MAP_DIRECT etc */
 
 	/*
 	 * For areas with an address space and backing store,
diff --git a/include/linux/mman.h b/include/linux/mman.h
index 73d4ac7e7136..dc120995f684 100644
--- a/include/linux/mman.h
+++ b/include/linux/mman.h
@@ -8,7 +8,7 @@
 #include <uapi/linux/mman.h>
 
 /* the MAP_VALIDATE set of supported flags */
-#define	MAP_SUPPORTED_MASK (0)
+#define	MAP_SUPPORTED_MASK (MAP_DIRECT)
 
 extern int sysctl_overcommit_memory;
 extern int sysctl_overcommit_ratio;
diff --git a/include/uapi/asm-generic/mman-common.h b/include/uapi/asm-generic/mman-common.h
index 8bf8c7828275..a16184402c45 100644
--- a/include/uapi/asm-generic/mman-common.h
+++ b/include/uapi/asm-generic/mman-common.h
@@ -25,6 +25,7 @@
 # define MAP_UNINITIALIZED 0x0		/* Don't support this flag */
 #endif
 #define MAP_VALIDATE (MAP_SHARED|MAP_PRIVATE) /* mechanism to define new shared semantics */
+#define MAP_DIRECT (MAP_VALIDATE | 0x40)	/* shared, sealed, and no page cache */
 
 /*
  * Flags for mlock
diff --git a/mm/mmap.c b/mm/mmap.c
index d2919a9e25bf..f12de3859fec 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1393,6 +1393,8 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 				return -EINVAL;
 			if (!file->f_op->fmmap)
 				return -EOPNOTSUPP;
+			if ((flags & MAP_DIRECT) && !(prot & PROT_WRITE))
+				return -EACCES;
 			/* fall through */
 		case MAP_SHARED:
 			if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))

^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 0/3] MAP_DIRECT and block-map sealed files
  2017-08-15  6:12 [PATCH v4 0/3] MAP_DIRECT and block-map sealed files Dan Williams
                   ` (2 preceding siblings ...)
  2017-08-15  6:12 ` [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges Dan Williams
@ 2017-08-15  9:01 ` Dave Chinner
  3 siblings, 0 replies; 22+ messages in thread
From: Dave Chinner @ 2017-08-15  9:01 UTC (permalink / raw)
  To: Dan Williams
  Cc: darrick.wong, Jan Kara, Arnd Bergmann, linux-nvdimm, linux-api,
	linux-kernel, Christoph Hellwig, linux-xfs, linux-mm, Jeff Moyer,
	Alexander Viro, luto, linux-fsdevel, Andrew Morton, Ross Zwisler

On Mon, Aug 14, 2017 at 11:12:05PM -0700, Dan Williams wrote:
> Changes since v3 [1]:
> * Move from an fallocate(2) interface to a new mmap(2) flag and rename
>   'immutable' to 'sealed'.
> 
> * Do not record the sealed state in permanent metadata it is now purely
>   a temporary state for as long as a MAP_DIRECT vma is referencing the
>   inode (Christoph)
> 
> * Drop the CAP_IMMUTABLE requirement, but do require a PROT_WRITE
>   mapping.
> 
> [1]: https://lwn.net/Articles/730570/
> 
> ---
> 
> This is the next revision of a patch series that aims to enable
> applications that otherwise need to resort to DAX mapping a raw device
> file to instead move to a filesystem.
> 
> In the course of reviewing a previous posting, Christoph said:
> 
>     That being said I think we absolutely should support RDMA memory
>     registrations for DAX mappings.  I'm just not sure how S_IOMAP_IMMUTABLE
>     helps with that.  We'll want a MAP_SYNC | MAP_POPULATE to make sure all
>     the blocks are populated and all ptes are set up.  Second we need to
>     make sure get_user_page works, which for now means we'll need a struct
>     page mapping for the region (which will be really annoying for PCIe
>     mappings, like the upcoming NVMe persistent memory region), and we need
>     to guarantee that the extent mapping won't change while the
>     get_user_pages holds the pages inside it.  I think that is true due to
>     side effects even with the current DAX code, but we'll need to make it
>     explicit.  And maybe that's where we need to converge - "sealing" the
>     extent map makes sense as such a temporary measure that is not persisted
>     on disk, which automatically gets released when the holding process
>     exits, because we sort of already do this implicitly.  It might also
>     make sense to have explicitly breakable seals similar to what I do for
>     the pNFS blocks kernel server, as any userspace RDMA file server would
>     also need those semantics.
> 
> So, this is an attempt to converge on the idea that we need an explicit
> and process-lifetime-temporary mechanism for a process to be able to
> make assumptions about the mapping to physical page to dax-file-offset
> relationship. The "explicitly breakable seals" aspect is not addressed
> in these patches, but I wonder if it might be a voluntary mechanism that
> can implemented via userfaultfd.
> 
> These pass a basic smoke test and are meant to just gauge 'right track'
> / 'wrong track'. The main question it seems is whether the pinning done
> in this patchset is too early (applies before get_user_pages()) and too
> coarse (applies to the whole file). Perhaps this is where I discarded
> too easily Jan's suggestion to look at Peter Z's mm_mpin() syscall [2]? On
> the other hand, the coarseness and simple lifetime rules of MAP_DIRECT
> make it an easy mechanism to implement and explain.
> 
> Another reason I kept the scope of S_IOMAP_SEALED coarsely defined was
> to support Dave's desired use case of sealing for operating on reflinked
> files [3].

Which really needs a fcntl() interface to set/clear iomap seals.

Which, now that I look at it, already has a bunch of "file sealing"
commands defined which arrived in 3.17. It appears to be a special
purpose access control interface for memfd_create() to manage shared
access to anonymous tmpfs files and will EINVAL on any fd that
points to a real file.

Oh, even more problematic:

	Seals are a property of an inode. [....] Furthermore, seals
	can never be removed, only added.

That seems somewhat difficult to reconcile with how I need
F_SEAL_IOMAP to operate.

/me calls it a day and goes looking for the hard liquor.....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges
  2017-08-15  6:12 ` [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges Dan Williams
@ 2017-08-15  9:18   ` Kirill A. Shutemov
  2017-08-15 17:11     ` Dan Williams
       [not found]   ` <150277754211.23945.458876600578531019.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  1 sibling, 1 reply; 22+ messages in thread
From: Kirill A. Shutemov @ 2017-08-15  9:18 UTC (permalink / raw)
  To: Dan Williams
  Cc: darrick.wong, Jan Kara, linux-nvdimm, linux-api, Dave Chinner,
	linux-kernel, linux-xfs, linux-mm, Jeff Moyer, Alexander Viro,
	luto, linux-fsdevel, Ross Zwisler, Christoph Hellwig

On Mon, Aug 14, 2017 at 11:12:22PM -0700, Dan Williams wrote:
> MAP_DIRECT is an mmap(2) flag with the following semantics:
> 
>   MAP_DIRECT
>   In addition to this mapping having MAP_SHARED semantics, successful
>   faults in this range may assume that the block map (logical-file-offset
>   to physical memory address) is pinned for the lifetime of the mapping.
>   Successful MAP_DIRECT faults establish mappings that bypass any kernel
>   indirections like the page-cache. All updates are carried directly
>   through to the underlying file physical blocks (modulo cpu cache
>   effects).
> 
>   ETXTBSY is returned on attempts to change the block map (allocate blocks
>   / convert unwritten extents / break shared extents) in the mapped range.
>   Some filesystems may extend these same restrictions outside the mapped
>   range and return ETXTBSY to any file operations that might mutate the
>   block map. MAP_DIRECT faults may fail with a SIGSEGV if the filesystem
>   needs to write the block map to satisfy the fault. For example, if the
>   mapping was established over a hole in a sparse file.

We had issues before with user-imposed ETXTBSY. See MAP_DENYWRITE.

Are we sure it won't a source of denial-of-service attacks?

>   The kernel ignores attempts to mark a MAP_DIRECT mapping MAP_PRIVATE and
>   will silently fall back to MAP_SHARED semantics.

Hm.. Any reason for this strage behaviour? Looks just broken to me.

> 
>   ERRORS
>   EACCES A MAP_DIRECT mapping was requested and PROT_WRITE was not set.
> 
>   EINVAL MAP_ANONYMOUS was specified with MAP_DIRECT.
> 
>   EOPNOTSUPP The filesystem explicitly does not support the flag
> 
>   SIGSEGV Attempted to write a MAP_DIRECT mapping at a file offset that
>           might require block-map updates.

I think it should be SIGBUS.

-- 
 Kirill A. Shutemov

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
  2017-08-15  6:12   ` [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags Dan Williams
@ 2017-08-15 12:27     ` Jan Kara
       [not found]       ` <20170815122701.GF27505-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
  2017-09-17  3:44       ` Dan Williams
  2017-08-15 16:28     ` Andy Lutomirski
  2017-08-17  8:06     ` kbuild test robot
  2 siblings, 2 replies; 22+ messages in thread
From: Jan Kara @ 2017-08-15 12:27 UTC (permalink / raw)
  To: Dan Williams
  Cc: darrick.wong, Jan Kara, Arnd Bergmann, linux-nvdimm, linux-api,
	linux-kernel, linux-xfs, linux-mm, luto, linux-fsdevel,
	Andrew Morton, Christoph Hellwig

On Mon 14-08-17 23:12:16, Dan Williams wrote:
> The mmap syscall suffers from the ABI anti-pattern of not validating
> unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
> mechanism to define new behavior that is known to fail on older kernels
> without the feature. Use the fact that specifying MAP_SHARED and
> MAP_PRIVATE at the same time is invalid as a cute hack to allow a new
> set of validated flags to be introduced.
> 
> This also introduces the ->fmmap() file operation that is ->mmap() plus
> flags. Each ->fmmap() implementation must fail requests when a locally
> unsupported flag is specified.
...
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 1104e5df39ef..bbe755d0caee 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1674,6 +1674,7 @@ struct file_operations {
>  	long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
>  	long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
>  	int (*mmap) (struct file *, struct vm_area_struct *);
> +	int (*fmmap) (struct file *, struct vm_area_struct *, unsigned long);
>  	int (*open) (struct inode *, struct file *);
>  	int (*flush) (struct file *, fl_owner_t id);
>  	int (*release) (struct inode *, struct file *);
> @@ -1748,6 +1749,12 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
>  	return file->f_op->mmap(file, vma);
>  }
>  
> +static inline int call_fmmap(struct file *file, struct vm_area_struct *vma,
> +		unsigned long flags)
> +{
> +	return file->f_op->fmmap(file, vma, flags);
> +}
> +

Hum, I dislike a new file op for this when the only problem with ->mmap is
that it misses 'flags' argument. I understand there are lots of ->mmap
implementations out there and modifying prototype of them all is painful
but is it so bad? Coccinelle patch for this should be rather easy...

Also for MAP_SYNC I want the flag to be copied in VMA anyway so for that I
don't need additional flags argument anyway. And I wonder how you want to
make things work without VMA flag in case of MAP_DIRECT as well - VMAs can
be split, partially unmapped etc. and so without VMA flag you are going to
have hard time to detect whether there's any mapping left which blocks
block mapping changes.

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges
       [not found]   ` <150277754211.23945.458876600578531019.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
@ 2017-08-15 12:42     ` Jan Kara
       [not found]       ` <20170815124250.GG27505-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
  2017-08-17  8:49     ` kbuild test robot
  1 sibling, 1 reply; 22+ messages in thread
From: Jan Kara @ 2017-08-15 12:42 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	darrick.wong-QHcLZuEGTsvQT0dZR+AlfA, Dave Chinner,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, Alexander Viro,
	luto-DgEjT+Ai2ygdnm+yROfE0A,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig

On Mon 14-08-17 23:12:22, Dan Williams wrote:
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index ff151814a02d..73fdc0ada9ee 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -306,6 +306,7 @@ struct vm_area_struct {
>  	struct mm_struct *vm_mm;	/* The address space we belong to. */
>  	pgprot_t vm_page_prot;		/* Access permissions of this VMA. */
>  	unsigned long vm_flags;		/* Flags, see mm.h. */
> +	unsigned long fs_flags;		/* fs flags, see MAP_DIRECT etc */
>  
>  	/*
>  	 * For areas with an address space and backing store,

Ah, OK, here are VMA flags I was missing in the previous patch :) But why
did you create separate fs_flags field for this? on 64-bit archs there's
still space in vm_flags and frankly I don't see why we should separate
MAP_DIRECT or MAP_SYNC from other flags? After all a difference in these
flags must also prevent VMA merging (which you forgot to handle I think)
and they need to be copied on split (which happens by chance even now).

								Honza
-- 
Jan Kara <jack-IBi9RG/b67k@public.gmane.org>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
       [not found]       ` <20170815122701.GF27505-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
@ 2017-08-15 16:24         ` Dan Williams
  0 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2017-08-15 16:24 UTC (permalink / raw)
  To: Jan Kara
  Cc: Arnd Bergmann, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw, Linux API,
	Darrick J. Wong, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA, Linux MM, Andy Lutomirski,
	linux-fsdevel, Andrew Morton, Christoph Hellwig

On Tue, Aug 15, 2017 at 5:27 AM, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote:
> On Mon 14-08-17 23:12:16, Dan Williams wrote:
>> The mmap syscall suffers from the ABI anti-pattern of not validating
>> unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
>> mechanism to define new behavior that is known to fail on older kernels
>> without the feature. Use the fact that specifying MAP_SHARED and
>> MAP_PRIVATE at the same time is invalid as a cute hack to allow a new
>> set of validated flags to be introduced.
>>
>> This also introduces the ->fmmap() file operation that is ->mmap() plus
>> flags. Each ->fmmap() implementation must fail requests when a locally
>> unsupported flag is specified.
> ...
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 1104e5df39ef..bbe755d0caee 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -1674,6 +1674,7 @@ struct file_operations {
>>       long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
>>       long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
>>       int (*mmap) (struct file *, struct vm_area_struct *);
>> +     int (*fmmap) (struct file *, struct vm_area_struct *, unsigned long);
>>       int (*open) (struct inode *, struct file *);
>>       int (*flush) (struct file *, fl_owner_t id);
>>       int (*release) (struct inode *, struct file *);
>> @@ -1748,6 +1749,12 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
>>       return file->f_op->mmap(file, vma);
>>  }
>>
>> +static inline int call_fmmap(struct file *file, struct vm_area_struct *vma,
>> +             unsigned long flags)
>> +{
>> +     return file->f_op->fmmap(file, vma, flags);
>> +}
>> +
>
> Hum, I dislike a new file op for this when the only problem with ->mmap is
> that it misses 'flags' argument. I understand there are lots of ->mmap
> implementations out there and modifying prototype of them all is painful
> but is it so bad? Coccinelle patch for this should be rather easy...

Changing the prototype is relatively easy with Coccinelle, but we
still need the code in each ->mmap() implementation to validate a
local list of supported flags. How about adding a 'supported mmap
flags' field to 'struct file_operations' so that the validation code
can be made generic? I'll go with that since it's a bit less
surprising than a new operation type, and not as messy as teaching
every mmap implementation in the kernel to validate flags that they
will likely never care about.

> Also for MAP_SYNC I want the flag to be copied in VMA anyway so for that I
> don't need additional flags argument anyway. And I wonder how you want to
> make things work without VMA flag in case of MAP_DIRECT as well - VMAs can
> be split, partially unmapped etc. and so without VMA flag you are going to
> have hard time to detect whether there's any mapping left which blocks
> block mapping changes.

Outside of requiring a 64-bit arch, we're out of vm_flags. Also, the
core mm does not really care about MAP_DIRECT or MAP_SYNC so that's
why I added a new ->fs_flags field since these are more filesystem
properties than core mm.

The problem of tracking MAP_DIRECT over vma splits appears to already
be handled. __split_vma does:

        /* most fields are the same, copy all, and then fixup */
        *new = *vma;
...

        if (new->vm_ops && new->vm_ops->open)
                new->vm_ops->open(new);

In ->open() I'm checking if 'new' has MAP_DIRECT in ->fs_flags and
taking a reference against the S_IOMAP_SEALED flag.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
  2017-08-15  6:12   ` [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags Dan Williams
  2017-08-15 12:27     ` Jan Kara
@ 2017-08-15 16:28     ` Andy Lutomirski
  2017-08-15 22:31       ` Dan Williams
  2017-08-17  8:06     ` kbuild test robot
  2 siblings, 1 reply; 22+ messages in thread
From: Andy Lutomirski @ 2017-08-15 16:28 UTC (permalink / raw)
  To: Dan Williams
  Cc: Darrick J. Wong, Jan Kara, Arnd Bergmann, linux-nvdimm,
	Linux API, linux-kernel, linux-xfs, linux-mm, Andrew Lutomirski,
	Linux FS Devel, Andrew Morton, Christoph Hellwig

On Mon, Aug 14, 2017 at 11:12 PM, Dan Williams <dan.j.williams@intel.com> wrote:
> The mmap syscall suffers from the ABI anti-pattern of not validating
> unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
> mechanism to define new behavior that is known to fail on older kernels
> without the feature. Use the fact that specifying MAP_SHARED and
> MAP_PRIVATE at the same time is invalid as a cute hack to allow a new
> set of validated flags to be introduced.

While this is cute, is it actually better than a new syscall?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges
       [not found]       ` <20170815124250.GG27505-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
@ 2017-08-15 16:29         ` Dan Williams
       [not found]           ` <CAPcyv4h01os0Gc6bYmaGdMXt5q4G4zfirNRPWG3=gQi5POrpmg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Dan Williams @ 2017-08-15 16:29 UTC (permalink / raw)
  To: Jan Kara
  Cc: Darrick J. Wong, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw, Linux API,
	Dave Chinner, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA, Linux MM, Jeff Moyer,
	Alexander Viro, Andy Lutomirski, linux-fsdevel, Ross Zwisler,
	Christoph Hellwig

On Tue, Aug 15, 2017 at 5:42 AM, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote:
> On Mon 14-08-17 23:12:22, Dan Williams wrote:
>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>> index ff151814a02d..73fdc0ada9ee 100644
>> --- a/include/linux/mm_types.h
>> +++ b/include/linux/mm_types.h
>> @@ -306,6 +306,7 @@ struct vm_area_struct {
>>       struct mm_struct *vm_mm;        /* The address space we belong to. */
>>       pgprot_t vm_page_prot;          /* Access permissions of this VMA. */
>>       unsigned long vm_flags;         /* Flags, see mm.h. */
>> +     unsigned long fs_flags;         /* fs flags, see MAP_DIRECT etc */
>>
>>       /*
>>        * For areas with an address space and backing store,
>
> Ah, OK, here are VMA flags I was missing in the previous patch :) But why
> did you create separate fs_flags field for this? on 64-bit archs there's
> still space in vm_flags and frankly I don't see why we should separate
> MAP_DIRECT or MAP_SYNC from other flags?

Where would MAP_DIRECT go in the 32-bit case?

> After all a difference in these
> flags must also prevent VMA merging (which you forgot to handle I think)
> and they need to be copied on split (which happens by chance even now).

Ah, yes I did miss blocking the merge of a vma with MAP_DIRECT and one
without. However, the vma split path looks ok.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges
  2017-08-15  9:18   ` Kirill A. Shutemov
@ 2017-08-15 17:11     ` Dan Williams
  2017-08-16 10:25       ` Kirill A. Shutemov
  0 siblings, 1 reply; 22+ messages in thread
From: Dan Williams @ 2017-08-15 17:11 UTC (permalink / raw)
  To: Kirill A. Shutemov
  Cc: Darrick J. Wong, Jan Kara, linux-nvdimm, Linux API, Dave Chinner,
	linux-kernel, linux-xfs, Linux MM, Jeff Moyer, Alexander Viro,
	Andy Lutomirski, linux-fsdevel, Ross Zwisler, Christoph Hellwig

On Tue, Aug 15, 2017 at 2:18 AM, Kirill A. Shutemov
<kirill@shutemov.name> wrote:
> On Mon, Aug 14, 2017 at 11:12:22PM -0700, Dan Williams wrote:
>> MAP_DIRECT is an mmap(2) flag with the following semantics:
>>
>>   MAP_DIRECT
>>   In addition to this mapping having MAP_SHARED semantics, successful
>>   faults in this range may assume that the block map (logical-file-offset
>>   to physical memory address) is pinned for the lifetime of the mapping.
>>   Successful MAP_DIRECT faults establish mappings that bypass any kernel
>>   indirections like the page-cache. All updates are carried directly
>>   through to the underlying file physical blocks (modulo cpu cache
>>   effects).
>>
>>   ETXTBSY is returned on attempts to change the block map (allocate blocks
>>   / convert unwritten extents / break shared extents) in the mapped range.
>>   Some filesystems may extend these same restrictions outside the mapped
>>   range and return ETXTBSY to any file operations that might mutate the
>>   block map. MAP_DIRECT faults may fail with a SIGSEGV if the filesystem
>>   needs to write the block map to satisfy the fault. For example, if the
>>   mapping was established over a hole in a sparse file.
>
> We had issues before with user-imposed ETXTBSY. See MAP_DENYWRITE.
>
> Are we sure it won't a source of denial-of-service attacks?

I believe MAP_DENYWRITE allowed any application with read access to be
able to deny writes which is obviously problematic. MAP_DIRECT is
different. You need write access to the file so you can already
destroy data that another application might depend on, and this only
blocks allocation and reflink.

However, I'm not opposed to adding more safety around this. I think we
can address this concern with an fcntl seal as Dave suggests, but the
seal only applies to the 'struct file' instance and only gates whether
MAP_DIRECT is allowed on that file. The act of setting
F_MAY_SEAL_IOMAP requires CAP_IMMUTABLE, but MAP_DIRECT does not. This
allows the 'permission to mmap(MAP_DIRECT)' to be passed around with
an open file descriptor.

>
>>   The kernel ignores attempts to mark a MAP_DIRECT mapping MAP_PRIVATE and
>>   will silently fall back to MAP_SHARED semantics.
>
> Hm.. Any reason for this strage behaviour? Looks just broken to me.
>
>>
>>   ERRORS
>>   EACCES A MAP_DIRECT mapping was requested and PROT_WRITE was not set.
>>
>>   EINVAL MAP_ANONYMOUS was specified with MAP_DIRECT.
>>
>>   EOPNOTSUPP The filesystem explicitly does not support the flag
>>
>>   SIGSEGV Attempted to write a MAP_DIRECT mapping at a file offset that
>>           might require block-map updates.
>
> I think it should be SIGBUS.

Ok, that does seem to fit this definition from the mmap(2) man page:

SIGBUS Attempted access to a portion of the buffer that does not
correspond to the file  (for example, beyond the end of the file,
including the case where another process has truncated the file).

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
  2017-08-15 16:28     ` Andy Lutomirski
@ 2017-08-15 22:31       ` Dan Williams
  0 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2017-08-15 22:31 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Darrick J. Wong, Jan Kara, Arnd Bergmann, linux-nvdimm,
	Linux API, linux-kernel, linux-xfs, linux-mm, Linux FS Devel,
	Andrew Morton, Christoph Hellwig

On Tue, Aug 15, 2017 at 9:28 AM, Andy Lutomirski <luto@kernel.org> wrote:
> On Mon, Aug 14, 2017 at 11:12 PM, Dan Williams <dan.j.williams@intel.com> wrote:
>> The mmap syscall suffers from the ABI anti-pattern of not validating
>> unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
>> mechanism to define new behavior that is known to fail on older kernels
>> without the feature. Use the fact that specifying MAP_SHARED and
>> MAP_PRIVATE at the same time is invalid as a cute hack to allow a new
>> set of validated flags to be introduced.
>
> While this is cute, is it actually better than a new syscall?

After playing with MAP_DIRECT defined as (MAP_SHARED|MAP_PRIVATE|0x40)
I think a new syscall is better. It's very easy to make the mistake
that "MAP_DIRECT" defines a single flag vs representing a multi-bit
encoding.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges
       [not found]           ` <CAPcyv4h01os0Gc6bYmaGdMXt5q4G4zfirNRPWG3=gQi5POrpmg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-08-16  1:15             ` Dan Williams
  0 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2017-08-16  1:15 UTC (permalink / raw)
  To: Jan Kara
  Cc: linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw, Linux API, Darrick J. Wong,
	Dave Chinner, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA, Linux MM, Alexander Viro,
	Andy Lutomirski, linux-fsdevel, Christoph Hellwig

On Tue, Aug 15, 2017 at 9:29 AM, Dan Williams <dan.j.williams-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
> On Tue, Aug 15, 2017 at 5:42 AM, Jan Kara <jack-AlSwsSmVLrQ@public.gmane.org> wrote:
>> On Mon 14-08-17 23:12:22, Dan Williams wrote:
>>> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
>>> index ff151814a02d..73fdc0ada9ee 100644
>>> --- a/include/linux/mm_types.h
>>> +++ b/include/linux/mm_types.h
>>> @@ -306,6 +306,7 @@ struct vm_area_struct {
>>>       struct mm_struct *vm_mm;        /* The address space we belong to. */
>>>       pgprot_t vm_page_prot;          /* Access permissions of this VMA. */
>>>       unsigned long vm_flags;         /* Flags, see mm.h. */
>>> +     unsigned long fs_flags;         /* fs flags, see MAP_DIRECT etc */
>>>
>>>       /*
>>>        * For areas with an address space and backing store,
>>
>> Ah, OK, here are VMA flags I was missing in the previous patch :) But why
>> did you create separate fs_flags field for this? on 64-bit archs there's
>> still space in vm_flags and frankly I don't see why we should separate
>> MAP_DIRECT or MAP_SYNC from other flags?
>
> Where would MAP_DIRECT go in the 32-bit case?
>
>> After all a difference in these
>> flags must also prevent VMA merging (which you forgot to handle I think)
>> and they need to be copied on split (which happens by chance even now).
>
> Ah, yes I did miss blocking the merge of a vma with MAP_DIRECT and one
> without. However, the vma split path looks ok.

The merge path already blocks merging vmas that have the ->close()
operation defined in is_mergeable_vma().

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges
  2017-08-15 17:11     ` Dan Williams
@ 2017-08-16 10:25       ` Kirill A. Shutemov
  0 siblings, 0 replies; 22+ messages in thread
From: Kirill A. Shutemov @ 2017-08-16 10:25 UTC (permalink / raw)
  To: Dan Williams
  Cc: Darrick J. Wong, Jan Kara, linux-nvdimm, Linux API, Dave Chinner,
	linux-kernel, linux-xfs, Linux MM, Jeff Moyer, Alexander Viro,
	Andy Lutomirski, linux-fsdevel, Ross Zwisler, Christoph Hellwig

On Tue, Aug 15, 2017 at 10:11:27AM -0700, Dan Williams wrote:
> > We had issues before with user-imposed ETXTBSY. See MAP_DENYWRITE.
> >
> > Are we sure it won't a source of denial-of-service attacks?
> 
> I believe MAP_DENYWRITE allowed any application with read access to be
> able to deny writes which is obviously problematic. MAP_DIRECT is
> different. You need write access to the file so you can already
> destroy data that another application might depend on, and this only
> blocks allocation and reflink.
> 
> However, I'm not opposed to adding more safety around this. I think we
> can address this concern with an fcntl seal as Dave suggests, but the
> seal only applies to the 'struct file' instance and only gates whether
> MAP_DIRECT is allowed on that file. The act of setting
> F_MAY_SEAL_IOMAP requires CAP_IMMUTABLE, but MAP_DIRECT does not. This
> allows the 'permission to mmap(MAP_DIRECT)' to be passed around with
> an open file descriptor.

Sounds like a good approach to me.

-- 
 Kirill A. Shutemov

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
  2017-08-15  6:12   ` [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags Dan Williams
  2017-08-15 12:27     ` Jan Kara
  2017-08-15 16:28     ` Andy Lutomirski
@ 2017-08-17  8:06     ` kbuild test robot
  2 siblings, 0 replies; 22+ messages in thread
From: kbuild test robot @ 2017-08-17  8:06 UTC (permalink / raw)
  To: Dan Williams
  Cc: kbuild-all, darrick.wong, Jan Kara, Arnd Bergmann, linux-nvdimm,
	linux-api, linux-kernel, linux-xfs, linux-mm, luto,
	linux-fsdevel, Andrew Morton, Christoph Hellwig

[-- Attachment #1: Type: text/plain, Size: 6397 bytes --]

Hi Dan,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.13-rc5 next-20170816]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Dan-Williams/fs-xfs-introduce-S_IOMAP_SEALED/20170817-114711
config: xtensa-allmodconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 4.9.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=xtensa 

All errors (new ones prefixed by >>):

   mm/mmap.c: In function 'do_mmap':
>> mm/mmap.c:1391:8: error: 'MAP_VALIDATE' undeclared (first use in this function)
      case MAP_VALIDATE:
           ^
   mm/mmap.c:1391:8: note: each undeclared identifier is reported only once for each function it appears in

vim +/MAP_VALIDATE +1391 mm/mmap.c

  1316	
  1317	/*
  1318	 * The caller must hold down_write(&current->mm->mmap_sem).
  1319	 */
  1320	unsigned long do_mmap(struct file *file, unsigned long addr,
  1321				unsigned long len, unsigned long prot,
  1322				unsigned long flags, vm_flags_t vm_flags,
  1323				unsigned long pgoff, unsigned long *populate,
  1324				struct list_head *uf)
  1325	{
  1326		struct mm_struct *mm = current->mm;
  1327		int pkey = 0;
  1328	
  1329		*populate = 0;
  1330	
  1331		if (!len)
  1332			return -EINVAL;
  1333	
  1334		/*
  1335		 * Does the application expect PROT_READ to imply PROT_EXEC?
  1336		 *
  1337		 * (the exception is when the underlying filesystem is noexec
  1338		 *  mounted, in which case we dont add PROT_EXEC.)
  1339		 */
  1340		if ((prot & PROT_READ) && (current->personality & READ_IMPLIES_EXEC))
  1341			if (!(file && path_noexec(&file->f_path)))
  1342				prot |= PROT_EXEC;
  1343	
  1344		if (!(flags & MAP_FIXED))
  1345			addr = round_hint_to_min(addr);
  1346	
  1347		/* Careful about overflows.. */
  1348		len = PAGE_ALIGN(len);
  1349		if (!len)
  1350			return -ENOMEM;
  1351	
  1352		/* offset overflow? */
  1353		if ((pgoff + (len >> PAGE_SHIFT)) < pgoff)
  1354			return -EOVERFLOW;
  1355	
  1356		/* Too many mappings? */
  1357		if (mm->map_count > sysctl_max_map_count)
  1358			return -ENOMEM;
  1359	
  1360		/* Obtain the address to map to. we verify (or select) it and ensure
  1361		 * that it represents a valid section of the address space.
  1362		 */
  1363		addr = get_unmapped_area(file, addr, len, pgoff, flags);
  1364		if (offset_in_page(addr))
  1365			return addr;
  1366	
  1367		if (prot == PROT_EXEC) {
  1368			pkey = execute_only_pkey(mm);
  1369			if (pkey < 0)
  1370				pkey = 0;
  1371		}
  1372	
  1373		/* Do simple checking here so the lower-level routines won't have
  1374		 * to. we assume access permissions have been handled by the open
  1375		 * of the memory object, so we don't do any here.
  1376		 */
  1377		vm_flags |= calc_vm_prot_bits(prot, pkey) | calc_vm_flag_bits(flags) |
  1378				mm->def_flags | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC;
  1379	
  1380		if (flags & MAP_LOCKED)
  1381			if (!can_do_mlock())
  1382				return -EPERM;
  1383	
  1384		if (mlock_future_check(mm, vm_flags, len))
  1385			return -EAGAIN;
  1386	
  1387		if (file) {
  1388			struct inode *inode = file_inode(file);
  1389	
  1390			switch (flags & MAP_TYPE) {
> 1391			case MAP_VALIDATE:
  1392				if (flags & ~(MAP_SUPPORTED_MASK | MAP_VALIDATE))
  1393					return -EINVAL;
  1394				if (!file->f_op->fmmap)
  1395					return -EOPNOTSUPP;
  1396				/* fall through */
  1397			case MAP_SHARED:
  1398				if ((prot&PROT_WRITE) && !(file->f_mode&FMODE_WRITE))
  1399					return -EACCES;
  1400	
  1401				/*
  1402				 * Make sure we don't allow writing to an append-only
  1403				 * file..
  1404				 */
  1405				if (IS_APPEND(inode) && (file->f_mode & FMODE_WRITE))
  1406					return -EACCES;
  1407	
  1408				/*
  1409				 * Make sure there are no mandatory locks on the file.
  1410				 */
  1411				if (locks_verify_locked(file))
  1412					return -EAGAIN;
  1413	
  1414				vm_flags |= VM_SHARED | VM_MAYSHARE;
  1415				if (!(file->f_mode & FMODE_WRITE))
  1416					vm_flags &= ~(VM_MAYWRITE | VM_SHARED);
  1417	
  1418				/* fall through */
  1419			case MAP_PRIVATE:
  1420				if (!(file->f_mode & FMODE_READ))
  1421					return -EACCES;
  1422				if (path_noexec(&file->f_path)) {
  1423					if (vm_flags & VM_EXEC)
  1424						return -EPERM;
  1425					vm_flags &= ~VM_MAYEXEC;
  1426				}
  1427	
  1428				if (!file->f_op->mmap)
  1429					return -ENODEV;
  1430				if (vm_flags & (VM_GROWSDOWN|VM_GROWSUP))
  1431					return -EINVAL;
  1432				break;
  1433	
  1434			default:
  1435				return -EINVAL;
  1436			}
  1437		} else {
  1438			switch (flags & MAP_TYPE) {
  1439			case MAP_SHARED:
  1440				if (vm_flags & (VM_GROWSDOWN|VM_GROWSUP))
  1441					return -EINVAL;
  1442				/*
  1443				 * Ignore pgoff.
  1444				 */
  1445				pgoff = 0;
  1446				vm_flags |= VM_SHARED | VM_MAYSHARE;
  1447				break;
  1448			case MAP_PRIVATE:
  1449				/*
  1450				 * Set pgoff according to addr for anon_vma.
  1451				 */
  1452				pgoff = addr >> PAGE_SHIFT;
  1453				break;
  1454			default:
  1455				return -EINVAL;
  1456			}
  1457		}
  1458	
  1459		/*
  1460		 * Set 'VM_NORESERVE' if we should not account for the
  1461		 * memory use of this mapping.
  1462		 */
  1463		if (flags & MAP_NORESERVE) {
  1464			/* We honor MAP_NORESERVE if allowed to overcommit */
  1465			if (sysctl_overcommit_memory != OVERCOMMIT_NEVER)
  1466				vm_flags |= VM_NORESERVE;
  1467	
  1468			/* hugetlb applies strict overcommit unless MAP_NORESERVE */
  1469			if (file && is_file_hugepages(file))
  1470				vm_flags |= VM_NORESERVE;
  1471		}
  1472	
  1473		if ((flags & MAP_VALIDATE) == MAP_VALIDATE)
  1474			flags &= MAP_SUPPORTED_MASK;
  1475		else
  1476			flags = 0;
  1477	
  1478		addr = mmap_region(file, addr, len, vm_flags, pgoff, uf, flags);
  1479		if (!IS_ERR_VALUE(addr) &&
  1480		    ((vm_flags & VM_LOCKED) ||
  1481		     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
  1482			*populate = len;
  1483		return addr;
  1484	}
  1485	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 50926 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges
       [not found]   ` <150277754211.23945.458876600578531019.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
  2017-08-15 12:42     ` Jan Kara
@ 2017-08-17  8:49     ` kbuild test robot
  1 sibling, 0 replies; 22+ messages in thread
From: kbuild test robot @ 2017-08-17  8:49 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw,
	linux-api-u79uwXL29TY76Z2rM5mHXA,
	darrick.wong-QHcLZuEGTsvQT0dZR+AlfA, Dave Chinner,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, kbuild-all-JC7UmRfGjtg,
	luto-DgEjT+Ai2ygdnm+yROfE0A,
	linux-fsdevel-u79uwXL29TY76Z2rM5mHXA, Christoph Hellwig,
	Alexander Viro

Hi Dan,

[auto build test ERROR on linus/master]
[also build test ERROR on v4.13-rc5 next-20170816]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Dan-Williams/fs-xfs-introduce-S_IOMAP_SEALED/20170817-114711
config: xtensa-allmodconfig (attached as .config)
compiler: xtensa-linux-gcc (GCC) 4.9.0
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=xtensa 

All error/warnings (new ones prefixed by >>):

   mm/mmap.c: In function 'do_mmap':
   mm/mmap.c:1391:8: error: 'MAP_VALIDATE' undeclared (first use in this function)
      case MAP_VALIDATE:
           ^
   mm/mmap.c:1391:8: note: each undeclared identifier is reported only once for each function it appears in
   In file included from mm/mmap.c:17:0:
>> include/linux/mman.h:11:29: error: 'MAP_DIRECT' undeclared (first use in this function)
    #define MAP_SUPPORTED_MASK (MAP_DIRECT)
                                ^
>> mm/mmap.c:1392:18: note: in expansion of macro 'MAP_SUPPORTED_MASK'
       if (flags & ~(MAP_SUPPORTED_MASK | MAP_VALIDATE))
                     ^

vim +/MAP_DIRECT +11 include/linux/mman.h

     9	
    10	/* the MAP_VALIDATE set of supported flags */
  > 11	#define	MAP_SUPPORTED_MASK (MAP_DIRECT)
    12	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
  2017-08-15 12:27     ` Jan Kara
       [not found]       ` <20170815122701.GF27505-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
@ 2017-09-17  3:44       ` Dan Williams
       [not found]         ` <CAA9_cmc0vejxCsc1NWp5b4C0CSsO5xetF3t6LCoCuEYB6yPiwQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2017-09-18  9:26         ` Jan Kara
  1 sibling, 2 replies; 22+ messages in thread
From: Dan Williams @ 2017-09-17  3:44 UTC (permalink / raw)
  To: Jan Kara
  Cc: Darrick J. Wong, Arnd Bergmann, linux-nvdimm, Linux API,
	Linux Kernel Mailing List, linux-xfs, linux-mm, Andy Lutomirski,
	linux-fsdevel, Andrew Morton, Christoph Hellwig

On Tue, Aug 15, 2017 at 5:27 AM, Jan Kara <jack@suse.cz> wrote:
> On Mon 14-08-17 23:12:16, Dan Williams wrote:
>> The mmap syscall suffers from the ABI anti-pattern of not validating
>> unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
>> mechanism to define new behavior that is known to fail on older kernels
>> without the feature. Use the fact that specifying MAP_SHARED and
>> MAP_PRIVATE at the same time is invalid as a cute hack to allow a new
>> set of validated flags to be introduced.
>>
>> This also introduces the ->fmmap() file operation that is ->mmap() plus
>> flags. Each ->fmmap() implementation must fail requests when a locally
>> unsupported flag is specified.
> ...
>> diff --git a/include/linux/fs.h b/include/linux/fs.h
>> index 1104e5df39ef..bbe755d0caee 100644
>> --- a/include/linux/fs.h
>> +++ b/include/linux/fs.h
>> @@ -1674,6 +1674,7 @@ struct file_operations {
>>       long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
>>       long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
>>       int (*mmap) (struct file *, struct vm_area_struct *);
>> +     int (*fmmap) (struct file *, struct vm_area_struct *, unsigned long);
>>       int (*open) (struct inode *, struct file *);
>>       int (*flush) (struct file *, fl_owner_t id);
>>       int (*release) (struct inode *, struct file *);
>> @@ -1748,6 +1749,12 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
>>       return file->f_op->mmap(file, vma);
>>  }
>>
>> +static inline int call_fmmap(struct file *file, struct vm_area_struct *vma,
>> +             unsigned long flags)
>> +{
>> +     return file->f_op->fmmap(file, vma, flags);
>> +}
>> +
>
> Hum, I dislike a new file op for this when the only problem with ->mmap is
> that it misses 'flags' argument. I understand there are lots of ->mmap
> implementations out there and modifying prototype of them all is painful
> but is it so bad? Coccinelle patch for this should be rather easy...

So it wasn't all that easy, and Linus declined to take it. I think we
should add a new ->mmap_validate() file operation and save the
tree-wide cleanup until later.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
       [not found]         ` <CAA9_cmc0vejxCsc1NWp5b4C0CSsO5xetF3t6LCoCuEYB6yPiwQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-09-17 17:39           ` Christoph Hellwig
       [not found]             ` <20170917173945.GA22200-jcswGhMUV9g@public.gmane.org>
  0 siblings, 1 reply; 22+ messages in thread
From: Christoph Hellwig @ 2017-09-17 17:39 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Darrick J. Wong, Arnd Bergmann,
	linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw, Linux API,
	Linux Kernel Mailing List, linux-xfs-u79uwXL29TY76Z2rM5mHXA,
	linux-mm, Andy Lutomirski, linux-fsdevel, Andrew Morton,
	Christoph Hellwig

On Sat, Sep 16, 2017 at 08:44:14PM -0700, Dan Williams wrote:
> So it wasn't all that easy, and Linus declined to take it. I think we
> should add a new ->mmap_validate() file operation and save the
> tree-wide cleanup until later.

Note that we already have a mmap_capabilities callout for nommu,
I wonder if we could generalize that.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
  2017-09-17  3:44       ` Dan Williams
       [not found]         ` <CAA9_cmc0vejxCsc1NWp5b4C0CSsO5xetF3t6LCoCuEYB6yPiwQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2017-09-18  9:26         ` Jan Kara
  1 sibling, 0 replies; 22+ messages in thread
From: Jan Kara @ 2017-09-18  9:26 UTC (permalink / raw)
  To: Dan Williams
  Cc: Jan Kara, Darrick J. Wong, Arnd Bergmann, linux-nvdimm,
	Linux API, Linux Kernel Mailing List, linux-xfs, linux-mm,
	Andy Lutomirski, linux-fsdevel, Andrew Morton, Christoph Hellwig

On Sat 16-09-17 20:44:14, Dan Williams wrote:
> On Tue, Aug 15, 2017 at 5:27 AM, Jan Kara <jack@suse.cz> wrote:
> > On Mon 14-08-17 23:12:16, Dan Williams wrote:
> >> The mmap syscall suffers from the ABI anti-pattern of not validating
> >> unknown flags. However, proposals like MAP_SYNC and MAP_DIRECT need a
> >> mechanism to define new behavior that is known to fail on older kernels
> >> without the feature. Use the fact that specifying MAP_SHARED and
> >> MAP_PRIVATE at the same time is invalid as a cute hack to allow a new
> >> set of validated flags to be introduced.
> >>
> >> This also introduces the ->fmmap() file operation that is ->mmap() plus
> >> flags. Each ->fmmap() implementation must fail requests when a locally
> >> unsupported flag is specified.
> > ...
> >> diff --git a/include/linux/fs.h b/include/linux/fs.h
> >> index 1104e5df39ef..bbe755d0caee 100644
> >> --- a/include/linux/fs.h
> >> +++ b/include/linux/fs.h
> >> @@ -1674,6 +1674,7 @@ struct file_operations {
> >>       long (*unlocked_ioctl) (struct file *, unsigned int, unsigned long);
> >>       long (*compat_ioctl) (struct file *, unsigned int, unsigned long);
> >>       int (*mmap) (struct file *, struct vm_area_struct *);
> >> +     int (*fmmap) (struct file *, struct vm_area_struct *, unsigned long);
> >>       int (*open) (struct inode *, struct file *);
> >>       int (*flush) (struct file *, fl_owner_t id);
> >>       int (*release) (struct inode *, struct file *);
> >> @@ -1748,6 +1749,12 @@ static inline int call_mmap(struct file *file, struct vm_area_struct *vma)
> >>       return file->f_op->mmap(file, vma);
> >>  }
> >>
> >> +static inline int call_fmmap(struct file *file, struct vm_area_struct *vma,
> >> +             unsigned long flags)
> >> +{
> >> +     return file->f_op->fmmap(file, vma, flags);
> >> +}
> >> +
> >
> > Hum, I dislike a new file op for this when the only problem with ->mmap is
> > that it misses 'flags' argument. I understand there are lots of ->mmap
> > implementations out there and modifying prototype of them all is painful
> > but is it so bad? Coccinelle patch for this should be rather easy...
> 
> So it wasn't all that easy, and Linus declined to take it. I think we
> should add a new ->mmap_validate() file operation and save the
> tree-wide cleanup until later.

Well, we don't even strictly need the flags passed to ->mmap callback if we
are willing to use VMA flags. I want to use it for MAP_SYNC anyway... So
bumping vma->flags to u64 and using a flag is also an option (and frankly
I'd personally just go for that).

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
       [not found]             ` <20170917173945.GA22200-jcswGhMUV9g@public.gmane.org>
@ 2017-09-18  9:31               ` Jan Kara
  2017-09-18 15:47                 ` Dan Williams
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Kara @ 2017-09-18  9:31 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jan Kara, Arnd Bergmann, linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw,
	Linux API, Darrick J. Wong, Linux Kernel Mailing List,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA, linux-mm, Andy Lutomirski,
	linux-fsdevel, Andrew Morton

On Sun 17-09-17 19:39:45, Christoph Hellwig wrote:
> On Sat, Sep 16, 2017 at 08:44:14PM -0700, Dan Williams wrote:
> > So it wasn't all that easy, and Linus declined to take it. I think we
> > should add a new ->mmap_validate() file operation and save the
> > tree-wide cleanup until later.
> 
> Note that we already have a mmap_capabilities callout for nommu,
> I wonder if we could generalize that.

So if I understood Dan right, Linus refused to merge the patch which adds
'flags' argument to ->mmap callback. That is actually logically independent
change from validating flags passed to mmap(2) syscall. Dan did it just to
save himself from adding a VMA flag for MAP_DIRECT.

For validating flags passed to mmap(2), I agree we could use
->mmap_capabilities() instead of mmap_supported_mask Dan has added. But I
don't have a strong opinion there.

								Honza
-- 
Jan Kara <jack-IBi9RG/b67k@public.gmane.org>
SUSE Labs, CR

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags
  2017-09-18  9:31               ` Jan Kara
@ 2017-09-18 15:47                 ` Dan Williams
  0 siblings, 0 replies; 22+ messages in thread
From: Dan Williams @ 2017-09-18 15:47 UTC (permalink / raw)
  To: Jan Kara
  Cc: Christoph Hellwig, Darrick J. Wong, Arnd Bergmann, linux-nvdimm,
	Linux API, Linux Kernel Mailing List, linux-xfs, linux-mm,
	Andy Lutomirski, linux-fsdevel, Andrew Morton, david

On Mon, Sep 18, 2017 at 2:31 AM, Jan Kara <jack@suse.cz> wrote:
> On Sun 17-09-17 19:39:45, Christoph Hellwig wrote:
>> On Sat, Sep 16, 2017 at 08:44:14PM -0700, Dan Williams wrote:
>> > So it wasn't all that easy, and Linus declined to take it. I think we
>> > should add a new ->mmap_validate() file operation and save the
>> > tree-wide cleanup until later.
>>
>> Note that we already have a mmap_capabilities callout for nommu,
>> I wonder if we could generalize that.
>
> So if I understood Dan right, Linus refused to merge the patch which adds
> 'flags' argument to ->mmap callback. That is actually logically independent
> change from validating flags passed to mmap(2) syscall. Dan did it just to
> save himself from adding a VMA flag for MAP_DIRECT.
>
> For validating flags passed to mmap(2), I agree we could use
> ->mmap_capabilities() instead of mmap_supported_mask Dan has added. But I
> don't have a strong opinion there.

The drawback I see with mmap_capabilities is that it requires all mmap
flags to have a corresponding vm_flag. After the cold reaction the
VM_DAX flag received I'd want to be sure they were on board with this
direction.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2017-09-18 15:47 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-08-15  6:12 [PATCH v4 0/3] MAP_DIRECT and block-map sealed files Dan Williams
2017-08-15  6:12 ` [PATCH v4 1/3] fs, xfs: introduce S_IOMAP_SEALED Dan Williams
     [not found] ` <150277752553.23945.13932394738552748440.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-08-15  6:12   ` [PATCH v4 2/3] mm: introduce MAP_VALIDATE a mechanism for adding new mmap flags Dan Williams
2017-08-15 12:27     ` Jan Kara
     [not found]       ` <20170815122701.GF27505-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2017-08-15 16:24         ` Dan Williams
2017-09-17  3:44       ` Dan Williams
     [not found]         ` <CAA9_cmc0vejxCsc1NWp5b4C0CSsO5xetF3t6LCoCuEYB6yPiwQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-09-17 17:39           ` Christoph Hellwig
     [not found]             ` <20170917173945.GA22200-jcswGhMUV9g@public.gmane.org>
2017-09-18  9:31               ` Jan Kara
2017-09-18 15:47                 ` Dan Williams
2017-09-18  9:26         ` Jan Kara
2017-08-15 16:28     ` Andy Lutomirski
2017-08-15 22:31       ` Dan Williams
2017-08-17  8:06     ` kbuild test robot
2017-08-15  6:12 ` [PATCH v4 3/3] fs, xfs: introduce MAP_DIRECT for creating block-map-sealed file ranges Dan Williams
2017-08-15  9:18   ` Kirill A. Shutemov
2017-08-15 17:11     ` Dan Williams
2017-08-16 10:25       ` Kirill A. Shutemov
     [not found]   ` <150277754211.23945.458876600578531019.stgit-p8uTFz9XbKj2zm6wflaqv1nYeNYlB/vhral2JQCrhuEAvxtiuMwx3w@public.gmane.org>
2017-08-15 12:42     ` Jan Kara
     [not found]       ` <20170815124250.GG27505-4I4JzKEfoa/jFM9bn6wA6Q@public.gmane.org>
2017-08-15 16:29         ` Dan Williams
     [not found]           ` <CAPcyv4h01os0Gc6bYmaGdMXt5q4G4zfirNRPWG3=gQi5POrpmg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
2017-08-16  1:15             ` Dan Williams
2017-08-17  8:49     ` kbuild test robot
2017-08-15  9:01 ` [PATCH v4 0/3] MAP_DIRECT and block-map sealed files Dave Chinner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).