All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/7][RFC] Introduce fallocate query support mode
@ 2017-09-18 15:52 Lukas Czerner
  2017-09-18 15:52 ` [PATCH 1/7] vfs: " Lukas Czerner
                   ` (6 more replies)
  0 siblings, 7 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: linux-fsdevel

Hello,

this is a proposal, followed by implementation, of new fallocate flag I decided
to call "query support" (FALLOC_FL_QUERY_SUPPORT).

Currently there is not way to tell which subset of fallocate operations are
supported by any given file system, so users have to rely on "just knowing"
or trying if it works or not.

This patchset introduces the new FALLOC_FL_QUERY_SUPPORT fallocate mode
implements it in couple of file systems and takes advantage of this new flag in
loop driver where we can finally test specifically for punch hole (and zero
range) rather than just relying on mere existence of fallocate callback.

When FALLOC_FL_QUERY_SUPPORT mode is specified with offset and length set to
zero it returns bitmask with all fallocate mode flags supported for a given
file. Because of this I also had to introduce additional flag for the original
fallocate mode (preallocation: mode = 0).

I expect that in many file systems the implementation will be straightforward
and once we can agree on this I plan to implement this for other file systems
with fallocate support.

In ext4 (and this may be true for other file systems as well) supported modes
can differ from file to file depending on the features (like extents and
encryption). Because of that I decided to make it return all the modes supported
by the particular file, not file system. This might complicate implementation
for some file systems, however I think this will be beneficial.

Thoughts ?

Thanks!
-Lukas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH 1/7] vfs: Introduce fallocate query support mode
  2017-09-18 15:52 [PATCH 0/7][RFC] Introduce fallocate query support mode Lukas Czerner
@ 2017-09-18 15:52 ` Lukas Czerner
  2017-09-18 15:52 ` [PATCH 2/7] ext4: Implement " Lukas Czerner
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Lukas Czerner

Filesystems are free to implement any subset of fallocate modes and
there is currently no way of telling what modes are actually supported
by the file system other than trying them all.

Change this by introducing new fallocate mode FALLOC_FL_QUERY_SUPPORT
that is supposed to return all supported modes.

FALLOC_FL_QUERY_SUPPORT mode can be only used exclusively with offset
and length set to zero.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/open.c                   | 18 +++++++++++++++++-
 include/linux/falloc.h      |  4 +++-
 include/uapi/linux/falloc.h | 13 +++++++++++++
 3 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 7ea1184..15c3fce 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -241,13 +241,22 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	struct inode *inode = file_inode(file);
 	long ret;
 
-	if (offset < 0 || len <= 0)
+	if ((offset < 0 || len <= 0) && !(mode & FALLOC_FL_QUERY_SUPPORT))
 		return -EINVAL;
 
 	/* Return error if mode is not supported */
 	if (mode & ~FALLOC_FL_SUPPORTED_MASK)
 		return -EOPNOTSUPP;
 
+	/* offset and length are not used in query support */
+	if ((mode & FALLOC_FL_QUERY_SUPPORT) && (offset != 0 || len != 0))
+		return -EINVAL;
+
+	/* Query support should only be used exclusively */
+	if ((mode & FALLOC_FL_QUERY_SUPPORT) &&
+	    (mode & ~FALLOC_FL_QUERY_SUPPORT))
+		return -EINVAL;
+
 	/* Punch hole and zero range are mutually exclusive */
 	if ((mode & (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_ZERO_RANGE)) ==
 	    (FALLOC_FL_PUNCH_HOLE | FALLOC_FL_ZERO_RANGE))
@@ -328,6 +337,13 @@ int vfs_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	if (ret == 0)
 		fsnotify_modify(file);
 
+	/*
+	 * Let's not allow file systems return any random data, just fallocate
+	 * modes.
+	 */
+	if ((ret > 0) && (mode & FALLOC_FL_QUERY_SUPPORT))
+		ret &= FALLOC_FL_SUPPORTED_MASK;
+
 	file_end_write(file);
 	return ret;
 }
diff --git a/include/linux/falloc.h b/include/linux/falloc.h
index 7494dc6..1558bce 100644
--- a/include/linux/falloc.h
+++ b/include/linux/falloc.h
@@ -26,6 +26,8 @@ struct space_resv {
 					 FALLOC_FL_COLLAPSE_RANGE |	\
 					 FALLOC_FL_ZERO_RANGE |		\
 					 FALLOC_FL_INSERT_RANGE |	\
-					 FALLOC_FL_UNSHARE_RANGE)
+					 FALLOC_FL_QUERY_SUPPORT |	\
+					 FALLOC_FL_UNSHARE_RANGE |	\
+					 FALLOC_FL_PREALLOC_RANGE)
 
 #endif /* _FALLOC_H_ */
diff --git a/include/uapi/linux/falloc.h b/include/uapi/linux/falloc.h
index b075f60..9959355 100644
--- a/include/uapi/linux/falloc.h
+++ b/include/uapi/linux/falloc.h
@@ -76,4 +76,17 @@
  */
 #define FALLOC_FL_UNSHARE_RANGE		0x40
 
+/*
+ * FALLOC_FL_QUERY_SUPPORT is used to query file system, or block device
+ * for a supported fallocate flags.
+ */
+#define FALLOC_FL_QUERY_SUPPORT		0x80
+
+/*
+ * FALLOC_FL_PREALLOC_RANGE is a placeholder for a default preallocation
+ * mode. It has the same effect as not specifying any flag at all. We need
+ * this to report back support for this particular mode,
+ */
+#define FALLOC_FL_PREALLOC_RANGE	0x100
+
 #endif /* _UAPI_FALLOC_H_ */
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 2/7] ext4: Implement fallocate query support mode
  2017-09-18 15:52 [PATCH 0/7][RFC] Introduce fallocate query support mode Lukas Czerner
  2017-09-18 15:52 ` [PATCH 1/7] vfs: " Lukas Czerner
@ 2017-09-18 15:52 ` Lukas Czerner
  2017-09-18 15:52 ` [PATCH 3/7] ext4: Remove unnecessary S_ISREG checks in fallocate operations Lukas Czerner
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Lukas Czerner, linux-ext4

Return all fallocate modes supported by ext4 file system. Ext4 does have
a lot of exceptions for inodes with various features enabled so take
that into account as well.

Cc: linux-ext4@vger.kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/ext4/ext4.h    | 11 +++++++++++
 fs/ext4/extents.c | 42 ++++++++++++++++++------------------------
 2 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index e2abe01..6546c2c 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -558,6 +558,17 @@ enum {
 };
 
 /*
+ * Supported fallocate modes
+ */
+#define EXT4_FALLOC_SUPPORTED	(FALLOC_FL_KEEP_SIZE |		\
+				 FALLOC_FL_PUNCH_HOLE |		\
+				 FALLOC_FL_COLLAPSE_RANGE |	\
+				 FALLOC_FL_ZERO_RANGE |		\
+				 FALLOC_FL_INSERT_RANGE |	\
+				 FALLOC_FL_QUERY_SUPPORT |	\
+				 FALLOC_FL_PREALLOC_RANGE)
+
+/*
  * Flags used by ext4_map_blocks()
  */
 	/* Allocate any needed blocks and/or convert an unwritten
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 97f0fd0..91071b3 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4905,7 +4905,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	loff_t new_size = 0;
 	unsigned int max_blocks;
 	int ret = 0;
-	int flags;
+	int flags = 0;
 	ext4_lblk_t lblk;
 	unsigned int blkbits = inode->i_blkbits;
 
@@ -4919,17 +4919,27 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	 * leave it disabled for encrypted inodes for now.  This is a
 	 * bug we should fix....
 	 */
-	if (ext4_encrypted_inode(inode) &&
-	    (mode & (FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_INSERT_RANGE |
-		     FALLOC_FL_ZERO_RANGE)))
-		return -EOPNOTSUPP;
+	if (ext4_encrypted_inode(inode)) {
+		/* Modes not supported on encrypted indoes */
+		flags |= (FALLOC_FL_COLLAPSE_RANGE |
+			  FALLOC_FL_INSERT_RANGE |
+			  FALLOC_FL_ZERO_RANGE);
+	}
+	if (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) {
+		/* Modes not supported on non-extent inodes */
+		flags |= (FALLOC_FL_COLLAPSE_RANGE |
+			  FALLOC_FL_INSERT_RANGE |
+			  FALLOC_FL_ZERO_RANGE |
+			  FALLOC_FL_PREALLOC_RANGE);
+	}
 
 	/* Return error if mode is not supported */
-	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
-		     FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |
-		     FALLOC_FL_INSERT_RANGE))
+	if (mode & ~(EXT4_FALLOC_SUPPORTED & ~flags))
 		return -EOPNOTSUPP;
 
+	if (mode & FALLOC_FL_QUERY_SUPPORT)
+		return EXT4_FALLOC_SUPPORTED & ~flags;
+
 	if (mode & FALLOC_FL_PUNCH_HOLE)
 		return ext4_punch_hole(inode, offset, len);
 
@@ -5449,14 +5459,6 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
 	loff_t new_size, ioffset;
 	int ret;
 
-	/*
-	 * We need to test this early because xfstests assumes that a
-	 * collapse range of (0, 1) will return EOPNOTSUPP if the file
-	 * system does not support collapse range.
-	 */
-	if (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
-		return -EOPNOTSUPP;
-
 	/* Collapse range works only on fs block size aligned offsets. */
 	if (offset & (EXT4_CLUSTER_SIZE(sb) - 1) ||
 	    len & (EXT4_CLUSTER_SIZE(sb) - 1))
@@ -5596,14 +5598,6 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len)
 	int ret = 0, depth, split_flag = 0;
 	loff_t ioffset;
 
-	/*
-	 * We need to test this early because xfstests assumes that an
-	 * insert range of (0, 1) will return EOPNOTSUPP if the file
-	 * system does not support insert range.
-	 */
-	if (!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
-		return -EOPNOTSUPP;
-
 	/* Insert range works only on fs block size aligned offsets. */
 	if (offset & (EXT4_CLUSTER_SIZE(sb) - 1) ||
 			len & (EXT4_CLUSTER_SIZE(sb) - 1))
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 3/7] ext4: Remove unnecessary S_ISREG checks in fallocate operations
  2017-09-18 15:52 [PATCH 0/7][RFC] Introduce fallocate query support mode Lukas Czerner
  2017-09-18 15:52 ` [PATCH 1/7] vfs: " Lukas Czerner
  2017-09-18 15:52 ` [PATCH 2/7] ext4: Implement " Lukas Czerner
@ 2017-09-18 15:52 ` Lukas Czerner
  2017-09-18 15:52 ` [PATCH 4/7] xfs: Implement fallocate query support mode Lukas Czerner
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Lukas Czerner, linux-ext4

vfs_fallocate() already has this check so there is not need to check for
this again in ext4.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
Cc: linux-ext4@vger.kernel.org
---
 fs/ext4/extents.c | 9 ---------
 fs/ext4/inode.c   | 3 ---
 2 files changed, 12 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 91071b3..ce46d65 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4752,9 +4752,6 @@ static long ext4_zero_range(struct file *file, loff_t offset,
 
 	trace_ext4_zero_range(inode, offset, len, mode);
 
-	if (!S_ISREG(inode->i_mode))
-		return -EINVAL;
-
 	/* Call ext4_force_commit to flush all data in case of data=journal. */
 	if (ext4_should_journal_data(inode)) {
 		ret = ext4_force_commit(inode->i_sb);
@@ -5464,9 +5461,6 @@ int ext4_collapse_range(struct inode *inode, loff_t offset, loff_t len)
 	    len & (EXT4_CLUSTER_SIZE(sb) - 1))
 		return -EINVAL;
 
-	if (!S_ISREG(inode->i_mode))
-		return -EINVAL;
-
 	trace_ext4_collapse_range(inode, offset, len);
 
 	punch_start = offset >> EXT4_BLOCK_SIZE_BITS(sb);
@@ -5603,9 +5597,6 @@ int ext4_insert_range(struct inode *inode, loff_t offset, loff_t len)
 			len & (EXT4_CLUSTER_SIZE(sb) - 1))
 		return -EINVAL;
 
-	if (!S_ISREG(inode->i_mode))
-		return -EOPNOTSUPP;
-
 	trace_ext4_insert_range(inode, offset, len);
 
 	offset_lblk = offset >> EXT4_BLOCK_SIZE_BITS(sb);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 31db875..adda865 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4144,9 +4144,6 @@ int ext4_punch_hole(struct inode *inode, loff_t offset, loff_t length)
 	unsigned int credits;
 	int ret = 0;
 
-	if (!S_ISREG(inode->i_mode))
-		return -EOPNOTSUPP;
-
 	trace_ext4_punch_hole(inode, offset, length, 0);
 
 	/*
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-18 15:52 [PATCH 0/7][RFC] Introduce fallocate query support mode Lukas Czerner
                   ` (2 preceding siblings ...)
  2017-09-18 15:52 ` [PATCH 3/7] ext4: Remove unnecessary S_ISREG checks in fallocate operations Lukas Czerner
@ 2017-09-18 15:52 ` Lukas Czerner
  2017-09-18 17:56   ` Christoph Hellwig
  2017-09-18 20:48   ` Darrick J. Wong
  2017-09-18 15:52 ` [PATCH 5/7] fat: " Lukas Czerner
                   ` (2 subsequent siblings)
  6 siblings, 2 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Lukas Czerner, linux-xfs

Return all fallcoate modes supported by xfs file system.

Cc: linux-xfs@vger.kernel.org
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/xfs/xfs_file.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index ebdd0bd..85e06c6 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -746,7 +746,8 @@ xfs_file_write_iter(
 #define	XFS_FALLOC_FL_SUPPORTED						\
 		(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |		\
 		 FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |	\
-		 FALLOC_FL_INSERT_RANGE | FALLOC_FL_UNSHARE_RANGE)
+		 FALLOC_FL_INSERT_RANGE | FALLOC_FL_UNSHARE_RANGE |	\
+		 FALLOC_FL_QUERY_SUPPORT | FALLOC_FL_PREALLOC_RANGE)
 
 STATIC long
 xfs_file_fallocate(
@@ -768,6 +769,9 @@ xfs_file_fallocate(
 	if (mode & ~XFS_FALLOC_FL_SUPPORTED)
 		return -EOPNOTSUPP;
 
+	if (mode & FALLOC_FL_QUERY_SUPPORT)
+		return XFS_FALLOC_FL_SUPPORTED;
+
 	xfs_ilock(ip, iolock);
 	error = xfs_break_layouts(inode, &iolock);
 	if (error)
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 5/7] fat: Implement fallocate query support mode
  2017-09-18 15:52 [PATCH 0/7][RFC] Introduce fallocate query support mode Lukas Czerner
                   ` (3 preceding siblings ...)
  2017-09-18 15:52 ` [PATCH 4/7] xfs: Implement fallocate query support mode Lukas Czerner
@ 2017-09-18 15:52 ` Lukas Czerner
  2017-09-18 15:52   ` [Cluster-devel] " Lukas Czerner
  2017-09-18 15:52 ` [PATCH 7/7] loop: Check for puch hole and zero range support specifically Lukas Czerner
  6 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Lukas Czerner, OGAWA Hirofumi

Return all fallcoate modes supported by fat file system.

Remove unnecessary check for regular file since it's already done in
vfs_fallocate()

Cc: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/fat/fat.h  | 7 +++++++
 fs/fat/file.c | 7 +++----
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 051dac1..2ff49ee0 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -57,6 +57,13 @@ struct fat_mount_options {
 #define FAT_HASH_SIZE	(1UL << FAT_HASH_BITS)
 
 /*
+ * Supported fallocate modes
+ */
+#define FAT_FALLOC_SUPPORTED	(FALLOC_FL_KEEP_SIZE |		\
+				 FALLOC_FL_QUERY_SUPPORT |	\
+				 FALLOC_FL_PREALLOC_RANGE)
+
+/*
  * MS-DOS file system in-core superblock data
  */
 struct msdos_sb_info {
diff --git a/fs/fat/file.c b/fs/fat/file.c
index 4724cc9..119915e 100644
--- a/fs/fat/file.c
+++ b/fs/fat/file.c
@@ -239,12 +239,11 @@ static long fat_fallocate(struct file *file, int mode,
 	int err = 0;
 
 	/* No support for hole punch or other fallocate flags. */
-	if (mode & ~FALLOC_FL_KEEP_SIZE)
+	if (mode & ~FAT_FALLOC_SUPPORTED)
 		return -EOPNOTSUPP;
 
-	/* No support for dir */
-	if (!S_ISREG(inode->i_mode))
-		return -EOPNOTSUPP;
+	if (mode & FALLOC_FL_QUERY_SUPPORT)
+		return FAT_FALLOC_SUPPORTED;
 
 	inode_lock(inode);
 	if (mode & FALLOC_FL_KEEP_SIZE) {
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 6/7] gfs2: Implement fallocate query support mode
  2017-09-18 15:52 [PATCH 0/7][RFC] Introduce fallocate query support mode Lukas Czerner
@ 2017-09-18 15:52   ` Lukas Czerner
  2017-09-18 15:52 ` [PATCH 2/7] ext4: Implement " Lukas Czerner
                     ` (5 subsequent siblings)
  6 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Lukas Czerner, cluster-devel

Return all fallcoate modes supported by gfs2 file system.

Cc: cluster-devel@redhat.com
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/gfs2/file.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 33a0cb5..12e1de5 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -910,6 +910,14 @@ static long __gfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t
 	return error;
 }
 
+/*
+ * Supported fallocate modes
+ */
+#define GFS2_FALLOC_SUPPORTED   (FALLOC_FL_KEEP_SIZE |		\
+				 FALLOC_FL_QUERY_SUPPORT |	\
+				 FALLOC_FL_PREALLOC_RANGE)
+
+
 static long gfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 {
 	struct inode *inode = file_inode(file);
@@ -918,11 +926,18 @@ static long gfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t le
 	struct gfs2_holder gh;
 	int ret;
 
-	if (mode & ~FALLOC_FL_KEEP_SIZE)
+	if (mode & ~GFS2_FALLOC_SUPPORTED)
 		return -EOPNOTSUPP;
+
 	/* fallocate is needed by gfs2_grow to reserve space in the rindex */
-	if (gfs2_is_jdata(ip) && inode != sdp->sd_rindex)
+	if (gfs2_is_jdata(ip) && inode != sdp->sd_rindex) {
+		if (mode & FALLOC_FL_QUERY_SUPPORT)
+			return 0;
 		return -EOPNOTSUPP;
+	}
+
+	if (mode & FALLOC_FL_QUERY_SUPPORT)
+		return GFS2_FALLOC_SUPPORTED;
 
 	inode_lock(inode);
 
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [Cluster-devel] [PATCH 6/7] gfs2: Implement fallocate query support mode
@ 2017-09-18 15:52   ` Lukas Czerner
  0 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: cluster-devel.redhat.com

Return all fallcoate modes supported by gfs2 file system.

Cc: cluster-devel at redhat.com
Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 fs/gfs2/file.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index 33a0cb5..12e1de5 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -910,6 +910,14 @@ static long __gfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t
 	return error;
 }
 
+/*
+ * Supported fallocate modes
+ */
+#define GFS2_FALLOC_SUPPORTED   (FALLOC_FL_KEEP_SIZE |		\
+				 FALLOC_FL_QUERY_SUPPORT |	\
+				 FALLOC_FL_PREALLOC_RANGE)
+
+
 static long gfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 {
 	struct inode *inode = file_inode(file);
@@ -918,11 +926,18 @@ static long gfs2_fallocate(struct file *file, int mode, loff_t offset, loff_t le
 	struct gfs2_holder gh;
 	int ret;
 
-	if (mode & ~FALLOC_FL_KEEP_SIZE)
+	if (mode & ~GFS2_FALLOC_SUPPORTED)
 		return -EOPNOTSUPP;
+
 	/* fallocate is needed by gfs2_grow to reserve space in the rindex */
-	if (gfs2_is_jdata(ip) && inode != sdp->sd_rindex)
+	if (gfs2_is_jdata(ip) && inode != sdp->sd_rindex) {
+		if (mode & FALLOC_FL_QUERY_SUPPORT)
+			return 0;
 		return -EOPNOTSUPP;
+	}
+
+	if (mode & FALLOC_FL_QUERY_SUPPORT)
+		return GFS2_FALLOC_SUPPORTED;
 
 	inode_lock(inode);
 
-- 
2.7.5



^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH 7/7] loop: Check for puch hole and zero range support specifically
  2017-09-18 15:52 [PATCH 0/7][RFC] Introduce fallocate query support mode Lukas Czerner
                   ` (5 preceding siblings ...)
  2017-09-18 15:52   ` [Cluster-devel] " Lukas Czerner
@ 2017-09-18 15:52 ` Lukas Czerner
  6 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-18 15:52 UTC (permalink / raw)
  To: linux-fsdevel; +Cc: Lukas Czerner

Currently, when deciding whether the dirver can support discard, or
write_zeroes we only check whether filesystem registered fallocate
callback. However this is not really enough because file system does
not necessarily need to support punch hole.

With previous addition of query support to fallocate, we can query
whether the file system support punch hole and zero range and set it up
accordingly.

Also, distinguish between discard, write_zeroes and write_zeroes with
nounmap.

Signed-off-by: Lukas Czerner <lczerner@redhat.com>
---
 drivers/block/loop.c | 61 ++++++++++++++++++++++++++++++++++++----------------
 1 file changed, 43 insertions(+), 18 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 85de673..cefabf4 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -414,7 +414,7 @@ static int lo_read_transfer(struct loop_device *lo, struct request *rq,
 	return ret;
 }
 
-static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
+static int lo_falloc(struct loop_device *lo, struct request *rq, loff_t pos)
 {
 	/*
 	 * We use punch hole to reclaim the free space used by the
@@ -423,14 +423,28 @@ static int lo_discard(struct loop_device *lo, struct request *rq, loff_t pos)
 	 * useful information.
 	 */
 	struct file *file = lo->lo_backing_file;
-	int mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
-	int ret;
+	int mode, ret;
 
 	if ((!file->f_op->fallocate) || lo->lo_encrypt_key_size) {
 		ret = -EOPNOTSUPP;
 		goto out;
 	}
 
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE_ZEROES:
+		if (rq->cmd_flags & REQ_NOUNMAP)
+			mode = FALLOC_FL_ZERO_RANGE | FALLOC_FL_KEEP_SIZE;
+		else
+			mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
+		break;
+	case REQ_OP_DISCARD:
+		mode = FALLOC_FL_PUNCH_HOLE | FALLOC_FL_KEEP_SIZE;
+		break;
+	default:
+		ret = -EOPNOTSUPP;
+		goto out;
+	}
+
 	ret = file->f_op->fallocate(file, mode, pos, blk_rq_bytes(rq));
 	if (unlikely(ret && ret != -EINVAL && ret != -EOPNOTSUPP))
 		ret = -EIO;
@@ -567,7 +581,7 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
 		return lo_req_flush(lo, rq);
 	case REQ_OP_DISCARD:
 	case REQ_OP_WRITE_ZEROES:
-		return lo_discard(lo, rq, pos);
+		return lo_falloc(lo, rq, pos);
 	case REQ_OP_WRITE:
 		if (lo->transfer)
 			return lo_write_transfer(lo, rq, pos);
@@ -794,11 +808,18 @@ static void loop_sysfs_exit(struct loop_device *lo)
 			   &loop_attribute_group);
 }
 
-static void loop_config_discard(struct loop_device *lo)
+static void loop_config_falloc(struct loop_device *lo)
 {
 	struct file *file = lo->lo_backing_file;
 	struct inode *inode = file->f_mapping->host;
 	struct request_queue *q = lo->lo_queue;
+	int mode, ret;
+
+	q->limits.discard_granularity = 0;
+	q->limits.discard_alignment = 0;
+	blk_queue_max_discard_sectors(q, 0);
+	blk_queue_max_write_zeroes_sectors(q, 0);
+	queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, q);
 
 	/*
 	 * We use punch hole to reclaim the free space used by the
@@ -807,21 +828,25 @@ static void loop_config_discard(struct loop_device *lo)
 	 * useful information.
 	 */
 	if ((!file->f_op->fallocate) ||
-	    lo->lo_encrypt_key_size) {
-		q->limits.discard_granularity = 0;
-		q->limits.discard_alignment = 0;
-		blk_queue_max_discard_sectors(q, 0);
-		blk_queue_max_write_zeroes_sectors(q, 0);
-		queue_flag_clear_unlocked(QUEUE_FLAG_DISCARD, q);
+	    lo->lo_encrypt_key_size)
 		return;
-	}
 
-	q->limits.discard_granularity = inode->i_sb->s_blocksize;
-	q->limits.discard_alignment = 0;
+	/*
+	 * Now find out whether the backing file supports punch hole
+	 * or at least zero range operations.
+	 */
+	mode = FALLOC_FL_QUERY_SUPPORT;
+	ret = file->f_op->fallocate(file, mode, 0, 0);
+	if (ret & FALLOC_FL_ZERO_RANGE)
+		blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
+	if (ret & FALLOC_FL_PUNCH_HOLE) {
+		q->limits.discard_granularity = inode->i_sb->s_blocksize;
+		q->limits.discard_alignment = 0;
 
-	blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
-	blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
-	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
+		blk_queue_max_discard_sectors(q, UINT_MAX >> 9);
+		blk_queue_max_write_zeroes_sectors(q, UINT_MAX >> 9);
+		queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
+	}
 }
 
 static void loop_unprepare_queue(struct loop_device *lo)
@@ -1118,7 +1143,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 		}
 	}
 
-	loop_config_discard(lo);
+	loop_config_falloc(lo);
 
 	memcpy(lo->lo_file_name, info->lo_file_name, LO_NAME_SIZE);
 	memcpy(lo->lo_crypt_name, info->lo_crypt_name, LO_NAME_SIZE);
-- 
2.7.5

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-18 15:52 ` [PATCH 4/7] xfs: Implement fallocate query support mode Lukas Czerner
@ 2017-09-18 17:56   ` Christoph Hellwig
  2017-09-19  3:28     ` OGAWA Hirofumi
  2017-09-19  8:15     ` Lukas Czerner
  2017-09-18 20:48   ` Darrick J. Wong
  1 sibling, 2 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-18 17:56 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-fsdevel, linux-xfs, linux-api

This seems to miss any explanation and 6 out of 7 patches.

That being said, I really detest adding weird query modes to random
syscalls.  We should expose capabilities through the interface that
was designed for such things: (f)pathconf, as there are a lot more
important parameters we currently fail to expose to userspace to
start with.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-18 15:52 ` [PATCH 4/7] xfs: Implement fallocate query support mode Lukas Czerner
  2017-09-18 17:56   ` Christoph Hellwig
@ 2017-09-18 20:48   ` Darrick J. Wong
  2017-09-18 21:55     ` Andreas Dilger
  2017-09-19  8:20     ` Lukas Czerner
  1 sibling, 2 replies; 28+ messages in thread
From: Darrick J. Wong @ 2017-09-18 20:48 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: linux-fsdevel, linux-xfs

On Mon, Sep 18, 2017 at 05:52:24PM +0200, Lukas Czerner wrote:
> Return all fallcoate modes supported by xfs file system.
> 
> Cc: linux-xfs@vger.kernel.org
> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> ---
>  fs/xfs/xfs_file.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index ebdd0bd..85e06c6 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -746,7 +746,8 @@ xfs_file_write_iter(
>  #define	XFS_FALLOC_FL_SUPPORTED						\
>  		(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |		\
>  		 FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |	\
> -		 FALLOC_FL_INSERT_RANGE | FALLOC_FL_UNSHARE_RANGE)
> +		 FALLOC_FL_INSERT_RANGE | FALLOC_FL_UNSHARE_RANGE |	\
> +		 FALLOC_FL_QUERY_SUPPORT | FALLOC_FL_PREALLOC_RANGE)
>  
>  STATIC long
>  xfs_file_fallocate(
> @@ -768,6 +769,9 @@ xfs_file_fallocate(
>  	if (mode & ~XFS_FALLOC_FL_SUPPORTED)
>  		return -EOPNOTSUPP;
>  
> +	if (mode & FALLOC_FL_QUERY_SUPPORT)
> +		return XFS_FALLOC_FL_SUPPORTED;

Urk, manpage update describing the goals of the query interface and how
this is supposed to work is needed.

Are we required to return only the mode flags that would reasonably be
expected to work on this file, or the fs in general?  Do we return zero
if the file is immutable (I guess the fd has to be O_RDONLY?) or if the
fs is readonly?

And like hch said, why not {f,}pathconf?

--D

> +
>  	xfs_ilock(ip, iolock);
>  	error = xfs_break_layouts(inode, &iolock);
>  	if (error)
> -- 
> 2.7.5
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-18 20:48   ` Darrick J. Wong
@ 2017-09-18 21:55     ` Andreas Dilger
  2017-09-19 14:55       ` Theodore Ts'o
  2017-09-19  8:20     ` Lukas Czerner
  1 sibling, 1 reply; 28+ messages in thread
From: Andreas Dilger @ 2017-09-18 21:55 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Lukas Czerner, linux-fsdevel, linux-xfs

[-- Attachment #1: Type: text/plain, Size: 2525 bytes --]

On Sep 18, 2017, at 2:48 PM, Darrick J. Wong <darrick.wong@oracle.com> wrote:
> 
> On Mon, Sep 18, 2017 at 05:52:24PM +0200, Lukas Czerner wrote:
>> Return all fallcoate modes supported by xfs file system.
>> 
>> Cc: linux-xfs@vger.kernel.org
>> Signed-off-by: Lukas Czerner <lczerner@redhat.com>
>> ---
>> fs/xfs/xfs_file.c | 6 +++++-
>> 1 file changed, 5 insertions(+), 1 deletion(-)
>> 
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index ebdd0bd..85e06c6 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -746,7 +746,8 @@ xfs_file_write_iter(
>> #define	XFS_FALLOC_FL_SUPPORTED						\
>> 		(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |		\
>> 		 FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |	\
>> -		 FALLOC_FL_INSERT_RANGE | FALLOC_FL_UNSHARE_RANGE)
>> +		 FALLOC_FL_INSERT_RANGE | FALLOC_FL_UNSHARE_RANGE |	\
>> +		 FALLOC_FL_QUERY_SUPPORT | FALLOC_FL_PREALLOC_RANGE)
>> 
>> STATIC long
>> xfs_file_fallocate(
>> @@ -768,6 +769,9 @@ xfs_file_fallocate(
>> 	if (mode & ~XFS_FALLOC_FL_SUPPORTED)
>> 		return -EOPNOTSUPP;
>> 
>> +	if (mode & FALLOC_FL_QUERY_SUPPORT)
>> +		return XFS_FALLOC_FL_SUPPORTED;
> 
> Urk, manpage update describing the goals of the query interface and how
> this is supposed to work is needed.
> 
> Are we required to return only the mode flags that would reasonably be
> expected to work on this file, or the fs in general?  Do we return zero
> if the file is immutable (I guess the fd has to be O_RDONLY?) or if the
> fs is readonly?
> 
> And like hch said, why not {f,}pathconf?

The problem with pathconf() is that this is just made up by GlibC, and
doesn't actually communicate with the kernel or the filesystem at all.
For example, _PC_LINK_MAX does not return different values for ext2/ext4
filesystems based on feature flags, only what is hard-coded into GlibC
based on the filesystem magic.  This is not going to work as a per-file
source of information.

For example, even though EXT4_LINK_MAX has been 65000 for many years,
GlibC reports 32000 for an ext4 filesystem since it only uses the magic:

$ df -T /
Filesystem     Type 1024-blocks     Used Available Capacity Mounted on
/dev/mapper/vg_twoshoes-lvroot
               ext4    25991484 22475376   2205504      92% /
$ strace getconf LINK_MAX /
:
:
statfs("/", {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=6497871,
	     f_bfree=879030, f_bavail=551379, f_files=819200, f_ffree=405130,
	     f_fsid={1950084367, 1019861045}, f_namelen=255}) = 0
write(1, "32000\n",) = 6


Cheers, Andreas






[-- Attachment #2: Message signed with OpenPGP --]
[-- Type: application/pgp-signature, Size: 195 bytes --]

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-18 17:56   ` Christoph Hellwig
@ 2017-09-19  3:28     ` OGAWA Hirofumi
  2017-09-19  8:15     ` Lukas Czerner
  1 sibling, 0 replies; 28+ messages in thread
From: OGAWA Hirofumi @ 2017-09-19  3:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Lukas Czerner, linux-fsdevel, linux-xfs, linux-api

Christoph Hellwig <hch@infradead.org> writes:

I myself

Acked-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

to [5/7] fat part though,

> That being said, I really detest adding weird query modes to random
> syscalls.  We should expose capabilities through the interface that
> was designed for such things: (f)pathconf, as there are a lot more
> important parameters we currently fail to expose to userspace to
> start with.

This is true. For example, max link count. And it might be able to use
for fs's bug/issue detection too (in past, userland has to add
workaround for all cases, because noway to detect the buggy or not).

Thanks.
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-18 17:56   ` Christoph Hellwig
  2017-09-19  3:28     ` OGAWA Hirofumi
@ 2017-09-19  8:15     ` Lukas Czerner
  2017-09-19 14:13         ` Christoph Hellwig
  1 sibling, 1 reply; 28+ messages in thread
From: Lukas Czerner @ 2017-09-19  8:15 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-fsdevel, linux-xfs, linux-api

On Mon, Sep 18, 2017 at 10:56:51AM -0700, Christoph Hellwig wrote:
> This seems to miss any explanation and 6 out of 7 patches.

See linux-fsdevel for entire patch set.
https://www.spinics.net/lists/linux-fsdevel/msg115740.html

> 
> That being said, I really detest adding weird query modes to random
> syscalls.  We should expose capabilities through the interface that
> was designed for such things: (f)pathconf, as there are a lot more
> important parameters we currently fail to expose to userspace to
> start with.

(f)pathconf on it's own is useless because I need to get this
information directly from the file system. Supported fallocate modes can
differ with inode flags (encrypted files, extent/indirect based files on
ext4). I also need in-kernel interface.

That's why I though fallocate is the convenient place to put this. I
have not looked to it, but other option might be to use fstat to get
this.

-Lukas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-18 20:48   ` Darrick J. Wong
  2017-09-18 21:55     ` Andreas Dilger
@ 2017-09-19  8:20     ` Lukas Czerner
  1 sibling, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-19  8:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-fsdevel, linux-xfs

On Mon, Sep 18, 2017 at 01:48:38PM -0700, Darrick J. Wong wrote:
> On Mon, Sep 18, 2017 at 05:52:24PM +0200, Lukas Czerner wrote:
> > Return all fallcoate modes supported by xfs file system.
> > 
> > Cc: linux-xfs@vger.kernel.org
> > Signed-off-by: Lukas Czerner <lczerner@redhat.com>
> > ---
> >  fs/xfs/xfs_file.c | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index ebdd0bd..85e06c6 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -746,7 +746,8 @@ xfs_file_write_iter(
> >  #define	XFS_FALLOC_FL_SUPPORTED						\
> >  		(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |		\
> >  		 FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |	\
> > -		 FALLOC_FL_INSERT_RANGE | FALLOC_FL_UNSHARE_RANGE)
> > +		 FALLOC_FL_INSERT_RANGE | FALLOC_FL_UNSHARE_RANGE |	\
> > +		 FALLOC_FL_QUERY_SUPPORT | FALLOC_FL_PREALLOC_RANGE)
> >  
> >  STATIC long
> >  xfs_file_fallocate(
> > @@ -768,6 +769,9 @@ xfs_file_fallocate(
> >  	if (mode & ~XFS_FALLOC_FL_SUPPORTED)
> >  		return -EOPNOTSUPP;
> >  
> > +	if (mode & FALLOC_FL_QUERY_SUPPORT)
> > +		return XFS_FALLOC_FL_SUPPORTED;
> 
> Urk, manpage update describing the goals of the query interface and how
> this is supposed to work is needed.

See linux-fsdevel for entire patch set. It's RFC, manpage update will
come later.
https://www.spinics.net/lists/linux-fsdevel/msg115740.html

> 
> Are we required to return only the mode flags that would reasonably be
> expected to work on this file, or the fs in general?  Do we return zero

Yes, as mentioned in cover letter it's supposed to return flags that
would be expected to work on this file.

> if the file is immutable (I guess the fd has to be O_RDONLY?) or if the
> fs is readonly?

It's going through the same path as any fallocate call.

> 
> And like hch said, why not {f,}pathconf?

I do not think fpathconf is suitable for this use case, see my reply to hch
and reply form Andreas.

-Lukas

> 
> --D
> 
> > +
> >  	xfs_ilock(ip, iolock);
> >  	error = xfs_break_layouts(inode, &iolock);
> >  	if (error)
> > -- 
> > 2.7.5
> > 
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-19  8:15     ` Lukas Czerner
@ 2017-09-19 14:13         ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-19 14:13 UTC (permalink / raw)
  To: Lukas Czerner; +Cc: Christoph Hellwig, linux-fsdevel, linux-xfs, linux-api

On Tue, Sep 19, 2017 at 10:15:38AM +0200, Lukas Czerner wrote:
> On Mon, Sep 18, 2017 at 10:56:51AM -0700, Christoph Hellwig wrote:
> > This seems to miss any explanation and 6 out of 7 patches.
> 
> See linux-fsdevel for entire patch set.
> https://www.spinics.net/lists/linux-fsdevel/msg115740.html

But only one made it to linux-xfs, so please fix your patch sending.

> (f)pathconf on it's own is useless because I need to get this
> information directly from the file system. Supported fallocate modes can
> differ with inode flags (encrypted files, extent/indirect based files on
> ext4). I also need in-kernel interface.

And that is exactly what I want you to do:  finally bite the bullet
and add pathconf inode operation.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
@ 2017-09-19 14:13         ` Christoph Hellwig
  0 siblings, 0 replies; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-19 14:13 UTC (permalink / raw)
  To: Lukas Czerner
  Cc: Christoph Hellwig, linux-fsdevel-u79uwXL29TY76Z2rM5mHXA,
	linux-xfs-u79uwXL29TY76Z2rM5mHXA,
	linux-api-u79uwXL29TY76Z2rM5mHXA

On Tue, Sep 19, 2017 at 10:15:38AM +0200, Lukas Czerner wrote:
> On Mon, Sep 18, 2017 at 10:56:51AM -0700, Christoph Hellwig wrote:
> > This seems to miss any explanation and 6 out of 7 patches.
> 
> See linux-fsdevel for entire patch set.
> https://www.spinics.net/lists/linux-fsdevel/msg115740.html

But only one made it to linux-xfs, so please fix your patch sending.

> (f)pathconf on it's own is useless because I need to get this
> information directly from the file system. Supported fallocate modes can
> differ with inode flags (encrypted files, extent/indirect based files on
> ext4). I also need in-kernel interface.

And that is exactly what I want you to do:  finally bite the bullet
and add pathconf inode operation.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-18 21:55     ` Andreas Dilger
@ 2017-09-19 14:55       ` Theodore Ts'o
  2017-09-19 15:33         ` Lukas Czerner
  2017-09-19 15:55         ` Christoph Hellwig
  0 siblings, 2 replies; 28+ messages in thread
From: Theodore Ts'o @ 2017-09-19 14:55 UTC (permalink / raw)
  To: Andreas Dilger; +Cc: Darrick J. Wong, Lukas Czerner, linux-fsdevel, linux-xfs

On Mon, Sep 18, 2017 at 03:55:40PM -0600, Andreas Dilger wrote:
> > And like hch said, why not {f,}pathconf?
> 
> The problem with pathconf() is that this is just made up by GlibC, and
> doesn't actually communicate with the kernel or the filesystem at all.
> For example, _PC_LINK_MAX does not return different values for ext2/ext4
> filesystems based on feature flags, only what is hard-coded into GlibC
> based on the filesystem magic.  This is not going to work as a per-file
> source of information.
> 
> For example, even though EXT4_LINK_MAX has been 65000 for many years,
> GlibC reports 32000 for an ext4 filesystem since it only uses the magic:

Maybe the right answer is we should define a new pathconfat(2) system
call which can be used as part of a C library's implementation of
pathconf() and fpathconf()?  glibc probably won't use it for years, of
course.  But we can at least provide the information via an interface
which we can control, and which is capable of returning correct
results?

						- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-19 14:55       ` Theodore Ts'o
@ 2017-09-19 15:33         ` Lukas Czerner
  2017-09-19 15:55         ` Christoph Hellwig
  1 sibling, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-19 15:33 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Andreas Dilger, Darrick J. Wong, linux-fsdevel, linux-xfs

On Tue, Sep 19, 2017 at 10:55:20AM -0400, Theodore Ts'o wrote:
> On Mon, Sep 18, 2017 at 03:55:40PM -0600, Andreas Dilger wrote:
> > > And like hch said, why not {f,}pathconf?
> > 
> > The problem with pathconf() is that this is just made up by GlibC, and
> > doesn't actually communicate with the kernel or the filesystem at all.
> > For example, _PC_LINK_MAX does not return different values for ext2/ext4
> > filesystems based on feature flags, only what is hard-coded into GlibC
> > based on the filesystem magic.  This is not going to work as a per-file
> > source of information.
> > 
> > For example, even though EXT4_LINK_MAX has been 65000 for many years,
> > GlibC reports 32000 for an ext4 filesystem since it only uses the magic:
> 
> Maybe the right answer is we should define a new pathconfat(2) system
> call which can be used as part of a C library's implementation of
> pathconf() and fpathconf()?  glibc probably won't use it for years, of
> course.  But we can at least provide the information via an interface
> which we can control, and which is capable of returning correct
> results?
> 
> 						- Ted

Right, *I think* this is what Christoph suggested in his recent reply.
It sounds fair enough to me and this is much more usefull beyond just
fallocate, so I can look into it.

-Lukas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-19 14:55       ` Theodore Ts'o
  2017-09-19 15:33         ` Lukas Czerner
@ 2017-09-19 15:55         ` Christoph Hellwig
  2017-09-19 19:17           ` Florian Weimer
  1 sibling, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-19 15:55 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Andreas Dilger, Darrick J. Wong, Lukas Czerner, linux-fsdevel, linux-xfs

On Tue, Sep 19, 2017 at 10:55:20AM -0400, Theodore Ts'o wrote:
> Maybe the right answer is we should define a new pathconfat(2) system
> call which can be used as part of a C library's implementation of
> pathconf() and fpathconf()?  glibc probably won't use it for years, of
> course.  But we can at least provide the information via an interface
> which we can control, and which is capable of returning correct
> results?

glibc is very fast at picking up new kernel interface these days
as long as they aren't too controversial.  Implementing a syscall
that backs a function they implement should not be in the controversial
category I think.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-19 15:55         ` Christoph Hellwig
@ 2017-09-19 19:17           ` Florian Weimer
  2017-09-19 20:37             ` Christoph Hellwig
  0 siblings, 1 reply; 28+ messages in thread
From: Florian Weimer @ 2017-09-19 19:17 UTC (permalink / raw)
  To: Christoph Hellwig, Theodore Ts'o
  Cc: Andreas Dilger, Darrick J. Wong, Lukas Czerner, linux-fsdevel, linux-xfs

On 09/19/2017 05:55 PM, Christoph Hellwig wrote:
> On Tue, Sep 19, 2017 at 10:55:20AM -0400, Theodore Ts'o wrote:
>> Maybe the right answer is we should define a new pathconfat(2) system
>> call which can be used as part of a C library's implementation of
>> pathconf() and fpathconf()?  glibc probably won't use it for years, of
>> course.  But we can at least provide the information via an interface
>> which we can control, and which is capable of returning correct
>> results?
> 
> glibc is very fast at picking up new kernel interface these days
> as long as they aren't too controversial.  Implementing a syscall
> that backs a function they implement should not be in the controversial
> category I think.

We have a significant backlog, but I don't expect opposition to patches 
implementing syscall wrappers which are just slightly generic (and 
pathconfat should really be fine).

Technically, pathconfat might be tricky to maintain if it's expected to 
be a variadic dispatcher function, and the accompanying UAPI header 
isn't very clean.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-19 19:17           ` Florian Weimer
@ 2017-09-19 20:37             ` Christoph Hellwig
  2017-09-19 23:17               ` Theodore Ts'o
  0 siblings, 1 reply; 28+ messages in thread
From: Christoph Hellwig @ 2017-09-19 20:37 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Christoph Hellwig, Theodore Ts'o, Andreas Dilger,
	Darrick J. Wong, Lukas Czerner, linux-fsdevel, linux-xfs

On Tue, Sep 19, 2017 at 09:17:28PM +0200, Florian Weimer wrote:
> We have a significant backlog, but I don't expect opposition to patches
> implementing syscall wrappers which are just slightly generic (and
> pathconfat should really be fine).
> 
> Technically, pathconfat might be tricky to maintain if it's expected to be a
> variadic dispatcher function, and the accompanying UAPI header isn't very
> clean.

(f)pathconf is defined as:

       long fpathconf(int fd, int name);
       long pathconf(const char *path, int name);

so there really shouldn't be any issue.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-19 20:37             ` Christoph Hellwig
@ 2017-09-19 23:17               ` Theodore Ts'o
  2017-09-21 13:17                 ` pathconf syscall for linux Lukas Czerner
  2017-09-21 13:54                 ` [PATCH 4/7] xfs: Implement fallocate query support mode Florian Weimer
  0 siblings, 2 replies; 28+ messages in thread
From: Theodore Ts'o @ 2017-09-19 23:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Florian Weimer, Andreas Dilger, Darrick J. Wong, Lukas Czerner,
	linux-fsdevel, linux-xfs

On Tue, Sep 19, 2017 at 01:37:19PM -0700, Christoph Hellwig wrote:
> On Tue, Sep 19, 2017 at 09:17:28PM +0200, Florian Weimer wrote:
> > We have a significant backlog, but I don't expect opposition to patches
> > implementing syscall wrappers which are just slightly generic (and
> > pathconfat should really be fine).
> > 
> > Technically, pathconfat might be tricky to maintain if it's expected to be a
> > variadic dispatcher function, and the accompanying UAPI header isn't very
> > clean.
> 
> (f)pathconf is defined as:
> 
>        long fpathconf(int fd, int name);
>        long pathconf(const char *path, int name);
> 
> so there really shouldn't be any issue.

There might be some tricky bits will be if glibc feels it needs to
override the values returned by the kernel (if it believes it may
impose some limitations due to its implementation).  Also, not all
file systems will implement some or all pathconf parameters initially
(or perhaps ever).  So should the VFS be responsible for returning
some intelligent defaults, or glibc?

Also, will application programs want to know whether they are getting
kernel values versus glibc's best guess (which may be depedent on the
kernel version, modulo distributions / handset/tablet vendors who
might decide to backport the syscall to their kernels)?

It might be a good idea to come to a rough agreement (between the
kernel and glibc devs) about how to handle some of these practical
issues up front.

Cheers,

					- Ted

^ permalink raw reply	[flat|nested] 28+ messages in thread

* pathconf syscall for linux
  2017-09-19 23:17               ` Theodore Ts'o
@ 2017-09-21 13:17                 ` Lukas Czerner
  2017-09-21 13:49                   ` Florian Weimer
  2017-09-21 13:54                 ` [PATCH 4/7] xfs: Implement fallocate query support mode Florian Weimer
  1 sibling, 1 reply; 28+ messages in thread
From: Lukas Czerner @ 2017-09-21 13:17 UTC (permalink / raw)
  To: Theodore Ts'o
  Cc: Christoph Hellwig, Florian Weimer, Andreas Dilger,
	Darrick J. Wong, linux-fsdevel, linux-xfs

On Tue, Sep 19, 2017 at 07:17:16PM -0400, Theodore Ts'o wrote:
> On Tue, Sep 19, 2017 at 01:37:19PM -0700, Christoph Hellwig wrote:
> > On Tue, Sep 19, 2017 at 09:17:28PM +0200, Florian Weimer wrote:
> > > We have a significant backlog, but I don't expect opposition to patches
> > > implementing syscall wrappers which are just slightly generic (and
> > > pathconfat should really be fine).
> > > 
> > > Technically, pathconfat might be tricky to maintain if it's expected to be a
> > > variadic dispatcher function, and the accompanying UAPI header isn't very
> > > clean.
> > 
> > (f)pathconf is defined as:
> > 
> >        long fpathconf(int fd, int name);
> >        long pathconf(const char *path, int name);
> > 
> > so there really shouldn't be any issue.
> 
> There might be some tricky bits will be if glibc feels it needs to
> override the values returned by the kernel (if it believes it may
> impose some limitations due to its implementation).  Also, not all
> file systems will implement some or all pathconf parameters initially
> (or perhaps ever).  So should the VFS be responsible for returning
> some intelligent defaults, or glibc?
> 
> Also, will application programs want to know whether they are getting
> kernel values versus glibc's best guess (which may be depedent on the
> kernel version, modulo distributions / handset/tablet vendors who
> might decide to backport the syscall to their kernels)?
> 
> It might be a good idea to come to a rough agreement (between the
> kernel and glibc devs) about how to handle some of these practical
> issues up front.
> 
> Cheers,
> 
> 					- Ted

Hi,

I think that it's reasonable to expect that we will implement the
following basic set of options. If I am not mistaken this is the basic
set that must be implemented by pathconf/fpathconf and it's also the
set of options that currently has linux specific implementation (or just
value definition) in glibc.


I think that most of those can be implemented directly in the vfs with
some exceptions, where we'll need to ask the file system. However this
should not be problem to implement right away when we introduce
pathconf.

Here is the list, although I am still not sure about some of those. Any
clarification on these would help.

_PC_LINK_MAX		include/uapi/linux/limits.h (linux limits)
_PC_MAX_CANON		include/uapi/linux/limits.h (linux limits)
_PC_MAX_INPUT		linux limits
_PC_NAME_MAX		linux limits
_PC_PATH_MAX		linux limits
_PC_PIPE_BUF		linux limits
_PC_CHOWN_RESTRICTED	currently yes
_PC_NO_TRUNC		specified per fs (at least affs set at mount)
_PC_VDISABLE		include/linux/tty.h __DISABLED_CHAR
_PC_SYNC_IO		we support O_SYNC, the question is how is it
			honored ? Currently my system returns undefined
_PC_ASYNC_IO		Not sure what exctly is ment by async here ?
			We always do asynchronous IO unless O_SYNC
			My system returns undefined
_PC_PRIO_IO		no support
_PC_SOCK_MAXBUF		?
_PC_FILESIZEBITS	per fs (s_maxbytes)
_PC_REC_INCR_XFER_SIZE 	?
_PC_REC_MAX_XFER_SIZE	?
_PC_REC_MIN_XFER_SIZE	? block size ?
_PC_REC_XFER_ALIGN	per fs (block size)
_PC_ALLOC_SIZE_MIN	per fs (block size)
_PC_SYMLINK_MAX		per fs (but we do not have limit AFACT)
_PC_2_SYMLINKS		per fs (do we have fs that don't have symlinks ?


As for the cases where glibc want's to override the value returned by
the kernel - I am not sure in what scenarios will they need to do it,
but if they do, it's just a matter of changing it...Not sure that I
understand you concern.

Also, at the moment we do not even know if it's a guess or a value returned
by kernel, so I'd say it does not matter. It can only get better from here.

Adding libc-alpha so they can share some of their wisdom. It'd be very
appreciated.

Thanks!
-Lukas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: pathconf syscall for linux
  2017-09-21 13:17                 ` pathconf syscall for linux Lukas Czerner
@ 2017-09-21 13:49                   ` Florian Weimer
  2017-09-22  8:38                     ` Lukas Czerner
  0 siblings, 1 reply; 28+ messages in thread
From: Florian Weimer @ 2017-09-21 13:49 UTC (permalink / raw)
  To: Lukas Czerner, Theodore Ts'o
  Cc: Christoph Hellwig, Andreas Dilger, Darrick J. Wong,
	linux-fsdevel, linux-xfs

On 09/21/2017 03:17 PM, Lukas Czerner wrote:

> _PC_NAME_MAX		linux limits

Note that _PC_NAME_MAX is basically a lie.  There are file systems which 
support longer names than 255 bytes.  It's quite common for file systems 
which store names in UCS-2 or UTF-16 and have code point limit of 255, 
which translates to something larger after UTF-8 conversion in the 
kernel.  This value is available today through statfs/f_namemax, but 
historically, the reported value was not correct.

But due to file bind mounts, any value the kernel can return is useless 
for application use anyway.

The only real user of this constant was readdir_r, and I've been trying 
to kill it (it's still in POSIX, but will be removed in a future edition).

> _PC_REC_INCR_XFER_SIZE 	?
> _PC_REC_MAX_XFER_SIZE	?
> _PC_REC_MIN_XFER_SIZE	? block size ?
> _PC_REC_XFER_ALIGN	per fs (block size)

NFS (and perhaps other network file systems) treats f_bsize to signal 
something similar.  We had to remove f_bsize-based tuning from glibc as 
a result because the value is quite misleading for many workloads.

> _PC_ALLOC_SIZE_MIN	per fs (block size)

Like the other size limits quoted above, the semantics are quite 
unclear.  POSIX documents them mostly in the context of the RT 
extensions, so the acceptable values likely depend whether the 
application uses kernel-assisted aio, or aio based on userspace threads.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-19 23:17               ` Theodore Ts'o
  2017-09-21 13:17                 ` pathconf syscall for linux Lukas Czerner
@ 2017-09-21 13:54                 ` Florian Weimer
  2017-09-22  8:40                   ` Lukas Czerner
  1 sibling, 1 reply; 28+ messages in thread
From: Florian Weimer @ 2017-09-21 13:54 UTC (permalink / raw)
  To: Theodore Ts'o, Christoph Hellwig
  Cc: Andreas Dilger, Darrick J. Wong, Lukas Czerner, linux-fsdevel, linux-xfs

On 09/20/2017 01:17 AM, Theodore Ts'o wrote:
> There might be some tricky bits will be if glibc feels it needs to
> override the values returned by the kernel (if it believes it may
> impose some limitations due to its implementation).  Also, not all
> file systems will implement some or all pathconf parameters initially
> (or perhaps ever).  So should the VFS be responsible for returning
> some intelligent defaults, or glibc?

I'd prefer if we could export pathconfat directly on the glibc side, 
without any emulation, and have the application deal with ENOSYS.

We would use it to implement parts of pathconf/fpathconf if available, 
but I don't think we should copy the existing emulation for pathconfat 
because every time we add such emulation, it comes back to haunt us.

Thanks,
Florian

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: pathconf syscall for linux
  2017-09-21 13:49                   ` Florian Weimer
@ 2017-09-22  8:38                     ` Lukas Czerner
  0 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-22  8:38 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Theodore Ts'o, Christoph Hellwig, Andreas Dilger,
	Darrick J. Wong, linux-fsdevel, linux-xfs

On Thu, Sep 21, 2017 at 03:49:48PM +0200, Florian Weimer wrote:
> On 09/21/2017 03:17 PM, Lukas Czerner wrote:
> 
> > _PC_NAME_MAX		linux limits
> 
> Note that _PC_NAME_MAX is basically a lie.  There are file systems which
> support longer names than 255 bytes.  It's quite common for file systems
> which store names in UCS-2 or UTF-16 and have code point limit of 255, which
> translates to something larger after UTF-8 conversion in the kernel.  This
> value is available today through statfs/f_namemax, but historically, the
> reported value was not correct.

Right, we can do this correctly from kernel though.

> 
> But due to file bind mounts, any value the kernel can return is useless for
> application use anyway.
> 
> The only real user of this constant was readdir_r, and I've been trying to
> kill it (it's still in POSIX, but will be removed in a future edition).

Good to know, but we still need to implement this, at least for now.

> 
> > _PC_REC_INCR_XFER_SIZE 	?
> > _PC_REC_MAX_XFER_SIZE	?
> > _PC_REC_MIN_XFER_SIZE	? block size ?
> > _PC_REC_XFER_ALIGN	per fs (block size)
> 
> NFS (and perhaps other network file systems) treats f_bsize to signal
> something similar.  We had to remove f_bsize-based tuning from glibc as a
> result because the value is quite misleading for many workloads.
> 
> > _PC_ALLOC_SIZE_MIN	per fs (block size)
> 
> Like the other size limits quoted above, the semantics are quite unclear.
> POSIX documents them mostly in the context of the RT extensions, so the
> acceptable values likely depend whether the application uses kernel-assisted
> aio, or aio based on userspace threads.


Thanks!
-Lukas

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH 4/7] xfs: Implement fallocate query support mode
  2017-09-21 13:54                 ` [PATCH 4/7] xfs: Implement fallocate query support mode Florian Weimer
@ 2017-09-22  8:40                   ` Lukas Czerner
  0 siblings, 0 replies; 28+ messages in thread
From: Lukas Czerner @ 2017-09-22  8:40 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Theodore Ts'o, Christoph Hellwig, Andreas Dilger,
	Darrick J. Wong, linux-fsdevel, linux-xfs

On Thu, Sep 21, 2017 at 03:54:01PM +0200, Florian Weimer wrote:
> On 09/20/2017 01:17 AM, Theodore Ts'o wrote:
> > There might be some tricky bits will be if glibc feels it needs to
> > override the values returned by the kernel (if it believes it may
> > impose some limitations due to its implementation).  Also, not all
> > file systems will implement some or all pathconf parameters initially
> > (or perhaps ever).  So should the VFS be responsible for returning
> > some intelligent defaults, or glibc?
> 
> I'd prefer if we could export pathconfat directly on the glibc side, without
> any emulation, and have the application deal with ENOSYS.
> 
> We would use it to implement parts of pathconf/fpathconf if available, but I
> don't think we should copy the existing emulation for pathconfat because
> every time we add such emulation, it comes back to haunt us.

Fair enough, I'll get back with some kernel code.

Thanks!
-Lukas

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2017-09-22  8:40 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-18 15:52 [PATCH 0/7][RFC] Introduce fallocate query support mode Lukas Czerner
2017-09-18 15:52 ` [PATCH 1/7] vfs: " Lukas Czerner
2017-09-18 15:52 ` [PATCH 2/7] ext4: Implement " Lukas Czerner
2017-09-18 15:52 ` [PATCH 3/7] ext4: Remove unnecessary S_ISREG checks in fallocate operations Lukas Czerner
2017-09-18 15:52 ` [PATCH 4/7] xfs: Implement fallocate query support mode Lukas Czerner
2017-09-18 17:56   ` Christoph Hellwig
2017-09-19  3:28     ` OGAWA Hirofumi
2017-09-19  8:15     ` Lukas Czerner
2017-09-19 14:13       ` Christoph Hellwig
2017-09-19 14:13         ` Christoph Hellwig
2017-09-18 20:48   ` Darrick J. Wong
2017-09-18 21:55     ` Andreas Dilger
2017-09-19 14:55       ` Theodore Ts'o
2017-09-19 15:33         ` Lukas Czerner
2017-09-19 15:55         ` Christoph Hellwig
2017-09-19 19:17           ` Florian Weimer
2017-09-19 20:37             ` Christoph Hellwig
2017-09-19 23:17               ` Theodore Ts'o
2017-09-21 13:17                 ` pathconf syscall for linux Lukas Czerner
2017-09-21 13:49                   ` Florian Weimer
2017-09-22  8:38                     ` Lukas Czerner
2017-09-21 13:54                 ` [PATCH 4/7] xfs: Implement fallocate query support mode Florian Weimer
2017-09-22  8:40                   ` Lukas Czerner
2017-09-19  8:20     ` Lukas Czerner
2017-09-18 15:52 ` [PATCH 5/7] fat: " Lukas Czerner
2017-09-18 15:52 ` [PATCH 6/7] gfs2: " Lukas Czerner
2017-09-18 15:52   ` [Cluster-devel] " Lukas Czerner
2017-09-18 15:52 ` [PATCH 7/7] loop: Check for puch hole and zero range support specifically Lukas Czerner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.