All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH 0/4] ovl: optimize dir iteration
@ 2016-12-22 17:25 Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Miklos,

This patch series implements dir iteration optimizations
using fs support for setting file types on rename.

Implementing support for xfs was very easy, although this
is just a demo patch, because proper support would require
a new feature flag and relaxing xfs_repair file type checks.

The xfs patch is based on top of my cleanup series, but is
not dependent on it in any significant way.

I tested on the following setup of upper/lower on same base xfs:

/dev-lower_layer on /base type xfs (rw,relatime,attr2,inode64,noquota)
/dev-lower_layer on /lower type xfs (rw,relatime,attr2,inode64,noquota)
/dev-lower_layer on /upper type xfs (rw,relatime,attr2,inode64,noquota)
overlay on /mnt type overlay (rw,relatime,lowerdir=/lower,upperdir=/upper/0,workdir=/upper/work)

Generated some whiteouts some copy ups and some opaque objects:

root@kvm-xfstests:~# rm /mnt/a/pointless100 
root@kvm-xfstests:~# rmdir /mnt/a/empty100
root@kvm-xfstests:~# touch /mnt/a/foo100 
root@kvm-xfstests:~# touch /mnt/a/dir100 
root@kvm-xfstests:~# touch /mnt/a/newfile
root@kvm-xfstests:~# mkdir /mnt/a/newdir

Used this tool I introduced to xfstests to print the resulting dtypes:

root@kvm-xfstests:~# ./xfstests/src/t_dir_type /upper/0/a/
. d
.. d
pointless100 w
empty100 w
foo100 u
dir100 u
newfile f
newdir d


What do you think of the proposed rename API?
To me, it makes some sense, because a request to re-classify
a directory entry is like a request to re-index it, which
is basically what rename is about. It may even make sense
to be able to call the rename API for changing dtype,
without changing the name/parent at all.

Amir.

Amir Goldstein (4):
  vfs: add RENAME_VFS_DTYPE vfs_rename() flag
  xfs: support RENAME_VFS_DTYPE flag
  ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode
  ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged()

 fs/overlayfs/copy_up.c   |  8 +++++++-
 fs/overlayfs/dir.c       | 12 +++++++++---
 fs/overlayfs/overlayfs.h |  5 +++++
 fs/overlayfs/readdir.c   |  2 +-
 fs/xfs/libxfs/xfs_dir2.c |  1 +
 fs/xfs/xfs_iops.c        | 11 ++++++++---
 include/linux/fs.h       | 30 ++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h  |  4 ++++
 8 files changed, 65 insertions(+), 8 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag
  2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
  2017-03-21 20:11   ` Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Add support for extra rename flags that can be passed to
vfs_rename() from kernel code.

Define a new internal vfs flag RENAME_VFS_DTYPE.
This flag indicates that caller would like fs to set the dtype
of the target entry to a specified value.  The dtype value to use
is specified on the S_IFMT mask bits (12..15) of the rename flags.

For example, code can call vfs_rename(..., RENAME_DT_WHT) to set
the target dir entry type to DT_WHT, regardless of the value
of inode->i_mode.

File systems that supports the new RENAME_VFS_DTYPE flag would
check for (flags & RENAME_VFS_DTYPE) and use RENAME_DT_MODE(flags)
instead of inode->i_mode to determine the value of dtype to store
in the directory entry.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 include/linux/fs.h      | 30 ++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h |  4 ++++
 2 files changed, 34 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 8f1580d..82ce8ca 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -1559,6 +1559,36 @@ extern int vfs_symlink(struct inode *, struct dentry *, const char *);
 extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct inode **);
 extern int vfs_rmdir(struct inode *, struct dentry *);
 extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
+
+/* vfs_rename() extra flags */
+#define RENAME_VFS_FLAG(i)	(1 << (RENAME_UAPI_BITS + (i)))
+#define RENAME_VFS_DTYPE	RENAME_VFS_FLAG(0)	/* Set dest dtype */
+
+#define RENAME_VFS_FLAG_BITS	1
+
+#define RENAME_FLAGS_BITS	(RENAME_UAPI_BITS + RENAME_VFS_FLAG_BITS)
+#define RENAME_FLAGS_MASK	((1 << RENAME_FLAGS_BITS) - 1)
+
+/*
+ * S_IFMT bits (12..15) carry the dtype value to set for RENAME_VFS_DTYPE.
+ *
+ * For example, code can call vfs_rename(..., RENAME_DT_WHT) to set the
+ * target dir entry type to DT_WHT, regardless of inode->i_mode.
+ * file systems that supports the new RENAME_VFS_DTYPE flag, would check
+ * for (flags & RENAME_VFS_DTYPE) and use RENAME_DT_MODE(flags) instead
+ * of inode->i_mode to determine the dtype to store in the directory entry.
+ */
+#define RENAME_DT_BITS		4
+#define RENAME_DT_SHIFT		12
+#define RENAME_DT_MASK		S_IFMT
+#define RENAME_DT(dt)		(((dt) << RENAME_DT_SHIFT) | RENAME_VFS_DTYPE)
+#define RENAME_DT_UNKNOWN	RENAME_DT(DT_UNKNOWN)
+#define RENAME_DT_WHT		RENAME_DT(DT_WHT)
+/* mode to use instead of i_mode when setting dtype */
+#define RENAME_DT_MODE(f)	((f) & RENAME_DT_MASK)
+
+#define RENAME_VFS_MASK		(RENAME_FLAGS_MASK | RENAME_DT_MASK)
+
 extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **, unsigned int);
 extern int vfs_whiteout(struct inode *, struct dentry *);
 
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 36da93f..11f0c6b 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -38,10 +38,14 @@
 #define SEEK_HOLE	4	/* seek to the next hole */
 #define SEEK_MAX	SEEK_HOLE
 
+/* renameat2(2) user api flags */
 #define RENAME_NOREPLACE	(1 << 0)	/* Don't overwrite target */
 #define RENAME_EXCHANGE		(1 << 1)	/* Exchange source and dest */
 #define RENAME_WHITEOUT		(1 << 2)	/* Whiteout source */
 
+#define RENAME_UAPI_BITS	3
+#define RENAME_UAPI_MASK	((1 << RENAME_UAPI_BITS) - 1)
+
 struct file_clone_range {
 	__s64 src_fd;
 	__u64 src_offset;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag
  2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
  2017-03-21 20:14   ` Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged() Amir Goldstein
  3 siblings, 1 reply; 9+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

If caller provided the target dtype to use and indicated that with
rename() flag RENAME_VFS_DTYPE, use RENAME_DT_MODE(flags) instead
of inode->i_mode to determine the value of dtype to store in the
directory entry.

Adding this functionality to official xfs code will require to add
a new feature flag to xfs directry naming on-disk format.

Without that new feature flag, xfs_repair will report that custom
dtype as a warning and set it back to the dtype value according to mode.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/xfs/libxfs/xfs_dir2.c |  1 +
 fs/xfs/xfs_iops.c        | 11 ++++++++---
 2 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index 984530e..71c6b2b 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -49,6 +49,7 @@ const unsigned char xfs_dtype_to_ftype[DT_MAX] = {
 	[DT_FIFO]   = XFS_DIR3_FT_FIFO,
 	[DT_SOCK]   = XFS_DIR3_FT_SOCK,
 	[DT_LNK]    = XFS_DIR3_FT_SYMLINK,
+	[DT_WHT]    = XFS_DIR3_FT_WHT,
 };
 
 /*
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index d2da9ca..8574155 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -394,19 +394,24 @@ xfs_vn_rename(
 	unsigned int	flags)
 {
 	struct inode	*new_inode = d_inode(ndentry);
-	int		omode = 0;
+	int		omode = 0, nmode = 0;
 	struct xfs_name	oname;
 	struct xfs_name	nname;
 
-	if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT))
+	if (flags & ~RENAME_VFS_MASK)
 		return -EINVAL;
 
 	/* if we are exchanging files, we need to set i_mode of both files */
 	if (flags & RENAME_EXCHANGE)
 		omode = d_inode(ndentry)->i_mode;
+	/* if requested, use provided dtype for target */
+	if (flags & RENAME_VFS_DTYPE)
+		nmode = RENAME_DT_MODE(flags);
+	else
+		nmode = d_inode(odentry)->i_mode;
 
 	xfs_dentry_to_name(&oname, odentry, omode);
-	xfs_dentry_to_name(&nname, ndentry, d_inode(odentry)->i_mode);
+	xfs_dentry_to_name(&nname, ndentry, nmode);
 
 	return xfs_rename(XFS_I(odir), &oname, XFS_I(d_inode(odentry)),
 			  XFS_I(ndir), &nname,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode
  2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
  2016-12-22 17:25 ` [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged() Amir Goldstein
  3 siblings, 0 replies; 9+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Try to use the new vfs rename flag RENAME_DT_UNKNOWN, to request
from underlying file system to mark copy up directory entries as
DT_UNKNOWN instead of their actual file type.

When copy ups are classified as DT_UNKNOWN, ovl_dir_read_merged() can
distinguish between copy up to opaque objects without having to check
the extended attribute trusted.overlay.inode on the iterated entries.

This will allow to know if upper d_inode value needs to be substitued
with lower d_inode value.

Because this is only done for optimization, it is not a problem
if file system does not respect the new RENAME_DT_UNKNOWN flag, which
is most likely the case, so we retry the rename without the flag.

This patch does not implement the actual optimization.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/copy_up.c   | 8 +++++++-
 fs/overlayfs/overlayfs.h | 5 +++++
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c
index f57043d..bcbcef5 100644
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -298,7 +298,13 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir,
 	if (err)
 		goto out_cleanup;
 
-	err = ovl_do_rename(wdir, newdentry, udir, upper, 0);
+	/*
+	 * Mark copy up objects in upper as DT_UNKNOWN, so it easy to
+	 * distinguish them from opaque objects in iterate_dir().
+	 * File system repair tools may re-classify the file type
+	 * and that will break optimization, but not functionality.
+	 */
+	err = ovl_do_rename(wdir, newdentry, udir, upper, RENAME_DT_UNKNOWN);
 	if (err)
 		goto out_cleanup;
 
diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
index 8af450b..54162c0 100644
--- a/fs/overlayfs/overlayfs.h
+++ b/fs/overlayfs/overlayfs.h
@@ -113,6 +113,11 @@ static inline int ovl_do_rename(struct inode *olddir, struct dentry *olddentry,
 
 	err = vfs_rename(olddir, olddentry, newdir, newdentry, NULL, flags);
 
+	/* retry without RENAME_VFS_DTYPE if fs does not support it */
+	if (err == -EINVAL && (flags & RENAME_VFS_DTYPE))
+		err = vfs_rename(olddir, olddentry, newdir, newdentry, NULL,
+				 flags & RENAME_UAPI_MASK);
+
 	if (err) {
 		pr_debug("...rename(%pd2, %pd2, ...) = %i\n",
 			 olddentry, newdentry, err);
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged()
  2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
                   ` (2 preceding siblings ...)
  2016-12-22 17:25 ` [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode Amir Goldstein
@ 2016-12-22 17:25 ` Amir Goldstein
  3 siblings, 0 replies; 9+ messages in thread
From: Amir Goldstein @ 2016-12-22 17:25 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-unionfs

Try to use the new vfs rename flag RENAME_DT_WHT, to request from
underlying file system to mark whiteout directory entries as DT_WHT
instead of DT_CHR.

When whiteouts are classified as DT_WHT, ovl_dir_read_merged() can
distinguish between whiteouts to real character devices without
having to stat the suspect inodes.

Because this is only done for optimization, it is not a problem
if file system does not respect the new RENAME_DT_WHT flag, which
is most likely the case, so we retry the rename without the flag.

Even if DT_WHT type is set on rename using an experimental file
system patch, file system repair tools may later re-classify the
file type. That would break the optimization, but will not break
the functionality of ovl_dir_read_merged().

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---
 fs/overlayfs/dir.c     | 12 +++++++++---
 fs/overlayfs/readdir.c |  2 +-
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/fs/overlayfs/dir.c b/fs/overlayfs/dir.c
index 16e06dd..0e1c3f2 100644
--- a/fs/overlayfs/dir.c
+++ b/fs/overlayfs/dir.c
@@ -630,7 +630,13 @@ static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
 	struct dentry *upper;
 	struct dentry *opaquedir = NULL;
 	int err;
-	int flags = 0;
+	/*
+	 * Mark whiteout objects as DT_WHT instead of DT_CHR, so it easy
+	 * to distinguish them real character devices in iterate_dir().
+	 * File system repair tools may re-classify the file type
+	 * and that will break optimization, but not functionality.
+	 */
+	int flags = RENAME_DT_WHT;
 
 	if (WARN_ON(!workdir))
 		return -EROFS;
@@ -665,12 +671,12 @@ static int ovl_remove_and_whiteout(struct dentry *dentry, bool is_dir)
 		goto out_dput_upper;
 
 	if (d_is_dir(upper))
-		flags = RENAME_EXCHANGE;
+		flags |= RENAME_EXCHANGE;
 
 	err = ovl_do_rename(wdir, whiteout, udir, upper, flags);
 	if (err)
 		goto kill_whiteout;
-	if (flags)
+	if (flags & RENAME_EXCHANGE)
 		ovl_cleanup(wdir, upper);
 
 	ovl_dentry_version_inc(dentry->d_parent);
diff --git a/fs/overlayfs/readdir.c b/fs/overlayfs/readdir.c
index f241b4e..32c5c96 100644
--- a/fs/overlayfs/readdir.c
+++ b/fs/overlayfs/readdir.c
@@ -98,7 +98,7 @@ static struct ovl_cache_entry *ovl_cache_entry_new(struct ovl_readdir_data *rdd,
 	p->len = len;
 	p->type = d_type;
 	p->ino = ino;
-	p->is_whiteout = false;
+	p->is_whiteout = (d_type == DT_WHT);
 
 	if (d_type == DT_CHR) {
 		p->next_maybe_whiteout = rdd->first_maybe_whiteout;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag
  2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
@ 2017-03-21 20:11   ` Amir Goldstein
  0 siblings, 0 replies; 9+ messages in thread
From: Amir Goldstein @ 2017-03-21 20:11 UTC (permalink / raw)
  To: Theodore Tso; +Cc: linux-fsdevel, Darrick J. Wong

On Thu, Dec 22, 2016 at 12:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> Add support for extra rename flags that can be passed to
> vfs_rename() from kernel code.
>
> Define a new internal vfs flag RENAME_VFS_DTYPE.
> This flag indicates that caller would like fs to set the dtype
> of the target entry to a specified value.  The dtype value to use
> is specified on the S_IFMT mask bits (12..15) of the rename flags.
>
> For example, code can call vfs_rename(..., RENAME_DT_WHT) to set
> the target dir entry type to DT_WHT, regardless of the value
> of inode->i_mode.
>
> File systems that supports the new RENAME_VFS_DTYPE flag would
> check for (flags & RENAME_VFS_DTYPE) and use RENAME_DT_MODE(flags)
> instead of inode->i_mode to determine the value of dtype to store
> in the directory entry.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Ted,

My bad. I posted this thing on the overlayfs list only.
That's the VFS API part.
I'll also post the XFS POC.

> ---
>  include/linux/fs.h      | 30 ++++++++++++++++++++++++++++++
>  include/uapi/linux/fs.h |  4 ++++
>  2 files changed, 34 insertions(+)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 8f1580d..82ce8ca 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -1559,6 +1559,36 @@ extern int vfs_symlink(struct inode *, struct dentry *, const char *);
>  extern int vfs_link(struct dentry *, struct inode *, struct dentry *, struct inode **);
>  extern int vfs_rmdir(struct inode *, struct dentry *);
>  extern int vfs_unlink(struct inode *, struct dentry *, struct inode **);
> +
> +/* vfs_rename() extra flags */
> +#define RENAME_VFS_FLAG(i)     (1 << (RENAME_UAPI_BITS + (i)))
> +#define RENAME_VFS_DTYPE       RENAME_VFS_FLAG(0)      /* Set dest dtype */
> +
> +#define RENAME_VFS_FLAG_BITS   1
> +
> +#define RENAME_FLAGS_BITS      (RENAME_UAPI_BITS + RENAME_VFS_FLAG_BITS)
> +#define RENAME_FLAGS_MASK      ((1 << RENAME_FLAGS_BITS) - 1)
> +
> +/*
> + * S_IFMT bits (12..15) carry the dtype value to set for RENAME_VFS_DTYPE.
> + *
> + * For example, code can call vfs_rename(..., RENAME_DT_WHT) to set the
> + * target dir entry type to DT_WHT, regardless of inode->i_mode.
> + * file systems that supports the new RENAME_VFS_DTYPE flag, would check
> + * for (flags & RENAME_VFS_DTYPE) and use RENAME_DT_MODE(flags) instead
> + * of inode->i_mode to determine the dtype to store in the directory entry.
> + */
> +#define RENAME_DT_BITS         4
> +#define RENAME_DT_SHIFT                12
> +#define RENAME_DT_MASK         S_IFMT
> +#define RENAME_DT(dt)          (((dt) << RENAME_DT_SHIFT) | RENAME_VFS_DTYPE)
> +#define RENAME_DT_UNKNOWN      RENAME_DT(DT_UNKNOWN)
> +#define RENAME_DT_WHT          RENAME_DT(DT_WHT)
> +/* mode to use instead of i_mode when setting dtype */
> +#define RENAME_DT_MODE(f)      ((f) & RENAME_DT_MASK)
> +
> +#define RENAME_VFS_MASK                (RENAME_FLAGS_MASK | RENAME_DT_MASK)
> +
>  extern int vfs_rename(struct inode *, struct dentry *, struct inode *, struct dentry *, struct inode **, unsigned int);
>  extern int vfs_whiteout(struct inode *, struct dentry *);
>
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index 36da93f..11f0c6b 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -38,10 +38,14 @@
>  #define SEEK_HOLE      4       /* seek to the next hole */
>  #define SEEK_MAX       SEEK_HOLE
>
> +/* renameat2(2) user api flags */
>  #define RENAME_NOREPLACE       (1 << 0)        /* Don't overwrite target */
>  #define RENAME_EXCHANGE                (1 << 1)        /* Exchange source and dest */
>  #define RENAME_WHITEOUT                (1 << 2)        /* Whiteout source */
>
> +#define RENAME_UAPI_BITS       3
> +#define RENAME_UAPI_MASK       ((1 << RENAME_UAPI_BITS) - 1)
> +
>  struct file_clone_range {
>         __s64 src_fd;
>         __u64 src_offset;
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag
  2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
@ 2017-03-21 20:14   ` Amir Goldstein
  2017-03-27 16:23     ` Darrick J. Wong
  0 siblings, 1 reply; 9+ messages in thread
From: Amir Goldstein @ 2017-03-21 20:14 UTC (permalink / raw)
  To: Theodore Tso, Darrick J. Wong; +Cc: linux-fsdevel

On Thu, Dec 22, 2016 at 12:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> If caller provided the target dtype to use and indicated that with
> rename() flag RENAME_VFS_DTYPE, use RENAME_DT_MODE(flags) instead
> of inode->i_mode to determine the value of dtype to store in the
> directory entry.
>
> Adding this functionality to official xfs code will require to add
> a new feature flag to xfs directry naming on-disk format.
>
> Without that new feature flag, xfs_repair will report that custom
> dtype as a warning and set it back to the dtype value according to mode.
>
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>

Ted/Darrick,

That's the XFS POC implementation of the proposed API.
I am not re-posting the entire series that uses this for optimizing
overlay readdir.
It can be found here:
https://www.spinics.net/lists/linux-unionfs/msg01401.html

> ---
>  fs/xfs/libxfs/xfs_dir2.c |  1 +
>  fs/xfs/xfs_iops.c        | 11 ++++++++---
>  2 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> index 984530e..71c6b2b 100644
> --- a/fs/xfs/libxfs/xfs_dir2.c
> +++ b/fs/xfs/libxfs/xfs_dir2.c
> @@ -49,6 +49,7 @@ const unsigned char xfs_dtype_to_ftype[DT_MAX] = {
>         [DT_FIFO]   = XFS_DIR3_FT_FIFO,
>         [DT_SOCK]   = XFS_DIR3_FT_SOCK,
>         [DT_LNK]    = XFS_DIR3_FT_SYMLINK,
> +       [DT_WHT]    = XFS_DIR3_FT_WHT,
>  };
>
>  /*
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index d2da9ca..8574155 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -394,19 +394,24 @@ xfs_vn_rename(
>         unsigned int    flags)
>  {
>         struct inode    *new_inode = d_inode(ndentry);
> -       int             omode = 0;
> +       int             omode = 0, nmode = 0;
>         struct xfs_name oname;
>         struct xfs_name nname;
>
> -       if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT))
> +       if (flags & ~RENAME_VFS_MASK)
>                 return -EINVAL;
>
>         /* if we are exchanging files, we need to set i_mode of both files */
>         if (flags & RENAME_EXCHANGE)
>                 omode = d_inode(ndentry)->i_mode;
> +       /* if requested, use provided dtype for target */
> +       if (flags & RENAME_VFS_DTYPE)
> +               nmode = RENAME_DT_MODE(flags);
> +       else
> +               nmode = d_inode(odentry)->i_mode;
>
>         xfs_dentry_to_name(&oname, odentry, omode);
> -       xfs_dentry_to_name(&nname, ndentry, d_inode(odentry)->i_mode);
> +       xfs_dentry_to_name(&nname, ndentry, nmode);
>
>         return xfs_rename(XFS_I(odir), &oname, XFS_I(d_inode(odentry)),
>                           XFS_I(ndir), &nname,
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag
  2017-03-21 20:14   ` Amir Goldstein
@ 2017-03-27 16:23     ` Darrick J. Wong
  2017-03-27 17:41       ` Amir Goldstein
  0 siblings, 1 reply; 9+ messages in thread
From: Darrick J. Wong @ 2017-03-27 16:23 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Theodore Tso, linux-fsdevel

On Tue, Mar 21, 2017 at 04:14:49PM -0400, Amir Goldstein wrote:
> On Thu, Dec 22, 2016 at 12:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
> > If caller provided the target dtype to use and indicated that with
> > rename() flag RENAME_VFS_DTYPE, use RENAME_DT_MODE(flags) instead
> > of inode->i_mode to determine the value of dtype to store in the
> > directory entry.
> >
> > Adding this functionality to official xfs code will require to add
> > a new feature flag to xfs directry naming on-disk format.
> >
> > Without that new feature flag, xfs_repair will report that custom
> > dtype as a warning and set it back to the dtype value according to mode.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> 
> Ted/Darrick,
> 
> That's the XFS POC implementation of the proposed API.
> I am not re-posting the entire series that uses this for optimizing
> overlay readdir.
> It can be found here:
> https://www.spinics.net/lists/linux-unionfs/msg01401.html

Welll.... at a bare minimum you ought to post this to the XFS list if
you're serious about trying to merge this.  I suspect there will be
pushback against allowing in-kernel(?) clients to set ftype arbitrarily,
since the details of that field weren't designed to be directly
accessible outside XFS.

Next, if you're only setting FT_WHT or FT_UNKNOWN (both of which at
least already exist as code points) then you'll have to supply a fix to
xfs_repair.  If you want to do more than that (i.e. use the unused ftype
bits) you'd have to implement kernel and userland patches to add an
incompat feature flag so that unaware kernels don't trip over the extra
bits.  You'll also have to send updates to the disk format documentation
to reflect the changes you're making, and some test cases to ensure that
it all actually works.

While I think I understand what you're trying to accomplish, you'll have
to argue for it on the XFS list. :)

--D

> 
> > ---
> >  fs/xfs/libxfs/xfs_dir2.c |  1 +
> >  fs/xfs/xfs_iops.c        | 11 ++++++++---
> >  2 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> > index 984530e..71c6b2b 100644
> > --- a/fs/xfs/libxfs/xfs_dir2.c
> > +++ b/fs/xfs/libxfs/xfs_dir2.c
> > @@ -49,6 +49,7 @@ const unsigned char xfs_dtype_to_ftype[DT_MAX] = {
> >         [DT_FIFO]   = XFS_DIR3_FT_FIFO,
> >         [DT_SOCK]   = XFS_DIR3_FT_SOCK,
> >         [DT_LNK]    = XFS_DIR3_FT_SYMLINK,
> > +       [DT_WHT]    = XFS_DIR3_FT_WHT,
> >  };
> >
> >  /*
> > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > index d2da9ca..8574155 100644
> > --- a/fs/xfs/xfs_iops.c
> > +++ b/fs/xfs/xfs_iops.c
> > @@ -394,19 +394,24 @@ xfs_vn_rename(
> >         unsigned int    flags)
> >  {
> >         struct inode    *new_inode = d_inode(ndentry);
> > -       int             omode = 0;
> > +       int             omode = 0, nmode = 0;
> >         struct xfs_name oname;
> >         struct xfs_name nname;
> >
> > -       if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE | RENAME_WHITEOUT))
> > +       if (flags & ~RENAME_VFS_MASK)
> >                 return -EINVAL;
> >
> >         /* if we are exchanging files, we need to set i_mode of both files */
> >         if (flags & RENAME_EXCHANGE)
> >                 omode = d_inode(ndentry)->i_mode;
> > +       /* if requested, use provided dtype for target */
> > +       if (flags & RENAME_VFS_DTYPE)
> > +               nmode = RENAME_DT_MODE(flags);
> > +       else
> > +               nmode = d_inode(odentry)->i_mode;
> >
> >         xfs_dentry_to_name(&oname, odentry, omode);
> > -       xfs_dentry_to_name(&nname, ndentry, d_inode(odentry)->i_mode);
> > +       xfs_dentry_to_name(&nname, ndentry, nmode);
> >
> >         return xfs_rename(XFS_I(odir), &oname, XFS_I(d_inode(odentry)),
> >                           XFS_I(ndir), &nname,
> > --
> > 2.7.4
> >

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag
  2017-03-27 16:23     ` Darrick J. Wong
@ 2017-03-27 17:41       ` Amir Goldstein
  0 siblings, 0 replies; 9+ messages in thread
From: Amir Goldstein @ 2017-03-27 17:41 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Theodore Tso, linux-fsdevel, Steve French, Miklos Szeredi

On Mon, Mar 27, 2017 at 12:23 PM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> On Tue, Mar 21, 2017 at 04:14:49PM -0400, Amir Goldstein wrote:
>> On Thu, Dec 22, 2016 at 12:25 PM, Amir Goldstein <amir73il@gmail.com> wrote:
>> > If caller provided the target dtype to use and indicated that with
>> > rename() flag RENAME_VFS_DTYPE, use RENAME_DT_MODE(flags) instead
>> > of inode->i_mode to determine the value of dtype to store in the
>> > directory entry.
>> >
>> > Adding this functionality to official xfs code will require to add
>> > a new feature flag to xfs directry naming on-disk format.
>> >
>> > Without that new feature flag, xfs_repair will report that custom
>> > dtype as a warning and set it back to the dtype value according to mode.
>> >
>> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
>>
>> Ted/Darrick,
>>
>> That's the XFS POC implementation of the proposed API.
>> I am not re-posting the entire series that uses this for optimizing
>> overlay readdir.
>> It can be found here:
>> https://www.spinics.net/lists/linux-unionfs/msg01401.html
>
> Welll.... at a bare minimum you ought to post this to the XFS list if
> you're serious about trying to merge this.  I suspect there will be

I guess I did not send it to xfs list because it was just an "internal POC"
for overlayfs optimization and I picked xfs as a convenient target as
it is the only fs that already had FT_WHT type defined.
I should have posted to fsdevel though - I though I did..

> pushback against allowing in-kernel(?) clients to set ftype arbitrarily,
> since the details of that field weren't designed to be directly
> accessible outside XFS.

Well, I was going to propose an API to set private ftype flags
and it is up to fs to implement this (by allowing the RENAME_VFS_DTYPE
flag), but it was easier to modify xfs just for setting FT_WHT/FT_UNKNOWN
and it was enough for my POC.

>
> Next, if you're only setting FT_WHT or FT_UNKNOWN (both of which at
> least already exist as code points) then you'll have to supply a fix to
> xfs_repair.  If you want to do more than that (i.e. use the unused ftype
> bits) you'd have to implement kernel and userland patches to add an
> incompat feature flag so that unaware kernels don't trip over the extra

I know. even wrote this in commit message :-)

> bits.  You'll also have to send updates to the disk format documentation
> to reflect the changes you're making, and some test cases to ensure that
> it all actually works.
>
> While I think I understand what you're trying to accomplish, you'll have
> to argue for it on the XFS list. :)
>

The way this played out was:
- I did the POC a while ago
- I proposed to discuss it as part of 'container aware fs' slot with James
- I discussed this with Miklos he convinced me this was an over optimization
  for overlayfs (for the average use case)
- Steve French brought up the need to mark directory entries (for sparse files)
  without having to stat(x) them, so I mentioned this patch could be relevant

I can't say that I fully understand the network file system needs in
that regard,
so I can't say if this approach is any help - just wanted to put it out there.
That said, if there is interest, I can certainly try to push this
forward, including
tests on-disk format doc etc.

Cheers,
Amir.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-03-27 17:41 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-22 17:25 [RFC][PATCH 0/4] ovl: optimize dir iteration Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 1/4] vfs: add RENAME_VFS_DTYPE vfs_rename() flag Amir Goldstein
2017-03-21 20:11   ` Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 2/4] xfs: support RENAME_VFS_DTYPE flag Amir Goldstein
2017-03-21 20:14   ` Amir Goldstein
2017-03-27 16:23     ` Darrick J. Wong
2017-03-27 17:41       ` Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 3/4] ovl: use RENAME_DT_UNKNOWN to optimize stable d_inode Amir Goldstein
2016-12-22 17:25 ` [RFC][PATCH 4/4] ovl: use RENAME_DT_WHT to optimize ovl_dir_read_merged() Amir Goldstein

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.