linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 00/12] various vfs patches
@ 2019-11-28 15:59 Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 01/12] aio: fix async fsync creds Miklos Szeredi
                   ` (12 more replies)
  0 siblings, 13 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

Hi Al,

This is a dump of my current vfs patch queue, all have been posted in one
form or another.

Also available as a git branch:

  git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git#for-viro

Please apply.

Thanks,
Miklos
---

Miklos Szeredi (12):
  aio: fix async fsync creds
  fs_parse: fix fs_param_v_optional handling
  vfs: verify param type in vfs_parse_sb_flag()
  uapi: deprecate STATX_ALL
  statx: don't clear STATX_ATIME on SB_RDONLY
  utimensat: AT_EMPTY_PATH support
  f*xattr: allow O_PATH descriptors
  vfs: allow unprivileged whiteout creation
  fs_parser: "string" with missing value is a "flag"
  vfs: don't parse forbidden flags
  vfs: don't parse "posixacl" option
  vfs: don't parse "silent" option

 fs/aio.c                        |  8 ++++
 fs/char_dev.c                   |  3 ++
 fs/fs_context.c                 | 72 ++++++++++++---------------------
 fs/fs_parser.c                  | 19 ++++-----
 fs/namei.c                      | 17 ++------
 fs/stat.c                       |  4 +-
 fs/utimes.c                     |  6 ++-
 fs/xattr.c                      |  8 ++--
 include/linux/device_cgroup.h   |  3 ++
 include/linux/fs_parser.h       |  1 -
 include/uapi/linux/stat.h       | 11 ++++-
 samples/vfs/test-statx.c        |  2 +-
 tools/include/uapi/linux/stat.h | 11 ++++-
 13 files changed, 82 insertions(+), 83 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 37+ messages in thread

* [PATCH 01/12] aio: fix async fsync creds
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-12-13  9:32   ` Miklos Szeredi
  2020-05-04  8:05   ` Avi Kivity
  2019-11-28 15:59 ` [PATCH 02/12] fs_parse: fix fs_param_v_optional handling Miklos Szeredi
                   ` (11 subsequent siblings)
  12 siblings, 2 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, Avi Kivity, Giuseppe Scrivano, stable

Avi Kivity reports that on fuse filesystems running in a user namespace
asyncronous fsync fails with EOVERFLOW.

The reason is that f_ops->fsync() is called with the creds of the kthread
performing aio work instead of the creds of the process originally
submitting IOCB_CMD_FSYNC.

Fuse sends the creds of the caller in the request header and it needs to
translate the uid and gid into the server's user namespace.  Since the
kthread is running in init_user_ns, the translation will fail and the
operation returns an error.

It can be argued that fsync doesn't actually need any creds, but just
zeroing out those fields in the header (as with requests that currently
don't take creds) is a backward compatibility risk.

Instead of working around this issue in fuse, solve the core of the problem
by calling the filesystem with the proper creds.

Reported-by: Avi Kivity <avi@scylladb.com>
Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
Fixes: c9582eb0ff7d ("fuse: Fail all requests with invalid uids or gids")
Cc: stable@vger.kernel.org  # 4.18+
Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/aio.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/fs/aio.c b/fs/aio.c
index 0d9a559d488c..37828773e2fe 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -176,6 +176,7 @@ struct fsync_iocb {
 	struct file		*file;
 	struct work_struct	work;
 	bool			datasync;
+	struct cred		*creds;
 };
 
 struct poll_iocb {
@@ -1589,8 +1590,11 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb,
 static void aio_fsync_work(struct work_struct *work)
 {
 	struct aio_kiocb *iocb = container_of(work, struct aio_kiocb, fsync.work);
+	const struct cred *old_cred = override_creds(iocb->fsync.creds);
 
 	iocb->ki_res.res = vfs_fsync(iocb->fsync.file, iocb->fsync.datasync);
+	revert_creds(old_cred);
+	put_cred(iocb->fsync.creds);
 	iocb_put(iocb);
 }
 
@@ -1604,6 +1608,10 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
 	if (unlikely(!req->file->f_op->fsync))
 		return -EINVAL;
 
+	req->creds = prepare_creds();
+	if (!req->creds)
+		return -ENOMEM;
+
 	req->datasync = datasync;
 	INIT_WORK(&req->work, aio_fsync_work);
 	schedule_work(&req->work);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 02/12] fs_parse: fix fs_param_v_optional handling
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 01/12] aio: fix async fsync creds Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-11-29 11:31   ` Andrew Price
  2019-12-16 23:28   ` Al Viro
  2019-11-28 15:59 ` [PATCH 03/12] vfs: verify param type in vfs_parse_sb_flag() Miklos Szeredi
                   ` (10 subsequent siblings)
  12 siblings, 2 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, Andrew Price, David Howells, stable

String options always have parameters, hence the check for optional
parameter will never trigger.

Check for param type being a flag first (flag is the only type that does
not have a parameter) and report "Missing value" if the parameter is
mandatory.

Tested with gfs2's "quota" option, which is currently the only user of
fs_param_v_optional.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: Andrew Price <anprice@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Fixes: 31d921c7fb96 ("vfs: Add configuration parser helpers")
Cc: <stable@vger.kernel.org> # v5.4
---
 fs/fs_parser.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/fs/fs_parser.c b/fs/fs_parser.c
index d1930adce68d..5d8833d71b37 100644
--- a/fs/fs_parser.c
+++ b/fs/fs_parser.c
@@ -127,13 +127,15 @@ int fs_parse(struct fs_context *fc,
 	case fs_param_is_u64:
 	case fs_param_is_enum:
 	case fs_param_is_string:
-		if (param->type != fs_value_is_string)
-			goto bad_value;
-		if (!result->has_value) {
+		if (param->type == fs_value_is_flag) {
 			if (p->flags & fs_param_v_optional)
 				goto okay;
-			goto bad_value;
+
+			return invalf(fc, "%s: Missing value for '%s'",
+				      desc->name, param->key);
 		}
+		if (param->type != fs_value_is_string)
+			goto bad_value;
 		/* Fall through */
 	default:
 		break;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 03/12] vfs: verify param type in vfs_parse_sb_flag()
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 01/12] aio: fix async fsync creds Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 02/12] fs_parse: fix fs_param_v_optional handling Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 04/12] uapi: deprecate STATX_ALL Miklos Szeredi
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

vfs_parse_sb_flag() accepted any kind of param with a matching key, not
just a flag.  This is wrong, only allow flag type and return -EINVAL
otherwise.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/fs_context.c | 31 +++++++++++++++----------------
 1 file changed, 15 insertions(+), 16 deletions(-)

diff --git a/fs/fs_context.c b/fs/fs_context.c
index 138b5b4d621d..66fd7d753e91 100644
--- a/fs/fs_context.c
+++ b/fs/fs_context.c
@@ -81,30 +81,29 @@ static const char *const forbidden_sb_flag[] = {
 /*
  * Check for a common mount option that manipulates s_flags.
  */
-static int vfs_parse_sb_flag(struct fs_context *fc, const char *key)
+static int vfs_parse_sb_flag(struct fs_context *fc, struct fs_parameter *param)
 {
-	unsigned int token;
+	const char *key = param->key;
+	unsigned int set, clear;
 	unsigned int i;
 
 	for (i = 0; i < ARRAY_SIZE(forbidden_sb_flag); i++)
 		if (strcmp(key, forbidden_sb_flag[i]) == 0)
 			return -EINVAL;
 
-	token = lookup_constant(common_set_sb_flag, key, 0);
-	if (token) {
-		fc->sb_flags |= token;
-		fc->sb_flags_mask |= token;
-		return 0;
-	}
+	set = lookup_constant(common_set_sb_flag, key, 0);
+	clear = lookup_constant(common_clear_sb_flag, key, 0);
+	if (!set && !clear)
+		return -ENOPARAM;
 
-	token = lookup_constant(common_clear_sb_flag, key, 0);
-	if (token) {
-		fc->sb_flags &= ~token;
-		fc->sb_flags_mask |= token;
-		return 0;
-	}
+	if (param->type != fs_value_is_flag)
+		return invalf(fc, "%s: Unexpected value for '%s'",
+			      fc->fs_type->name, param->key);
 
-	return -ENOPARAM;
+	fc->sb_flags |= set;
+	fc->sb_flags &= ~clear;
+	fc->sb_flags_mask |= set | clear;
+	return 0;
 }
 
 /**
@@ -130,7 +129,7 @@ int vfs_parse_fs_param(struct fs_context *fc, struct fs_parameter *param)
 	if (!param->key)
 		return invalf(fc, "Unnamed parameter\n");
 
-	ret = vfs_parse_sb_flag(fc, param->key);
+	ret = vfs_parse_sb_flag(fc, param);
 	if (ret != -ENOPARAM)
 		return ret;
 
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 04/12] uapi: deprecate STATX_ALL
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (2 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 03/12] vfs: verify param type in vfs_parse_sb_flag() Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 05/12] statx: don't clear STATX_ATIME on SB_RDONLY Miklos Szeredi
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, David Howells, Michael Kerrisk

Constants of the *_ALL type can be actively harmful due to the fact that
developers will usually fail to consider the possible effects of future
changes to the definition.

Deprecate STATX_ALL in the uapi, while no damage has been done yet.

We could keep something like this around in the kernel, but there's
actually no point, since all filesystems should be explicitly checking
flags that they support and not rely on the VFS masking unknown ones out: a
flag could be known to the VFS, yet not known to the filesystem.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Michael Kerrisk <mtk.manpages@gmail.com>
---
 fs/stat.c                       |  1 -
 include/uapi/linux/stat.h       | 11 ++++++++++-
 samples/vfs/test-statx.c        |  2 +-
 tools/include/uapi/linux/stat.h | 11 ++++++++++-
 4 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/fs/stat.c b/fs/stat.c
index c38e4c2e1221..7899d15722a0 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -68,7 +68,6 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
 
 	memset(stat, 0, sizeof(*stat));
 	stat->result_mask |= STATX_BASIC_STATS;
-	request_mask &= STATX_ALL;
 	query_flags &= KSTAT_QUERY_FLAGS;
 
 	/* allow the fs to override these if it really wants to */
diff --git a/include/uapi/linux/stat.h b/include/uapi/linux/stat.h
index 7b35e98d3c58..ed456ac0f90d 100644
--- a/include/uapi/linux/stat.h
+++ b/include/uapi/linux/stat.h
@@ -148,9 +148,18 @@ struct statx {
 #define STATX_BLOCKS		0x00000400U	/* Want/got stx_blocks */
 #define STATX_BASIC_STATS	0x000007ffU	/* The stuff in the normal stat struct */
 #define STATX_BTIME		0x00000800U	/* Want/got stx_btime */
-#define STATX_ALL		0x00000fffU	/* All currently supported flags */
+
 #define STATX__RESERVED		0x80000000U	/* Reserved for future struct statx expansion */
 
+#ifndef __KERNEL__
+/*
+ * This is deprecated, and shall remain the same value in the future.  To avoid
+ * confusion please use the equivalent (STATX_BASIC_STATS | STATX_BTIME)
+ * instead.
+ */
+#define STATX_ALL		0x00000fffU
+#endif
+
 /*
  * Attributes to be found in stx_attributes and masked in stx_attributes_mask.
  *
diff --git a/samples/vfs/test-statx.c b/samples/vfs/test-statx.c
index a3d68159fb51..76c577ea4fd8 100644
--- a/samples/vfs/test-statx.c
+++ b/samples/vfs/test-statx.c
@@ -216,7 +216,7 @@ int main(int argc, char **argv)
 	struct statx stx;
 	int ret, raw = 0, atflag = AT_SYMLINK_NOFOLLOW;
 
-	unsigned int mask = STATX_ALL;
+	unsigned int mask = STATX_BASIC_STATS | STATX_BTIME;
 
 	for (argv++; *argv; argv++) {
 		if (strcmp(*argv, "-F") == 0) {
diff --git a/tools/include/uapi/linux/stat.h b/tools/include/uapi/linux/stat.h
index 7b35e98d3c58..ed456ac0f90d 100644
--- a/tools/include/uapi/linux/stat.h
+++ b/tools/include/uapi/linux/stat.h
@@ -148,9 +148,18 @@ struct statx {
 #define STATX_BLOCKS		0x00000400U	/* Want/got stx_blocks */
 #define STATX_BASIC_STATS	0x000007ffU	/* The stuff in the normal stat struct */
 #define STATX_BTIME		0x00000800U	/* Want/got stx_btime */
-#define STATX_ALL		0x00000fffU	/* All currently supported flags */
+
 #define STATX__RESERVED		0x80000000U	/* Reserved for future struct statx expansion */
 
+#ifndef __KERNEL__
+/*
+ * This is deprecated, and shall remain the same value in the future.  To avoid
+ * confusion please use the equivalent (STATX_BASIC_STATS | STATX_BTIME)
+ * instead.
+ */
+#define STATX_ALL		0x00000fffU
+#endif
+
 /*
  * Attributes to be found in stx_attributes and masked in stx_attributes_mask.
  *
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 05/12] statx: don't clear STATX_ATIME on SB_RDONLY
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (3 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 04/12] uapi: deprecate STATX_ALL Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 06/12] utimensat: AT_EMPTY_PATH support Miklos Szeredi
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel, David Howells

IS_NOATIME(inode) is defined as __IS_FLG(inode, SB_RDONLY|SB_NOATIME), so
generic_fillattr() will clear STATX_ATIME from the result_mask if the super
block is marked read only.

This was probably not the intention, so fix to only clear STATX_ATIME if
the fs doesn't support atime at all.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
Acked-by: David Howells <dhowells@redhat.com>
---
 fs/stat.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/stat.c b/fs/stat.c
index 7899d15722a0..fc49f705a83c 100644
--- a/fs/stat.c
+++ b/fs/stat.c
@@ -71,7 +71,8 @@ int vfs_getattr_nosec(const struct path *path, struct kstat *stat,
 	query_flags &= KSTAT_QUERY_FLAGS;
 
 	/* allow the fs to override these if it really wants to */
-	if (IS_NOATIME(inode))
+	/* SB_NOATIME means filesystem supplies dummy atime value */
+	if (inode->i_sb->s_flags & SB_NOATIME)
 		stat->result_mask &= ~STATX_ATIME;
 	if (IS_AUTOMOUNT(inode))
 		stat->attributes |= STATX_ATTR_AUTOMOUNT;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 06/12] utimensat: AT_EMPTY_PATH support
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (4 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 05/12] statx: don't clear STATX_ATIME on SB_RDONLY Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 07/12] f*xattr: allow O_PATH descriptors Miklos Szeredi
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

This makes it possible to use utimensat on an O_PATH file (including
symlinks).

It supersedes the nonstandard utimensat(fd, NULL, ...) form.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/utimes.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/utimes.c b/fs/utimes.c
index 1ba3f7883870..c07cb0dddbb6 100644
--- a/fs/utimes.c
+++ b/fs/utimes.c
@@ -95,13 +95,13 @@ long do_utimes(int dfd, const char __user *filename, struct timespec64 *times,
 		goto out;
 	}
 
-	if (flags & ~AT_SYMLINK_NOFOLLOW)
+	if (flags & ~(AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH))
 		goto out;
 
 	if (filename == NULL && dfd != AT_FDCWD) {
 		struct fd f;
 
-		if (flags & AT_SYMLINK_NOFOLLOW)
+		if (flags)
 			goto out;
 
 		f = fdget(dfd);
@@ -117,6 +117,8 @@ long do_utimes(int dfd, const char __user *filename, struct timespec64 *times,
 
 		if (!(flags & AT_SYMLINK_NOFOLLOW))
 			lookup_flags |= LOOKUP_FOLLOW;
+		if (flags & AT_EMPTY_PATH)
+			lookup_flags |= LOOKUP_EMPTY;
 retry:
 		error = user_path_at(dfd, filename, lookup_flags, &path);
 		if (error)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 07/12] f*xattr: allow O_PATH descriptors
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (5 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 06/12] utimensat: AT_EMPTY_PATH support Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 08/12] vfs: allow unprivileged whiteout creation Miklos Szeredi
                   ` (5 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

This allows xattr ops on symlink/special files referenced by an O_PATH
descriptor without having to play games with /proc/self/fd/NN (which
doesn't work for symlinks anyway).

This capability is the same as would be given by introducing ...at()
variants with an AT_EMPTY_PATH argument.  Looking at getattr/setattr type
syscalls, this is allowed for fstatat() and fchownat(), but not for
fchmodat() and utimensat().  What's the logic?

While this carries a minute risk of someone relying on the property of
xattr syscalls rejecting O_PATH descriptors, it saves the trouble of
introducing another set of syscalls.

Only file->f_path and file->f_inode are accessed in these functions.

Current versions return EBADF, hence easy to detect the presense of this
feature and fall back in case it's missing.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/xattr.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/xattr.c b/fs/xattr.c
index 90dd78f0eb27..fd1335b86e60 100644
--- a/fs/xattr.c
+++ b/fs/xattr.c
@@ -495,7 +495,7 @@ SYSCALL_DEFINE5(lsetxattr, const char __user *, pathname,
 SYSCALL_DEFINE5(fsetxattr, int, fd, const char __user *, name,
 		const void __user *,value, size_t, size, int, flags)
 {
-	struct fd f = fdget(fd);
+	struct fd f = fdget_raw(fd);
 	int error = -EBADF;
 
 	if (!f.file)
@@ -587,7 +587,7 @@ SYSCALL_DEFINE4(lgetxattr, const char __user *, pathname,
 SYSCALL_DEFINE4(fgetxattr, int, fd, const char __user *, name,
 		void __user *, value, size_t, size)
 {
-	struct fd f = fdget(fd);
+	struct fd f = fdget_raw(fd);
 	ssize_t error = -EBADF;
 
 	if (!f.file)
@@ -662,7 +662,7 @@ SYSCALL_DEFINE3(llistxattr, const char __user *, pathname, char __user *, list,
 
 SYSCALL_DEFINE3(flistxattr, int, fd, char __user *, list, size_t, size)
 {
-	struct fd f = fdget(fd);
+	struct fd f = fdget_raw(fd);
 	ssize_t error = -EBADF;
 
 	if (!f.file)
@@ -727,7 +727,7 @@ SYSCALL_DEFINE2(lremovexattr, const char __user *, pathname,
 
 SYSCALL_DEFINE2(fremovexattr, int, fd, const char __user *, name)
 {
-	struct fd f = fdget(fd);
+	struct fd f = fdget_raw(fd);
 	int error = -EBADF;
 
 	if (!f.file)
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 08/12] vfs: allow unprivileged whiteout creation
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (6 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 07/12] f*xattr: allow O_PATH descriptors Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-12-17  3:51   ` Al Viro
  2019-11-28 15:59 ` [PATCH 09/12] fs_parser: "string" with missing value is a "flag" Miklos Szeredi
                   ` (4 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

Whiteouts are special, but unlike real device nodes they should not require
privileges to create.

The 0 char device number should already be reserved, but make this explicit
in cdev_add() to be on the safe side.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/char_dev.c                 |  3 +++
 fs/namei.c                    | 17 ++++-------------
 include/linux/device_cgroup.h |  3 +++
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/fs/char_dev.c b/fs/char_dev.c
index 00dfe17871ac..8bf66f40e5e0 100644
--- a/fs/char_dev.c
+++ b/fs/char_dev.c
@@ -483,6 +483,9 @@ int cdev_add(struct cdev *p, dev_t dev, unsigned count)
 	p->dev = dev;
 	p->count = count;
 
+	if (WARN_ON(dev == WHITEOUT_DEV))
+		return -EBUSY;
+
 	error = kobj_map(cdev_map, dev, count, NULL,
 			 exact_match, exact_lock, p);
 	if (error)
diff --git a/fs/namei.c b/fs/namei.c
index 671c3c1a3425..05ca98595b62 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3687,12 +3687,14 @@ EXPORT_SYMBOL(user_path_create);
 
 int vfs_mknod(struct inode *dir, struct dentry *dentry, umode_t mode, dev_t dev)
 {
+	bool is_whiteout = S_ISCHR(mode) && dev == WHITEOUT_DEV;
 	int error = may_create(dir, dentry);
 
 	if (error)
 		return error;
 
-	if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD))
+	if ((S_ISCHR(mode) || S_ISBLK(mode)) && !capable(CAP_MKNOD) &&
+	    !is_whiteout)
 		return -EPERM;
 
 	if (!dir->i_op->mknod)
@@ -4527,9 +4529,6 @@ static int do_renameat2(int olddfd, const char __user *oldname, int newdfd,
 	    (flags & RENAME_EXCHANGE))
 		return -EINVAL;
 
-	if ((flags & RENAME_WHITEOUT) && !capable(CAP_MKNOD))
-		return -EPERM;
-
 	if (flags & RENAME_EXCHANGE)
 		target_flags = 0;
 
@@ -4667,15 +4666,7 @@ SYSCALL_DEFINE2(rename, const char __user *, oldname, const char __user *, newna
 
 int vfs_whiteout(struct inode *dir, struct dentry *dentry)
 {
-	int error = may_create(dir, dentry);
-	if (error)
-		return error;
-
-	if (!dir->i_op->mknod)
-		return -EPERM;
-
-	return dir->i_op->mknod(dir, dentry,
-				S_IFCHR | WHITEOUT_MODE, WHITEOUT_DEV);
+	return vfs_mknod(dir, dentry, S_IFCHR | WHITEOUT_MODE, WHITEOUT_DEV);
 }
 EXPORT_SYMBOL(vfs_whiteout);
 
diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
index 8557efe096dc..fc989487c273 100644
--- a/include/linux/device_cgroup.h
+++ b/include/linux/device_cgroup.h
@@ -62,6 +62,9 @@ static inline int devcgroup_inode_mknod(int mode, dev_t dev)
 	if (!S_ISBLK(mode) && !S_ISCHR(mode))
 		return 0;
 
+	if (S_ISCHR(mode) && dev == WHITEOUT_DEV)
+		return 0;
+
 	if (S_ISBLK(mode))
 		type = DEVCG_DEV_BLOCK;
 	else
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 09/12] fs_parser: "string" with missing value is a "flag"
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (7 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 08/12] vfs: allow unprivileged whiteout creation Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-12-17 17:32   ` Al Viro
  2019-11-28 15:59 ` [PATCH 10/12] vfs: don't parse forbidden flags Miklos Szeredi
                   ` (3 subsequent siblings)
  12 siblings, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

There's no such thing as a NULL string value, the fsconfig(2) syscall
rejects that outright.

So get rid of that concept from the implementation.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/fs_context.c           | 2 +-
 fs/fs_parser.c            | 9 ++-------
 include/linux/fs_parser.h | 1 -
 3 files changed, 3 insertions(+), 9 deletions(-)

diff --git a/fs/fs_context.c b/fs/fs_context.c
index 66fd7d753e91..7c4216156950 100644
--- a/fs/fs_context.c
+++ b/fs/fs_context.c
@@ -174,7 +174,7 @@ int vfs_parse_fs_string(struct fs_context *fc, const char *key,
 
 	struct fs_parameter param = {
 		.key	= key,
-		.type	= fs_value_is_string,
+		.type	= v_size ? fs_value_is_string : fs_value_is_flag,
 		.size	= v_size,
 	};
 
diff --git a/fs/fs_parser.c b/fs/fs_parser.c
index 5d8833d71b37..70f95a71f5aa 100644
--- a/fs/fs_parser.c
+++ b/fs/fs_parser.c
@@ -85,7 +85,6 @@ int fs_parse(struct fs_context *fc,
 	const struct fs_parameter_enum *e;
 	int ret = -ENOPARAM, b;
 
-	result->has_value = !!param->string;
 	result->negated = false;
 	result->uint_64 = 0;
 
@@ -95,7 +94,7 @@ int fs_parse(struct fs_context *fc,
 		 * "xxx" takes the "no"-form negative - but only if there
 		 * wasn't an value.
 		 */
-		if (result->has_value)
+		if (param->type != fs_value_is_flag)
 			goto unknown_parameter;
 		if (param->key[0] != 'n' || param->key[1] != 'o' || !param->key[2])
 			goto unknown_parameter;
@@ -146,8 +145,7 @@ int fs_parse(struct fs_context *fc,
 	 */
 	switch (p->type) {
 	case fs_param_is_flag:
-		if (param->type != fs_value_is_flag &&
-		    (param->type != fs_value_is_string || result->has_value))
+		if (param->type != fs_value_is_flag)
 			return invalf(fc, "%s: Unexpected value for '%s'",
 				      desc->name, param->key);
 		result->boolean = true;
@@ -208,9 +206,6 @@ int fs_parse(struct fs_context *fc,
 	case fs_param_is_fd: {
 		switch (param->type) {
 		case fs_value_is_string:
-			if (!result->has_value)
-				goto bad_value;
-
 			ret = kstrtouint(param->string, 0, &result->uint_32);
 			break;
 		case fs_value_is_file:
diff --git a/include/linux/fs_parser.h b/include/linux/fs_parser.h
index dee140db6240..45323203128b 100644
--- a/include/linux/fs_parser.h
+++ b/include/linux/fs_parser.h
@@ -72,7 +72,6 @@ struct fs_parameter_description {
  */
 struct fs_parse_result {
 	bool			negated;	/* T if param was "noxxx" */
-	bool			has_value;	/* T if value supplied to param */
 	union {
 		bool		boolean;	/* For spec_bool */
 		int		int_32;		/* For spec_s32/spec_enum */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 10/12] vfs: don't parse forbidden flags
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (8 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 09/12] fs_parser: "string" with missing value is a "flag" Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-11-28 15:59 ` [PATCH 11/12] vfs: don't parse "posixacl" option Miklos Szeredi
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

Makes little sense to keep this blacklist synced with what mount(8) parses
and what it doesn't.  E.g. it has various forms of "*atime" options, but
not "atime"...

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/fs_context.c | 28 ----------------------------
 1 file changed, 28 deletions(-)

diff --git a/fs/fs_context.c b/fs/fs_context.c
index 7c4216156950..394a05bc03d5 100644
--- a/fs/fs_context.c
+++ b/fs/fs_context.c
@@ -55,29 +55,6 @@ static const struct constant_table common_clear_sb_flag[] = {
 	{ "silent",	SB_SILENT },
 };
 
-static const char *const forbidden_sb_flag[] = {
-	"bind",
-	"dev",
-	"exec",
-	"move",
-	"noatime",
-	"nodev",
-	"nodiratime",
-	"noexec",
-	"norelatime",
-	"nostrictatime",
-	"nosuid",
-	"private",
-	"rec",
-	"relatime",
-	"remount",
-	"shared",
-	"slave",
-	"strictatime",
-	"suid",
-	"unbindable",
-};
-
 /*
  * Check for a common mount option that manipulates s_flags.
  */
@@ -85,11 +62,6 @@ static int vfs_parse_sb_flag(struct fs_context *fc, struct fs_parameter *param)
 {
 	const char *key = param->key;
 	unsigned int set, clear;
-	unsigned int i;
-
-	for (i = 0; i < ARRAY_SIZE(forbidden_sb_flag); i++)
-		if (strcmp(key, forbidden_sb_flag[i]) == 0)
-			return -EINVAL;
 
 	set = lookup_constant(common_set_sb_flag, key, 0);
 	clear = lookup_constant(common_clear_sb_flag, key, 0);
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 11/12] vfs: don't parse "posixacl" option
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (9 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 10/12] vfs: don't parse forbidden flags Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-12-17  3:42   ` Al Viro
  2019-11-28 15:59 ` [PATCH 12/12] vfs: don't parse "silent" option Miklos Szeredi
  2019-12-13  9:33 ` [PATCH 00/12] various vfs patches Miklos Szeredi
  12 siblings, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

Unlike the others, this is _not_ a standard option accepted by mount(8).

In fact SB_POSIXACL is an internal flag, and accepting MS_POSIXACL on the
mount(2) interface is possibly a bug.

The only filesystem that apparently wants to handle the "posixacl" option
is 9p, but it has special handling of that option besides setting
SB_POSIXACL.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/fs_context.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/fs/fs_context.c b/fs/fs_context.c
index 394a05bc03d5..738f59b6c06a 100644
--- a/fs/fs_context.c
+++ b/fs/fs_context.c
@@ -42,7 +42,6 @@ static const struct constant_table common_set_sb_flag[] = {
 	{ "dirsync",	SB_DIRSYNC },
 	{ "lazytime",	SB_LAZYTIME },
 	{ "mand",	SB_MANDLOCK },
-	{ "posixacl",	SB_POSIXACL },
 	{ "ro",		SB_RDONLY },
 	{ "sync",	SB_SYNCHRONOUS },
 };
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* [PATCH 12/12] vfs: don't parse "silent" option
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (10 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 11/12] vfs: don't parse "posixacl" option Miklos Szeredi
@ 2019-11-28 15:59 ` Miklos Szeredi
  2019-12-17  3:37   ` Al Viro
  2019-12-13  9:33 ` [PATCH 00/12] various vfs patches Miklos Szeredi
  12 siblings, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-28 15:59 UTC (permalink / raw)
  To: Al Viro; +Cc: linux-fsdevel

While this is a standard option as documented in mount(8), it is ignored by
most filesystems.  So reject, unless filesystem explicitly wants to handle
it.

The exception is unconverted filesystems, where it is unknown if the
filesystem handles this or not.

Any implementation, such as mount(8), that needs to parse this option
without failing can simply ignore the return value from fsconfig().

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/fs_context.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/fs_context.c b/fs/fs_context.c
index 738f59b6c06a..b37ce07ee230 100644
--- a/fs/fs_context.c
+++ b/fs/fs_context.c
@@ -51,7 +51,6 @@ static const struct constant_table common_clear_sb_flag[] = {
 	{ "nolazytime",	SB_LAZYTIME },
 	{ "nomand",	SB_MANDLOCK },
 	{ "rw",		SB_RDONLY },
-	{ "silent",	SB_SILENT },
 };
 
 /*
@@ -530,6 +529,15 @@ static int legacy_parse_param(struct fs_context *fc, struct fs_parameter *param)
 	unsigned int size = ctx->data_size;
 	size_t len = 0;
 
+	if (strcmp(param->key, "silent") == 0) {
+		if (param->type != fs_value_is_flag)
+			return invalf(fc, "%s: Unexpected value for '%s'",
+				      fc->fs_type->name, param->key);
+
+		fc->sb_flags |= SB_SILENT;
+		return 0;
+	}
+
 	if (strcmp(param->key, "source") == 0) {
 		if (param->type != fs_value_is_string)
 			return invalf(fc, "VFS: Legacy: Non-string source");
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/12] fs_parse: fix fs_param_v_optional handling
  2019-11-28 15:59 ` [PATCH 02/12] fs_parse: fix fs_param_v_optional handling Miklos Szeredi
@ 2019-11-29 11:31   ` Andrew Price
  2019-11-29 14:43     ` Miklos Szeredi
  2019-12-16 23:28   ` Al Viro
  1 sibling, 1 reply; 37+ messages in thread
From: Andrew Price @ 2019-11-29 11:31 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-fsdevel, David Howells, stable

On 28/11/2019 15:59, Miklos Szeredi wrote:
> String options always have parameters, hence the check for optional
> parameter will never trigger.
> 
> Check for param type being a flag first (flag is the only type that does
> not have a parameter) and report "Missing value" if the parameter is
> mandatory.
> 
> Tested with gfs2's "quota" option, which is currently the only user of
> fs_param_v_optional.

It's not clear to me what the bug is here. My tests with the quota 
option are giving expected results. Perhaps I missed a case?

Andy

> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> Cc: Andrew Price <anprice@redhat.com>
> Cc: David Howells <dhowells@redhat.com>
> Fixes: 31d921c7fb96 ("vfs: Add configuration parser helpers")
> Cc: <stable@vger.kernel.org> # v5.4
> ---
>   fs/fs_parser.c | 10 ++++++----
>   1 file changed, 6 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/fs_parser.c b/fs/fs_parser.c
> index d1930adce68d..5d8833d71b37 100644
> --- a/fs/fs_parser.c
> +++ b/fs/fs_parser.c
> @@ -127,13 +127,15 @@ int fs_parse(struct fs_context *fc,
>   	case fs_param_is_u64:
>   	case fs_param_is_enum:
>   	case fs_param_is_string:
> -		if (param->type != fs_value_is_string)
> -			goto bad_value;
> -		if (!result->has_value) {
> +		if (param->type == fs_value_is_flag) {
>   			if (p->flags & fs_param_v_optional)
>   				goto okay;
> -			goto bad_value;
> +
> +			return invalf(fc, "%s: Missing value for '%s'",
> +				      desc->name, param->key);
>   		}
> +		if (param->type != fs_value_is_string)
> +			goto bad_value;
>   		/* Fall through */
>   	default:
>   		break;
> 


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/12] fs_parse: fix fs_param_v_optional handling
  2019-11-29 11:31   ` Andrew Price
@ 2019-11-29 14:43     ` Miklos Szeredi
  2019-11-29 15:56       ` Andrew Price
  0 siblings, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-11-29 14:43 UTC (permalink / raw)
  To: Andrew Price; +Cc: Al Viro, linux-fsdevel, David Howells, stable

On Fri, Nov 29, 2019 at 12:31 PM Andrew Price <anprice@redhat.com> wrote:
>
> On 28/11/2019 15:59, Miklos Szeredi wrote:
> > String options always have parameters, hence the check for optional
> > parameter will never trigger.
> >
> > Check for param type being a flag first (flag is the only type that does
> > not have a parameter) and report "Missing value" if the parameter is
> > mandatory.
> >
> > Tested with gfs2's "quota" option, which is currently the only user of
> > fs_param_v_optional.
>
> It's not clear to me what the bug is here. My tests with the quota
> option are giving expected results. Perhaps I missed a case?

fsopen-test-2: fsconfig(3, FSCONFIG_SET_FLAG, "quota", NULL, 0):
Invalid argument
fsopen-test-2: context log: <e gfs2: Bad value for 'quota'>

kernel: 5.4.0-08836-g81b6b96475ac

Thanks,
Miklos


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/12] fs_parse: fix fs_param_v_optional handling
  2019-11-29 14:43     ` Miklos Szeredi
@ 2019-11-29 15:56       ` Andrew Price
  0 siblings, 0 replies; 37+ messages in thread
From: Andrew Price @ 2019-11-29 15:56 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-fsdevel, David Howells, stable

On 29/11/2019 14:43, Miklos Szeredi wrote:
> On Fri, Nov 29, 2019 at 12:31 PM Andrew Price <anprice@redhat.com> wrote:
>>
>> On 28/11/2019 15:59, Miklos Szeredi wrote:
>>> String options always have parameters, hence the check for optional
>>> parameter will never trigger.
>>>
>>> Check for param type being a flag first (flag is the only type that does
>>> not have a parameter) and report "Missing value" if the parameter is
>>> mandatory.
>>>
>>> Tested with gfs2's "quota" option, which is currently the only user of
>>> fs_param_v_optional.
>>
>> It's not clear to me what the bug is here. My tests with the quota
>> option are giving expected results. Perhaps I missed a case?
> 
> fsopen-test-2: fsconfig(3, FSCONFIG_SET_FLAG, "quota", NULL, 0):
> Invalid argument
> fsopen-test-2: context log: <e gfs2: Bad value for 'quota'>
> 
> kernel: 5.4.0-08836-g81b6b96475ac

Ah right, gotcha. My tests were relying on the same codepaths being used 
from the legacy/monolithic parsing code.

Reviewed-by: Andrew Price <anprice@redhat.com>

Andy


^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/12] aio: fix async fsync creds
  2019-11-28 15:59 ` [PATCH 01/12] aio: fix async fsync creds Miklos Szeredi
@ 2019-12-13  9:32   ` Miklos Szeredi
  2020-05-04  8:05   ` Avi Kivity
  1 sibling, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-12-13  9:32 UTC (permalink / raw)
  To: Miklos Szeredi
  Cc: Al Viro, linux-fsdevel, Avi Kivity, Giuseppe Scrivano, stable

Hi Al,

Could you please review/apply this patch?

Thanks,
Miklos

On Thu, Nov 28, 2019 at 4:59 PM Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> Avi Kivity reports that on fuse filesystems running in a user namespace
> asyncronous fsync fails with EOVERFLOW.
>
> The reason is that f_ops->fsync() is called with the creds of the kthread
> performing aio work instead of the creds of the process originally
> submitting IOCB_CMD_FSYNC.
>
> Fuse sends the creds of the caller in the request header and it needs to
> translate the uid and gid into the server's user namespace.  Since the
> kthread is running in init_user_ns, the translation will fail and the
> operation returns an error.
>
> It can be argued that fsync doesn't actually need any creds, but just
> zeroing out those fields in the header (as with requests that currently
> don't take creds) is a backward compatibility risk.
>
> Instead of working around this issue in fuse, solve the core of the problem
> by calling the filesystem with the proper creds.
>
> Reported-by: Avi Kivity <avi@scylladb.com>
> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
> Fixes: c9582eb0ff7d ("fuse: Fail all requests with invalid uids or gids")
> Cc: stable@vger.kernel.org  # 4.18+
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/aio.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 0d9a559d488c..37828773e2fe 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -176,6 +176,7 @@ struct fsync_iocb {
>         struct file             *file;
>         struct work_struct      work;
>         bool                    datasync;
> +       struct cred             *creds;
>  };
>
>  struct poll_iocb {
> @@ -1589,8 +1590,11 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb,
>  static void aio_fsync_work(struct work_struct *work)
>  {
>         struct aio_kiocb *iocb = container_of(work, struct aio_kiocb, fsync.work);
> +       const struct cred *old_cred = override_creds(iocb->fsync.creds);
>
>         iocb->ki_res.res = vfs_fsync(iocb->fsync.file, iocb->fsync.datasync);
> +       revert_creds(old_cred);
> +       put_cred(iocb->fsync.creds);
>         iocb_put(iocb);
>  }
>
> @@ -1604,6 +1608,10 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
>         if (unlikely(!req->file->f_op->fsync))
>                 return -EINVAL;
>
> +       req->creds = prepare_creds();
> +       if (!req->creds)
> +               return -ENOMEM;
> +
>         req->datasync = datasync;
>         INIT_WORK(&req->work, aio_fsync_work);
>         schedule_work(&req->work);
> --
> 2.21.0
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/12] various vfs patches
  2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
                   ` (11 preceding siblings ...)
  2019-11-28 15:59 ` [PATCH 12/12] vfs: don't parse "silent" option Miklos Szeredi
@ 2019-12-13  9:33 ` Miklos Szeredi
  2019-12-16 23:13   ` Al Viro
  12 siblings, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-12-13  9:33 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Al Viro, linux-fsdevel

Hi Al,

Could you please review/apply these patches?

Thanks,
Miklos

On Thu, Nov 28, 2019 at 4:59 PM Miklos Szeredi <mszeredi@redhat.com> wrote:
>
> Hi Al,
>
> This is a dump of my current vfs patch queue, all have been posted in one
> form or another.
>
> Also available as a git branch:
>
>   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git#for-viro
>
> Please apply.
>
> Thanks,
> Miklos
> ---
>
> Miklos Szeredi (12):
>   aio: fix async fsync creds
>   fs_parse: fix fs_param_v_optional handling
>   vfs: verify param type in vfs_parse_sb_flag()
>   uapi: deprecate STATX_ALL
>   statx: don't clear STATX_ATIME on SB_RDONLY
>   utimensat: AT_EMPTY_PATH support
>   f*xattr: allow O_PATH descriptors
>   vfs: allow unprivileged whiteout creation
>   fs_parser: "string" with missing value is a "flag"
>   vfs: don't parse forbidden flags
>   vfs: don't parse "posixacl" option
>   vfs: don't parse "silent" option
>
>  fs/aio.c                        |  8 ++++
>  fs/char_dev.c                   |  3 ++
>  fs/fs_context.c                 | 72 ++++++++++++---------------------
>  fs/fs_parser.c                  | 19 ++++-----
>  fs/namei.c                      | 17 ++------
>  fs/stat.c                       |  4 +-
>  fs/utimes.c                     |  6 ++-
>  fs/xattr.c                      |  8 ++--
>  include/linux/device_cgroup.h   |  3 ++
>  include/linux/fs_parser.h       |  1 -
>  include/uapi/linux/stat.h       | 11 ++++-
>  samples/vfs/test-statx.c        |  2 +-
>  tools/include/uapi/linux/stat.h | 11 ++++-
>  13 files changed, 82 insertions(+), 83 deletions(-)
>
> --
> 2.21.0
>

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 00/12] various vfs patches
  2019-12-13  9:33 ` [PATCH 00/12] various vfs patches Miklos Szeredi
@ 2019-12-16 23:13   ` Al Viro
  0 siblings, 0 replies; 37+ messages in thread
From: Al Viro @ 2019-12-16 23:13 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Miklos Szeredi, linux-fsdevel

On Fri, Dec 13, 2019 at 10:33:25AM +0100, Miklos Szeredi wrote:
> Hi Al,
> 
> Could you please review/apply these patches?

Looking through those right now...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/12] fs_parse: fix fs_param_v_optional handling
  2019-11-28 15:59 ` [PATCH 02/12] fs_parse: fix fs_param_v_optional handling Miklos Szeredi
  2019-11-29 11:31   ` Andrew Price
@ 2019-12-16 23:28   ` Al Viro
  2019-12-17  1:18     ` Al Viro
  1 sibling, 1 reply; 37+ messages in thread
From: Al Viro @ 2019-12-16 23:28 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, Andrew Price, David Howells, stable

On Thu, Nov 28, 2019 at 04:59:30PM +0100, Miklos Szeredi wrote:
> String options always have parameters, hence the check for optional
> parameter will never trigger.

What do you mean, always have parameters?  Granted, for fsconfig(2) it's
(currently) true, but I see at least two other pathways that do not impose
such requirement - vfs_parse_fs_string() and rbd_parse_options().

You seem to deal with the former later in the patchset, but I don't see
anything for the latter...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/12] fs_parse: fix fs_param_v_optional handling
  2019-12-16 23:28   ` Al Viro
@ 2019-12-17  1:18     ` Al Viro
  2019-12-17  3:27       ` Al Viro
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2019-12-17  1:18 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, Andrew Price, David Howells, stable

On Mon, Dec 16, 2019 at 11:28:45PM +0000, Al Viro wrote:
> On Thu, Nov 28, 2019 at 04:59:30PM +0100, Miklos Szeredi wrote:
> > String options always have parameters, hence the check for optional
> > parameter will never trigger.
> 
> What do you mean, always have parameters?  Granted, for fsconfig(2) it's
> (currently) true, but I see at least two other pathways that do not impose
> such requirement - vfs_parse_fs_string() and rbd_parse_options().
> 
> You seem to deal with the former later in the patchset, but I don't see
> anything for the latter...

FWIW, I strongly dislike fs_param_v_optional.  I mean, look at the
gfs2 usecase:
	quota			->uint_64 = 0		->negated = false
	quota=off		->uint_32 = 1		->negated = false
	quota=account		->uint_32 = 2		->negated = false
	quota=on		->uint_32 = 3		->negated = false
	noquota			->boolean = false	->negated = true
with gfs2 postprocessing for that thing being
                if (result.negated)
                        args->ar_quota = GFS2_QUOTA_OFF;
                else if (result.int_32 > 0)
                        args->ar_quota = opt_quota_values[result.int_32];
                else   
                        args->ar_quota = GFS2_QUOTA_ON;
                break;
and that relies upon having enum opt_quota members associated with
off/account/on starting from 1.  I mean, WTF?  What we really want is
	quota		GFS2_QUOTA_ON
	quota=on	GFS2_QUOTA_ON
	quota=account	GFS2_QUOTA_ACCOUNT
	quota=off	GFS2_QUOTA_OFF
	noquota		GFS2_QUOTA_OFF

I certainly agree that flag/NULL string is ugly; do we even want to keep
fs_value_is_flag?  It's internal-only, so we can bloody well turn it
into fs_value_is_string and ->string is NULL...  And sure, ->has_value
is redundant - if nothing else, it would make a lot more sense as
static inline bool param_has_value(const struct fs_parameter *param)
{
	return !!param->string;
}
But I really wonder if we should keep breeding kludges.  Look at the
use cases, including the yet-to-be-merged ones.
	1) GFS2: see above
	2) ceph: fsc/nofsc/fsc=...
	3) ext4: init_itable/noinit_itable/init_itable=<number>
	4) nfs: fsc/nofsc/fsc=...

All of that is trivially handled by splitting the opt=... and opt
cases.  We have two such in the tree and two more in posted patchsets.
Plus one more that ext4 patchset breaks, AFAICS (barrier).  Out of
several hundreds.  Everything else either requires = in all cases
or rejects it in all cases.

So how about a flag for "takes no arguments", set automatically by
fsparam_flag()/fsparam_flag_no(), with fs_lookup_key() taking an
extra "comes with argument" flag and filtering according to it?
Rules:
	foo		=> "foo", true
	foo=		=> "foo", false
	foo=bar		=> "foo", false
And to hell with the "optional" flag; for gfs2 we'd end up with
	fsparam_flag_no("quota", Opt_quota_flag),			// quota|noquota
	fsparam_flag_enum("quota", Opt_quota, gfs2_param_quota),	// quota={on|account|off}
Postprocessing won't be any harder, really - we could bloody well do
	case Opt_quota_flag:
		result.int_32 = result.negated ? GFS2_QUOTA_OFF : GFS2_QUOTA_ON;
		/* fallthru */
	case Opt_quota:
		args->ar_quota = result.int_32;
                break;
with gfs2_param_quota having the right values in it, instead of
that intermediate enum.

All ->has_value checks go away that way, AFAICS.  With minimal
impact on yet-to-be-merged series...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 02/12] fs_parse: fix fs_param_v_optional handling
  2019-12-17  1:18     ` Al Viro
@ 2019-12-17  3:27       ` Al Viro
  0 siblings, 0 replies; 37+ messages in thread
From: Al Viro @ 2019-12-17  3:27 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, Andrew Price, David Howells, stable

On Tue, Dec 17, 2019 at 01:18:13AM +0000, Al Viro wrote:

> So how about a flag for "takes no arguments", set automatically by
> fsparam_flag()/fsparam_flag_no(), with fs_lookup_key() taking an
> extra "comes with argument" flag and filtering according to it?
> Rules:
> 	foo		=> "foo", true
> 	foo=		=> "foo", false
> 	foo=bar		=> "foo", false
> And to hell with the "optional" flag; for gfs2 we'd end up with
> 	fsparam_flag_no("quota", Opt_quota_flag),			// quota|noquota
> 	fsparam_flag_enum("quota", Opt_quota, gfs2_param_quota),	// quota={on|account|off}
> Postprocessing won't be any harder, really - we could bloody well do
> 	case Opt_quota_flag:
> 		result.int_32 = result.negated ? GFS2_QUOTA_OFF : GFS2_QUOTA_ON;
> 		/* fallthru */
> 	case Opt_quota:
> 		args->ar_quota = result.int_32;
>                 break;
> with gfs2_param_quota having the right values in it, instead of
> that intermediate enum.
> 
> All ->has_value checks go away that way, AFAICS.  With minimal
> impact on yet-to-be-merged series...

FWIW, we have the following types right now:
fs_param_is_flag	no argument
fs_param_is_bool	no argument or string argument (1 instance in mainline[*])
fs_param_is_u32		string argument [**]
fs_param_is_u32_octal	string argument
fs_param_is_u32_hex	string argument
fs_param_is_s32		string argument
fs_param_is_u64		string argument
fs_param_is_enum	string argument; gfs2 has the quota/noquota/quota=... mess
fs_param_is_string	string argument; ceph (and nfs) have fsc/nofsc/fsc=...
fs_param_is_fd		unused at the moment; eventually - string argument.
fs_param_is_blob	fsconfig-only
fs_param_is_blockdev	fsconfig-only
fs_param_is_path	fsconfig-only

[*] no_disconnect in gadgetfs; buggered, since it used to require a numeric
argument.  Now it quietly treats no_disconnect as no_disconnect=1 and
rejects e.g. no_disconnect=2.  There are two more in ext4, also buggered
as far as I can tell.
[**] ext4 has init_itable/init_itable=%u/noinit_itable

The total over all filesystems with conversions posted or already in mainline:

* no_disconnect [drivers/usb/gadget/function/f_fs.c, broken]
* init_itable/init_itable=%u/noinit_itable [ext4]
* fsc/fsc=%s/nofsc [ceph]
* fsc/fsc=%s/nofsc [nfs]
* quota/quota=on/quota=off/quota=account/noquota [gfs2]
* barrier/barrier=%u/nobarrier [ext4]
* auto_da_alloc/auto_da_alloc=%u/noauto_da_alloc [ext4]

It's rare.  So much that I don't believe that fs_param_v_optional has any
reason to exist.  Let's split those entries and be done with that.

Hmm...  Deciding whether we have an argument-bearing or no-argument case
is somewhat inconvenient for fsconfig-generated calls - check for
NULL param->string is, strictly speaking, in nasal daemon territory
for types other than flag and string...  OK, sold - let's check for
fs_value_is_flag.  So I'm combining your patch with #9/12 (to avoid bisect
hazard) + correction for rbd.  And putting that in the very beginning of
fs_parser reorg series, with elimination of fs_param_v_optional closer
to the end.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] vfs: don't parse "silent" option
  2019-11-28 15:59 ` [PATCH 12/12] vfs: don't parse "silent" option Miklos Szeredi
@ 2019-12-17  3:37   ` Al Viro
  2019-12-17  4:12     ` Miklos Szeredi
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2019-12-17  3:37 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

On Thu, Nov 28, 2019 at 04:59:40PM +0100, Miklos Szeredi wrote:
> While this is a standard option as documented in mount(8), it is ignored by
> most filesystems.  So reject, unless filesystem explicitly wants to handle
> it.
>
> The exception is unconverted filesystems, where it is unknown if the
> filesystem handles this or not.
> 
> Any implementation, such as mount(8), that needs to parse this option
> without failing can simply ignore the return value from fsconfig().

Unless I'm missing something, that will mean that having it in /etc/fstab
for a converted filesystem (xfs, for example) will fail when booting
new kernel with existing /sbin/mount.  Doesn't sound like a good idea...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 11/12] vfs: don't parse "posixacl" option
  2019-11-28 15:59 ` [PATCH 11/12] vfs: don't parse "posixacl" option Miklos Szeredi
@ 2019-12-17  3:42   ` Al Viro
  2019-12-17  4:18     ` Miklos Szeredi
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2019-12-17  3:42 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

On Thu, Nov 28, 2019 at 04:59:39PM +0100, Miklos Szeredi wrote:
> Unlike the others, this is _not_ a standard option accepted by mount(8).
> 
> In fact SB_POSIXACL is an internal flag, and accepting MS_POSIXACL on the
> mount(2) interface is possibly a bug.
> 
> The only filesystem that apparently wants to handle the "posixacl" option
> is 9p, but it has special handling of that option besides setting
> SB_POSIXACL.

Huh?  For e.g. ceph having -o posixacl and -o acl are currently equivalent;
your patch would seem to break that, wouldn't it?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 08/12] vfs: allow unprivileged whiteout creation
  2019-11-28 15:59 ` [PATCH 08/12] vfs: allow unprivileged whiteout creation Miklos Szeredi
@ 2019-12-17  3:51   ` Al Viro
  2019-12-17  4:22     ` Miklos Szeredi
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2019-12-17  3:51 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

On Thu, Nov 28, 2019 at 04:59:36PM +0100, Miklos Szeredi wrote:
> Whiteouts are special, but unlike real device nodes they should not require
> privileges to create.

More detailed analysis, please.  Maybe I'm missing something obvious,
but I don't see off-hand why it's safe.

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] vfs: don't parse "silent" option
  2019-12-17  3:37   ` Al Viro
@ 2019-12-17  4:12     ` Miklos Szeredi
  2019-12-17  4:16       ` Miklos Szeredi
  2019-12-17  4:17       ` Al Viro
  0 siblings, 2 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-12-17  4:12 UTC (permalink / raw)
  To: Al Viro; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 4:37 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Nov 28, 2019 at 04:59:40PM +0100, Miklos Szeredi wrote:
> > While this is a standard option as documented in mount(8), it is ignored by
> > most filesystems.  So reject, unless filesystem explicitly wants to handle
> > it.
> >
> > The exception is unconverted filesystems, where it is unknown if the
> > filesystem handles this or not.
> >
> > Any implementation, such as mount(8), that needs to parse this option
> > without failing can simply ignore the return value from fsconfig().
>
> Unless I'm missing something, that will mean that having it in /etc/fstab
> for a converted filesystem (xfs, for example) will fail when booting
> new kernel with existing /sbin/mount.  Doesn't sound like a good idea...

Nope, the mount(2) case is not changed (see second hunk).

When mount(8) is converted to the new API, it can just handle such
back compat issues (ignore error from fconfig()) in that case.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] vfs: don't parse "silent" option
  2019-12-17  4:12     ` Miklos Szeredi
@ 2019-12-17  4:16       ` Miklos Szeredi
  2019-12-17  4:19         ` Al Viro
  2019-12-17  4:17       ` Al Viro
  1 sibling, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-12-17  4:16 UTC (permalink / raw)
  To: Al Viro; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 5:12 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Tue, Dec 17, 2019 at 4:37 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Thu, Nov 28, 2019 at 04:59:40PM +0100, Miklos Szeredi wrote:
> > > While this is a standard option as documented in mount(8), it is ignored by
> > > most filesystems.  So reject, unless filesystem explicitly wants to handle
> > > it.
> > >
> > > The exception is unconverted filesystems, where it is unknown if the
> > > filesystem handles this or not.
> > >
> > > Any implementation, such as mount(8), that needs to parse this option
> > > without failing can simply ignore the return value from fsconfig().
> >
> > Unless I'm missing something, that will mean that having it in /etc/fstab
> > for a converted filesystem (xfs, for example) will fail when booting
> > new kernel with existing /sbin/mount.  Doesn't sound like a good idea...
>
> Nope, the mount(2) case is not changed (see second hunk).

Wrong, this has nothing to do with mount(2).  The second hunk is about
unconverted filesystems...

When a filesystem that really needs to handle "silent" is converted,
it can handle that option itself.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] vfs: don't parse "silent" option
  2019-12-17  4:12     ` Miklos Szeredi
  2019-12-17  4:16       ` Miklos Szeredi
@ 2019-12-17  4:17       ` Al Viro
  1 sibling, 0 replies; 37+ messages in thread
From: Al Viro @ 2019-12-17  4:17 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 05:12:05AM +0100, Miklos Szeredi wrote:
> On Tue, Dec 17, 2019 at 4:37 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Thu, Nov 28, 2019 at 04:59:40PM +0100, Miklos Szeredi wrote:
> > > While this is a standard option as documented in mount(8), it is ignored by
> > > most filesystems.  So reject, unless filesystem explicitly wants to handle
> > > it.
> > >
> > > The exception is unconverted filesystems, where it is unknown if the
> > > filesystem handles this or not.
> > >
> > > Any implementation, such as mount(8), that needs to parse this option
> > > without failing can simply ignore the return value from fsconfig().
> >
> > Unless I'm missing something, that will mean that having it in /etc/fstab
> > for a converted filesystem (xfs, for example) will fail when booting
> > new kernel with existing /sbin/mount.  Doesn't sound like a good idea...
> 
> Nope, the mount(2) case is not changed (see second hunk).

How would mounting XFS end up calling legacy_parse_param(), rather than
the expected generic_parse_monolithic() -> vfs_parse_fs_string() ->
vfs_parse_fs_param() -> xfs_fc_parse_param()?

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 11/12] vfs: don't parse "posixacl" option
  2019-12-17  3:42   ` Al Viro
@ 2019-12-17  4:18     ` Miklos Szeredi
  2019-12-17  4:28       ` Al Viro
  0 siblings, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-12-17  4:18 UTC (permalink / raw)
  To: Al Viro; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 4:42 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Nov 28, 2019 at 04:59:39PM +0100, Miklos Szeredi wrote:
> > Unlike the others, this is _not_ a standard option accepted by mount(8).
> >
> > In fact SB_POSIXACL is an internal flag, and accepting MS_POSIXACL on the
> > mount(2) interface is possibly a bug.
> >
> > The only filesystem that apparently wants to handle the "posixacl" option
> > is 9p, but it has special handling of that option besides setting
> > SB_POSIXACL.
>
> Huh?  For e.g. ceph having -o posixacl and -o acl are currently equivalent;
> your patch would seem to break that, wouldn't it?

Yet again, this has nothing to do with mount(2) behavior.  Also note
that mount(8) does *not* handle "posixacl" and does *not* ever set
MS_POSIXACL.

So this has exactly zero chance of breaking anything.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] vfs: don't parse "silent" option
  2019-12-17  4:16       ` Miklos Szeredi
@ 2019-12-17  4:19         ` Al Viro
  2019-12-17  4:23           ` Miklos Szeredi
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2019-12-17  4:19 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 05:16:58AM +0100, Miklos Szeredi wrote:
> On Tue, Dec 17, 2019 at 5:12 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Tue, Dec 17, 2019 at 4:37 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Thu, Nov 28, 2019 at 04:59:40PM +0100, Miklos Szeredi wrote:
> > > > While this is a standard option as documented in mount(8), it is ignored by
> > > > most filesystems.  So reject, unless filesystem explicitly wants to handle
> > > > it.
> > > >
> > > > The exception is unconverted filesystems, where it is unknown if the
> > > > filesystem handles this or not.
> > > >
> > > > Any implementation, such as mount(8), that needs to parse this option
> > > > without failing can simply ignore the return value from fsconfig().
> > >
> > > Unless I'm missing something, that will mean that having it in /etc/fstab
> > > for a converted filesystem (xfs, for example) will fail when booting
> > > new kernel with existing /sbin/mount.  Doesn't sound like a good idea...
> >
> > Nope, the mount(2) case is not changed (see second hunk).
> 
> Wrong, this has nothing to do with mount(2).  The second hunk is about
> unconverted filesystems...
> 
> When a filesystem that really needs to handle "silent" is converted,
> it can handle that option itself.

You know, I had a specific reason to mention XFS...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 08/12] vfs: allow unprivileged whiteout creation
  2019-12-17  3:51   ` Al Viro
@ 2019-12-17  4:22     ` Miklos Szeredi
  0 siblings, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-12-17  4:22 UTC (permalink / raw)
  To: Al Viro; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 4:51 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Thu, Nov 28, 2019 at 04:59:36PM +0100, Miklos Szeredi wrote:
> > Whiteouts are special, but unlike real device nodes they should not require
> > privileges to create.
>
> More detailed analysis, please.  Maybe I'm missing something obvious,
> but I don't see off-hand why it's safe.

The general concern with device nodes is that opening them can have
side effects.  The kernel already avoids zero major, so opening will
result in ENODEV, but the patch also makes sure that registering a
device with 0/0 number is forbidden (fs/char_dev.c).

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] vfs: don't parse "silent" option
  2019-12-17  4:19         ` Al Viro
@ 2019-12-17  4:23           ` Miklos Szeredi
  2019-12-17  4:28             ` Miklos Szeredi
  0 siblings, 1 reply; 37+ messages in thread
From: Miklos Szeredi @ 2019-12-17  4:23 UTC (permalink / raw)
  To: Al Viro; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 5:19 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Tue, Dec 17, 2019 at 05:16:58AM +0100, Miklos Szeredi wrote:
> > On Tue, Dec 17, 2019 at 5:12 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Tue, Dec 17, 2019 at 4:37 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > >
> > > > On Thu, Nov 28, 2019 at 04:59:40PM +0100, Miklos Szeredi wrote:
> > > > > While this is a standard option as documented in mount(8), it is ignored by
> > > > > most filesystems.  So reject, unless filesystem explicitly wants to handle
> > > > > it.
> > > > >
> > > > > The exception is unconverted filesystems, where it is unknown if the
> > > > > filesystem handles this or not.
> > > > >
> > > > > Any implementation, such as mount(8), that needs to parse this option
> > > > > without failing can simply ignore the return value from fsconfig().
> > > >
> > > > Unless I'm missing something, that will mean that having it in /etc/fstab
> > > > for a converted filesystem (xfs, for example) will fail when booting
> > > > new kernel with existing /sbin/mount.  Doesn't sound like a good idea...
> > >
> > > Nope, the mount(2) case is not changed (see second hunk).
> >
> > Wrong, this has nothing to do with mount(2).  The second hunk is about
> > unconverted filesystems...
> >
> > When a filesystem that really needs to handle "silent" is converted,
> > it can handle that option itself.
>
> You know, I had a specific reason to mention XFS...

Will fix.  My bad, I did check filesystems at the time of writing the
patch, but not when resending...

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 11/12] vfs: don't parse "posixacl" option
  2019-12-17  4:18     ` Miklos Szeredi
@ 2019-12-17  4:28       ` Al Viro
  0 siblings, 0 replies; 37+ messages in thread
From: Al Viro @ 2019-12-17  4:28 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 05:18:16AM +0100, Miklos Szeredi wrote:
> On Tue, Dec 17, 2019 at 4:42 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Thu, Nov 28, 2019 at 04:59:39PM +0100, Miklos Szeredi wrote:
> > > Unlike the others, this is _not_ a standard option accepted by mount(8).
> > >
> > > In fact SB_POSIXACL is an internal flag, and accepting MS_POSIXACL on the
> > > mount(2) interface is possibly a bug.
> > >
> > > The only filesystem that apparently wants to handle the "posixacl" option
> > > is 9p, but it has special handling of that option besides setting
> > > SB_POSIXACL.
> >
> > Huh?  For e.g. ceph having -o posixacl and -o acl are currently equivalent;
> > your patch would seem to break that, wouldn't it?
> 
> Yet again, this has nothing to do with mount(2) behavior.  Also note
> that mount(8) does *not* handle "posixacl" and does *not* ever set
> MS_POSIXACL.
> 
> So this has exactly zero chance of breaking anything.

Point.  OK, I'm crawling in direction of bed right now - it's that or grab more
coffee, and I'll have to get up before 7am tomorrow ;-/

Later...

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 12/12] vfs: don't parse "silent" option
  2019-12-17  4:23           ` Miklos Szeredi
@ 2019-12-17  4:28             ` Miklos Szeredi
  0 siblings, 0 replies; 37+ messages in thread
From: Miklos Szeredi @ 2019-12-17  4:28 UTC (permalink / raw)
  To: Al Viro; +Cc: Miklos Szeredi, linux-fsdevel

On Tue, Dec 17, 2019 at 5:23 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Tue, Dec 17, 2019 at 5:19 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Tue, Dec 17, 2019 at 05:16:58AM +0100, Miklos Szeredi wrote:
> > > On Tue, Dec 17, 2019 at 5:12 AM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > >
> > > > On Tue, Dec 17, 2019 at 4:37 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > > >
> > > > > On Thu, Nov 28, 2019 at 04:59:40PM +0100, Miklos Szeredi wrote:
> > > > > > While this is a standard option as documented in mount(8), it is ignored by
> > > > > > most filesystems.  So reject, unless filesystem explicitly wants to handle
> > > > > > it.
> > > > > >
> > > > > > The exception is unconverted filesystems, where it is unknown if the
> > > > > > filesystem handles this or not.
> > > > > >
> > > > > > Any implementation, such as mount(8), that needs to parse this option
> > > > > > without failing can simply ignore the return value from fsconfig().
> > > > >
> > > > > Unless I'm missing something, that will mean that having it in /etc/fstab
> > > > > for a converted filesystem (xfs, for example) will fail when booting
> > > > > new kernel with existing /sbin/mount.  Doesn't sound like a good idea...
> > > >
> > > > Nope, the mount(2) case is not changed (see second hunk).
> > >
> > > Wrong, this has nothing to do with mount(2).  The second hunk is about
> > > unconverted filesystems...
> > >
> > > When a filesystem that really needs to handle "silent" is converted,
> > > it can handle that option itself.
> >
> > You know, I had a specific reason to mention XFS...
>
> Will fix.  My bad, I did check filesystems at the time of writing the
> patch, but not when resending...

And BTW this is still not breakage of mount(2) since that code path is
unaffected.  Would need to think  how much the "silent" option makes
sense on the new interface for individual cases specifically.

Thanks,
Miklos

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 09/12] fs_parser: "string" with missing value is a "flag"
  2019-11-28 15:59 ` [PATCH 09/12] fs_parser: "string" with missing value is a "flag" Miklos Szeredi
@ 2019-12-17 17:32   ` Al Viro
  2019-12-17 18:31     ` Al Viro
  0 siblings, 1 reply; 37+ messages in thread
From: Al Viro @ 2019-12-17 17:32 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel

On Thu, Nov 28, 2019 at 04:59:37PM +0100, Miklos Szeredi wrote:
> There's no such thing as a NULL string value, the fsconfig(2) syscall
> rejects that outright.
> 
> So get rid of that concept from the implementation.
> 
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>  fs/fs_context.c           | 2 +-
>  fs/fs_parser.c            | 9 ++-------
>  include/linux/fs_parser.h | 1 -
>  3 files changed, 3 insertions(+), 9 deletions(-)
> 
> diff --git a/fs/fs_context.c b/fs/fs_context.c
> index 66fd7d753e91..7c4216156950 100644
> --- a/fs/fs_context.c
> +++ b/fs/fs_context.c
> @@ -174,7 +174,7 @@ int vfs_parse_fs_string(struct fs_context *fc, const char *key,
>  
>  	struct fs_parameter param = {
>  		.key	= key,
> -		.type	= fs_value_is_string,
> +		.type	= v_size ? fs_value_is_string : fs_value_is_flag,
>  		.size	= v_size,
>  	};

No.  This is simply wrong - as it is, there's no difference between
"foo" and "foo=".  Passing NULL in the latter case is wrong, but
this is not a good fix.

This
        if (v_size > 0) {
                param.string = kmemdup_nul(value, v_size, GFP_KERNEL);
                if (!param.string)
                        return -ENOMEM;
        }
should really be
	if (value) {
                param.string = kmemdup_nul(value, v_size, GFP_KERNEL);
                if (!param.string)
                        return -ENOMEM;
        }
and your chunk should be conditional upon value, not v_size.  The
same problem exists for rbd.c

^ permalink raw reply	[flat|nested] 37+ messages in thread

* Re: [PATCH 09/12] fs_parser: "string" with missing value is a "flag"
  2019-12-17 17:32   ` Al Viro
@ 2019-12-17 18:31     ` Al Viro
  0 siblings, 0 replies; 37+ messages in thread
From: Al Viro @ 2019-12-17 18:31 UTC (permalink / raw)
  To: Miklos Szeredi; +Cc: linux-fsdevel, David Howells

On Tue, Dec 17, 2019 at 05:32:59PM +0000, Al Viro wrote:

> No.  This is simply wrong - as it is, there's no difference between
> "foo" and "foo=".  Passing NULL in the latter case is wrong, but
> this is not a good fix.
> 
> This
>         if (v_size > 0) {
>                 param.string = kmemdup_nul(value, v_size, GFP_KERNEL);
>                 if (!param.string)
>                         return -ENOMEM;
>         }
> should really be
> 	if (value) {
>                 param.string = kmemdup_nul(value, v_size, GFP_KERNEL);
>                 if (!param.string)
>                         return -ENOMEM;
>         }
> and your chunk should be conditional upon value, not v_size.  The
> same problem exists for rbd.c

How about this (completely untested):

Pass consistent param->type to fs_parse()

As it is, vfs_parse_fs_string() makes "foo" and "foo=" indistinguishable;
both get fs_value_is_string for ->type and NULL for ->string.  To make
it even more unpleasant, that combination is impossible to produce with
fsconfig().

Much saner rules would be
	"foo"		=> fs_value_is_flag, NULL
	"foo="		=> fs_value_is_string, ""
	"foo=bar"	=> fs_value_is_string, "bar"
All cases are distinguishable, all results are expressable by fsconfig(),
->has_value checks are much simpler that way (to the point of the field
being useless) and quite a few regressions go away (gfs2 has no business
accepting -o nodebug=, for example).

Partially based upon patches from Miklos.

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 2b184563cd32..9fc686be81ca 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -6433,7 +6433,7 @@ static int rbd_parse_options(char *options, struct rbd_parse_opts_ctx *pctx)
 		if (*key) {
 			struct fs_parameter param = {
 				.key	= key,
-				.type	= fs_value_is_string,
+				.type	= fs_value_is_flag,
 			};
 			char *value = strchr(key, '=');
 			size_t v_len = 0;
@@ -6443,14 +6443,11 @@ static int rbd_parse_options(char *options, struct rbd_parse_opts_ctx *pctx)
 					continue;
 				*value++ = 0;
 				v_len = strlen(value);
-			}
-
-
-			if (v_len > 0) {
 				param.string = kmemdup_nul(value, v_len,
 							   GFP_KERNEL);
 				if (!param.string)
 					return -ENOMEM;
+				param.type = fs_value_is_string;
 			}
 			param.size = v_len;
 
diff --git a/fs/fs_context.c b/fs/fs_context.c
index 138b5b4d621d..9097421cbba5 100644
--- a/fs/fs_context.c
+++ b/fs/fs_context.c
@@ -175,14 +175,15 @@ int vfs_parse_fs_string(struct fs_context *fc, const char *key,
 
 	struct fs_parameter param = {
 		.key	= key,
-		.type	= fs_value_is_string,
+		.type	= fs_value_is_flag,
 		.size	= v_size,
 	};
 
-	if (v_size > 0) {
+	if (value) {
 		param.string = kmemdup_nul(value, v_size, GFP_KERNEL);
 		if (!param.string)
 			return -ENOMEM;
+		param.type = fs_value_is_string;
 	}
 
 	ret = vfs_parse_fs_param(fc, &param);
diff --git a/fs/fs_parser.c b/fs/fs_parser.c
index d1930adce68d..65842cd2e320 100644
--- a/fs/fs_parser.c
+++ b/fs/fs_parser.c
@@ -85,7 +85,6 @@ int fs_parse(struct fs_context *fc,
 	const struct fs_parameter_enum *e;
 	int ret = -ENOPARAM, b;
 
-	result->has_value = !!param->string;
 	result->negated = false;
 	result->uint_64 = 0;
 
@@ -95,7 +94,7 @@ int fs_parse(struct fs_context *fc,
 		 * "xxx" takes the "no"-form negative - but only if there
 		 * wasn't an value.
 		 */
-		if (result->has_value)
+		if (param->type != fs_value_is_flag)
 			goto unknown_parameter;
 		if (param->key[0] != 'n' || param->key[1] != 'o' || !param->key[2])
 			goto unknown_parameter;
@@ -127,14 +126,13 @@ int fs_parse(struct fs_context *fc,
 	case fs_param_is_u64:
 	case fs_param_is_enum:
 	case fs_param_is_string:
-		if (param->type != fs_value_is_string)
-			goto bad_value;
-		if (!result->has_value) {
+		if (param->type == fs_value_is_string)
+			break;
+		if (param->type == fs_value_is_flag) {
 			if (p->flags & fs_param_v_optional)
 				goto okay;
-			goto bad_value;
 		}
-		/* Fall through */
+		goto bad_value;
 	default:
 		break;
 	}
@@ -144,8 +142,7 @@ int fs_parse(struct fs_context *fc,
 	 */
 	switch (p->type) {
 	case fs_param_is_flag:
-		if (param->type != fs_value_is_flag &&
-		    (param->type != fs_value_is_string || result->has_value))
+		if (param->type != fs_value_is_flag)
 			return invalf(fc, "%s: Unexpected value for '%s'",
 				      desc->name, param->key);
 		result->boolean = true;
@@ -206,9 +203,6 @@ int fs_parse(struct fs_context *fc,
 	case fs_param_is_fd: {
 		switch (param->type) {
 		case fs_value_is_string:
-			if (!result->has_value)
-				goto bad_value;
-
 			ret = kstrtouint(param->string, 0, &result->uint_32);
 			break;
 		case fs_value_is_file:
diff --git a/include/linux/fs_parser.h b/include/linux/fs_parser.h
index dee140db6240..45323203128b 100644
--- a/include/linux/fs_parser.h
+++ b/include/linux/fs_parser.h
@@ -72,7 +72,6 @@ struct fs_parameter_description {
  */
 struct fs_parse_result {
 	bool			negated;	/* T if param was "noxxx" */
-	bool			has_value;	/* T if value supplied to param */
 	union {
 		bool		boolean;	/* For spec_bool */
 		int		int_32;		/* For spec_s32/spec_enum */

^ permalink raw reply related	[flat|nested] 37+ messages in thread

* Re: [PATCH 01/12] aio: fix async fsync creds
  2019-11-28 15:59 ` [PATCH 01/12] aio: fix async fsync creds Miklos Szeredi
  2019-12-13  9:32   ` Miklos Szeredi
@ 2020-05-04  8:05   ` Avi Kivity
  1 sibling, 0 replies; 37+ messages in thread
From: Avi Kivity @ 2020-05-04  8:05 UTC (permalink / raw)
  To: Miklos Szeredi, Al Viro; +Cc: linux-fsdevel, Giuseppe Scrivano, stable

Ping on this unapplied patch. Its lack makes it hard to use containers 
for development of storage systems.


On 28/11/2019 17.59, Miklos Szeredi wrote:
> Avi Kivity reports that on fuse filesystems running in a user namespace
> asyncronous fsync fails with EOVERFLOW.
>
> The reason is that f_ops->fsync() is called with the creds of the kthread
> performing aio work instead of the creds of the process originally
> submitting IOCB_CMD_FSYNC.
>
> Fuse sends the creds of the caller in the request header and it needs to
> translate the uid and gid into the server's user namespace.  Since the
> kthread is running in init_user_ns, the translation will fail and the
> operation returns an error.
>
> It can be argued that fsync doesn't actually need any creds, but just
> zeroing out those fields in the header (as with requests that currently
> don't take creds) is a backward compatibility risk.
>
> Instead of working around this issue in fuse, solve the core of the problem
> by calling the filesystem with the proper creds.
>
> Reported-by: Avi Kivity <avi@scylladb.com>
> Tested-by: Giuseppe Scrivano <gscrivan@redhat.com>
> Fixes: c9582eb0ff7d ("fuse: Fail all requests with invalid uids or gids")
> Cc: stable@vger.kernel.org  # 4.18+
> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
> ---
>   fs/aio.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 0d9a559d488c..37828773e2fe 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -176,6 +176,7 @@ struct fsync_iocb {
>   	struct file		*file;
>   	struct work_struct	work;
>   	bool			datasync;
> +	struct cred		*creds;
>   };
>   
>   struct poll_iocb {
> @@ -1589,8 +1590,11 @@ static int aio_write(struct kiocb *req, const struct iocb *iocb,
>   static void aio_fsync_work(struct work_struct *work)
>   {
>   	struct aio_kiocb *iocb = container_of(work, struct aio_kiocb, fsync.work);
> +	const struct cred *old_cred = override_creds(iocb->fsync.creds);
>   
>   	iocb->ki_res.res = vfs_fsync(iocb->fsync.file, iocb->fsync.datasync);
> +	revert_creds(old_cred);
> +	put_cred(iocb->fsync.creds);
>   	iocb_put(iocb);
>   }
>   
> @@ -1604,6 +1608,10 @@ static int aio_fsync(struct fsync_iocb *req, const struct iocb *iocb,
>   	if (unlikely(!req->file->f_op->fsync))
>   		return -EINVAL;
>   
> +	req->creds = prepare_creds();
> +	if (!req->creds)
> +		return -ENOMEM;
> +
>   	req->datasync = datasync;
>   	INIT_WORK(&req->work, aio_fsync_work);
>   	schedule_work(&req->work);

^ permalink raw reply	[flat|nested] 37+ messages in thread

end of thread, other threads:[~2020-05-04  8:05 UTC | newest]

Thread overview: 37+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-28 15:59 [PATCH 00/12] various vfs patches Miklos Szeredi
2019-11-28 15:59 ` [PATCH 01/12] aio: fix async fsync creds Miklos Szeredi
2019-12-13  9:32   ` Miklos Szeredi
2020-05-04  8:05   ` Avi Kivity
2019-11-28 15:59 ` [PATCH 02/12] fs_parse: fix fs_param_v_optional handling Miklos Szeredi
2019-11-29 11:31   ` Andrew Price
2019-11-29 14:43     ` Miklos Szeredi
2019-11-29 15:56       ` Andrew Price
2019-12-16 23:28   ` Al Viro
2019-12-17  1:18     ` Al Viro
2019-12-17  3:27       ` Al Viro
2019-11-28 15:59 ` [PATCH 03/12] vfs: verify param type in vfs_parse_sb_flag() Miklos Szeredi
2019-11-28 15:59 ` [PATCH 04/12] uapi: deprecate STATX_ALL Miklos Szeredi
2019-11-28 15:59 ` [PATCH 05/12] statx: don't clear STATX_ATIME on SB_RDONLY Miklos Szeredi
2019-11-28 15:59 ` [PATCH 06/12] utimensat: AT_EMPTY_PATH support Miklos Szeredi
2019-11-28 15:59 ` [PATCH 07/12] f*xattr: allow O_PATH descriptors Miklos Szeredi
2019-11-28 15:59 ` [PATCH 08/12] vfs: allow unprivileged whiteout creation Miklos Szeredi
2019-12-17  3:51   ` Al Viro
2019-12-17  4:22     ` Miklos Szeredi
2019-11-28 15:59 ` [PATCH 09/12] fs_parser: "string" with missing value is a "flag" Miklos Szeredi
2019-12-17 17:32   ` Al Viro
2019-12-17 18:31     ` Al Viro
2019-11-28 15:59 ` [PATCH 10/12] vfs: don't parse forbidden flags Miklos Szeredi
2019-11-28 15:59 ` [PATCH 11/12] vfs: don't parse "posixacl" option Miklos Szeredi
2019-12-17  3:42   ` Al Viro
2019-12-17  4:18     ` Miklos Szeredi
2019-12-17  4:28       ` Al Viro
2019-11-28 15:59 ` [PATCH 12/12] vfs: don't parse "silent" option Miklos Szeredi
2019-12-17  3:37   ` Al Viro
2019-12-17  4:12     ` Miklos Szeredi
2019-12-17  4:16       ` Miklos Szeredi
2019-12-17  4:19         ` Al Viro
2019-12-17  4:23           ` Miklos Szeredi
2019-12-17  4:28             ` Miklos Szeredi
2019-12-17  4:17       ` Al Viro
2019-12-13  9:33 ` [PATCH 00/12] various vfs patches Miklos Szeredi
2019-12-16 23:13   ` Al Viro

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).