linux-m68k.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Re: Add a new fchmodat4() syscall, v2
       [not found] <20190717012719.5524-1-palmer@sifive.com>
@ 2020-06-09 13:52 ` Florian Weimer
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
  0 siblings, 1 reply; 59+ messages in thread
From: Florian Weimer @ 2020-06-09 13:52 UTC (permalink / raw)
  To: Palmer Dabbelt
  Cc: viro, linux-kernel, linux-fsdevel, linux-api, Arnd Bergmann, rth,
	ink, mattst88, linux, catalin.marinas, will, tony.luck,
	fenghua.yu, geert, monstr, ralf, paul.burton, jhogan,
	James.Bottomley, deller, benh, paulus, mpe, heiko.carstens, gor,
	borntraeger, ysato, dalias, davem, luto, tglx, mingo, bp, hpa,
	x86, peterz, acme, alexander.shishkin, jolsa, namhyung, dhowells,
	firoz.khan, stefan, schwidefsky, axboe, christian, hare,
	deepa.kernel, tycho, kim.phillips, linux-alpha, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linuxppc-dev,
	linux-s390, linux-sh, sparclinux, linux-arch

* Palmer Dabbelt:

> This patch set adds fchmodat4(), a new syscall. The actual
> implementation is super simple: essentially it's just the same as
> fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags.
> I've attempted to make this match "man 2 fchmodat" as closely as
> possible, which says EINVAL is returned for invalid flags (as opposed to
> ENOTSUPP, which is currently returned by glibc for AT_SYMLINK_NOFOLLOW).
> I have a sketch of a glibc patch that I haven't even compiled yet, but
> seems fairly straight-forward:

What's the status here?  We'd really like to see this system call in the
kernel because our emulation in glibc has its problems (especially if
/proc is not mounted).

Thanks,
Florian


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v3 0/5] Add a new fchmodat4() syscall
  2020-06-09 13:52 ` Add a new fchmodat4() syscall, v2 Florian Weimer
@ 2023-07-11 11:25   ` Alexey Gladkov
  2023-07-11 11:25     ` [PATCH v3 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
                       ` (7 more replies)
  0 siblings, 8 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 11:25 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, fweimer,
	geert, glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, palmer, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

This patch set adds fchmodat4(), a new syscall. The actual
implementation is super simple: essentially it's just the same as
fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags.
I've attempted to make this match "man 2 fchmodat" as closely as
possible, which says EINVAL is returned for invalid flags (as opposed to
ENOTSUPP, which is currently returned by glibc for AT_SYMLINK_NOFOLLOW).
I have a sketch of a glibc patch that I haven't even compiled yet, but
seems fairly straight-forward:

    diff --git a/sysdeps/unix/sysv/linux/fchmodat.c b/sysdeps/unix/sysv/linux/fchmodat.c
    index 6d9cbc1ce9e0..b1beab76d56c 100644
    --- a/sysdeps/unix/sysv/linux/fchmodat.c
    +++ b/sysdeps/unix/sysv/linux/fchmodat.c
    @@ -29,12 +29,36 @@
     int
     fchmodat (int fd, const char *file, mode_t mode, int flag)
     {
    -  if (flag & ~AT_SYMLINK_NOFOLLOW)
    -    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
    -#ifndef __NR_lchmod		/* Linux so far has no lchmod syscall.  */
    +  /* There are four paths through this code:
    +      - The flags are zero.  In this case it's fine to call fchmodat.
    +      - The flags are non-zero and glibc doesn't have access to
    +	__NR_fchmodat4.  In this case all we can do is emulate the error codes
    +	defined by the glibc interface from userspace.
    +      - The flags are non-zero, glibc has __NR_fchmodat4, and the kernel has
    +	fchmodat4.  This is the simplest case, as the fchmodat4 syscall exactly
    +	matches glibc's library interface so it can be called directly.
    +      - The flags are non-zero, glibc has __NR_fchmodat4, but the kernel does
    +	not.  In this case we must respect the error codes defined by the glibc
    +	interface instead of returning ENOSYS.
    +    The intent here is to ensure that the kernel is called at most once per
    +    library call, and that the error types defined by glibc are always
    +    respected.  */
    +
    +#ifdef __NR_fchmodat4
    +  long result;
    +#endif
    +
    +  if (flag == 0)
    +    return INLINE_SYSCALL (fchmodat, 3, fd, file, mode);
    +
    +#ifdef __NR_fchmodat4
    +  result = INLINE_SYSCALL (fchmodat4, 4, fd, file, mode, flag);
    +  if (result == 0 || errno != ENOSYS)
    +    return result;
    +#endif
    +
       if (flag & AT_SYMLINK_NOFOLLOW)
         return INLINE_SYSCALL_ERROR_RETURN_VALUE (ENOTSUP);
    -#endif

    -  return INLINE_SYSCALL (fchmodat, 3, fd, file, mode);
    +  return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
     }

I've never added a new syscall before so I'm not really sure what the
proper procedure to follow is.  Based on the feedback from my v1 patch
set it seems this is somewhat uncontroversial.  At this point I don't
think there's anything I'm missing, though note that I haven't gotten
around to testing it this time because the diff from v1 is trivial for
any platform I could reasonably test on.  The v1 patches suggest a
simple test case, but I didn't re-run it because I don't want to reboot
my laptop.

Changes since v2 [20190717012719.5524-1-palmer@sifive.com]:

* Rebased to master.
* The lookup_flags passed to sys_fchmodat4 as suggested by Al Viro.
* Selftest added.

Changes since v1 [20190531191204.4044-1-palmer@sifive.com]:

* All architectures are now supported, which support squashed into a
  single patch.
* The do_fchmodat() helper function has been removed, in favor of directly
  calling do_fchmodat4().
* The patches are based on 5.2 instead of 5.1.

---

Alexey Gladkov (1):
  selftests: add fchmodat4(2) selftest

Palmer Dabbelt (4):
  Non-functional cleanup of a "__user * filename"
  fs: Add fchmodat4()
  arch: Register fchmodat4, usually as syscall 451
  tools headers UAPI: Sync files changed by new fchmodat4 syscall

 arch/alpha/kernel/syscalls/syscall.tbl        |   1 +
 arch/arm/tools/syscall.tbl                    |   1 +
 arch/arm64/include/asm/unistd32.h             |   2 +
 arch/ia64/kernel/syscalls/syscall.tbl         |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl         |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl     |   1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl     |   1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl     |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl       |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl      |   1 +
 arch/s390/kernel/syscalls/syscall.tbl         |   1 +
 arch/sh/kernel/syscalls/syscall.tbl           |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl        |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl        |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl        |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl       |   1 +
 fs/open.c                                     |  18 ++-
 include/linux/syscalls.h                      |   4 +-
 include/uapi/asm-generic/unistd.h             |   5 +-
 tools/include/uapi/asm-generic/unistd.h       |   5 +-
 .../arch/mips/entry/syscalls/syscall_n64.tbl  |   1 +
 .../arch/powerpc/entry/syscalls/syscall.tbl   |   1 +
 .../perf/arch/s390/entry/syscalls/syscall.tbl |   1 +
 .../arch/x86/entry/syscalls/syscall_64.tbl    |   1 +
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/fchmodat4/.gitignore  |   2 +
 tools/testing/selftests/fchmodat4/Makefile    |   6 +
 .../selftests/fchmodat4/fchmodat4_test.c      | 151 ++++++++++++++++++
 29 files changed, 207 insertions(+), 7 deletions(-)
 create mode 100644 tools/testing/selftests/fchmodat4/.gitignore
 create mode 100644 tools/testing/selftests/fchmodat4/Makefile
 create mode 100644 tools/testing/selftests/fchmodat4/fchmodat4_test.c

-- 
2.33.8


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v3 1/5] Non-functional cleanup of a "__user * filename"
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
@ 2023-07-11 11:25     ` Alexey Gladkov
  2023-07-11 11:32       ` Arnd Bergmann
  2023-07-11 11:25     ` [PATCH v3 2/5] fs: Add fchmodat4() Alexey Gladkov
                       ` (6 subsequent siblings)
  7 siblings, 1 reply; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 11:25 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, fweimer,
	geert, glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

From: Palmer Dabbelt <palmer@sifive.com>

The next patch defines a very similar interface, which I copied from
this definition.  Since I'm touching it anyway I don't see any reason
not to just go fix this one up.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
---
 include/linux/syscalls.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 33a0ee3bcb2e..497bdd968c32 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -464,7 +464,7 @@ asmlinkage long sys_chdir(const char __user *filename);
 asmlinkage long sys_fchdir(unsigned int fd);
 asmlinkage long sys_chroot(const char __user *filename);
 asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
-asmlinkage long sys_fchmodat(int dfd, const char __user * filename,
+asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
 			     umode_t mode);
 asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
 			     gid_t group, int flag);
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 2/5] fs: Add fchmodat4()
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
  2023-07-11 11:25     ` [PATCH v3 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
@ 2023-07-11 11:25     ` Alexey Gladkov
  2023-07-11 11:42       ` Arnd Bergmann
  2023-07-11 12:28       ` Matthew Wilcox
  2023-07-11 11:25     ` [PATCH v3 3/5] arch: Register fchmodat4, usually as syscall 451 Alexey Gladkov
                       ` (5 subsequent siblings)
  7 siblings, 2 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 11:25 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, fweimer,
	geert, glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

From: Palmer Dabbelt <palmer@sifive.com>

On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].

There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.

The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
[2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 fs/open.c                | 18 ++++++++++++++----
 include/linux/syscalls.h |  2 ++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 4478adcc4f3a..58bb88c6afb6 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
 	return err;
 }
 
-static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
+static int do_fchmodat4(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
 {
 	struct path path;
 	int error;
-	unsigned int lookup_flags = LOOKUP_FOLLOW;
+
 retry:
 	error = user_path_at(dfd, filename, lookup_flags, &path);
 	if (!error) {
@@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
 	return error;
 }
 
+SYSCALL_DEFINE4(fchmodat4, int, dfd, const char __user *, filename,
+		umode_t, mode, int, flags)
+{
+	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
+		return -EINVAL;
+
+	return do_fchmodat4(dfd, filename, mode,
+			flags & AT_SYMLINK_NOFOLLOW ? 0 : LOOKUP_FOLLOW);
+}
+
 SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename,
 		umode_t, mode)
 {
-	return do_fchmodat(dfd, filename, mode);
+	return do_fchmodat4(dfd, filename, mode, LOOKUP_FOLLOW);
 }
 
 SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 {
-	return do_fchmodat(AT_FDCWD, filename, mode);
+	return do_fchmodat4(AT_FDCWD, filename, mode, LOOKUP_FOLLOW);
 }
 
 /**
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 497bdd968c32..b17d37d2bad6 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -466,6 +466,8 @@ asmlinkage long sys_chroot(const char __user *filename);
 asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
 asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
 			     umode_t mode);
+asmlinkage long sys_fchmodat4(int dfd, const char __user *filename,
+			     umode_t mode, int flags);
 asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
 			     gid_t group, int flag);
 asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 3/5] arch: Register fchmodat4, usually as syscall 451
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
  2023-07-11 11:25     ` [PATCH v3 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
  2023-07-11 11:25     ` [PATCH v3 2/5] fs: Add fchmodat4() Alexey Gladkov
@ 2023-07-11 11:25     ` Alexey Gladkov
  2023-07-11 11:31       ` Arnd Bergmann
  2023-07-11 11:25     ` [PATCH v3 4/5] tools headers UAPI: Sync files changed by new fchmodat4 syscall Alexey Gladkov
                       ` (4 subsequent siblings)
  7 siblings, 1 reply; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 11:25 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, fweimer,
	geert, glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

From: Palmer Dabbelt <palmer@sifive.com>

This registers the new fchmodat4 syscall in most places as nuber 451,
with alpha being the exception where it's 561.  I found all these sites
by grepping for fspick, which I assume has found me everything.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 arch/alpha/kernel/syscalls/syscall.tbl      | 1 +
 arch/arm/tools/syscall.tbl                  | 1 +
 arch/arm64/include/asm/unistd32.h           | 2 ++
 arch/ia64/kernel/syscalls/syscall.tbl       | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl       | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl     | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    | 1 +
 arch/s390/kernel/syscalls/syscall.tbl       | 1 +
 arch/sh/kernel/syscalls/syscall.tbl         | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl      | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     | 1 +
 include/uapi/asm-generic/unistd.h           | 5 ++++-
 18 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 8ebacf37a8cf..00ceeffec7ff 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -490,3 +490,4 @@
 558	common	process_mrelease		sys_process_mrelease
 559	common  futex_waitv                     sys_futex_waitv
 560	common	set_mempolicy_home_node		sys_ni_syscall
+561	common	fchmodat4			sys_fchmodat4
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index ac964612d8b0..0b9702d5c425 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -464,3 +464,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common	futex_waitv			sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index 604a2053d006..49c65d935049 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -907,6 +907,8 @@ __SYSCALL(__NR_process_mrelease, sys_process_mrelease)
 __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
 #define __NR_set_mempolicy_home_node 450
 __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
+#define __NR_fchmodat4 451
+__SYSCALL(__NR_fchmodat4, sys_fchmodat4)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index 72c929d9902b..b35225c64781 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -371,3 +371,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index b1f3940bc298..4d80cd87e089 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -450,3 +450,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 820145e47350..306bd18e5b52 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -456,3 +456,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 253ff994ed2e..2ef47a546fd3 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -389,3 +389,4 @@
 448	n32	process_mrelease		sys_process_mrelease
 449	n32	futex_waitv			sys_futex_waitv
 450	n32	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	n32	fchmodat4			sys_fchmodat4
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index 3f1886ad9d80..6356c0a6cda0 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -365,3 +365,4 @@
 448	n64	process_mrelease		sys_process_mrelease
 449	n64	futex_waitv			sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	n64	fchmodat4			sys_fchmodat4
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 8f243e35a7b2..d1111edba47e 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -438,3 +438,4 @@
 448	o32	process_mrelease		sys_process_mrelease
 449	o32	futex_waitv			sys_futex_waitv
 450	o32	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	o32	fchmodat4			sys_fchmodat4
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index 0e42fceb2d5e..0a1c13744c32 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -448,3 +448,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common	futex_waitv			sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index a0be127475b1..ee23866fa1c8 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -537,3 +537,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450 	nospu	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index b68f47541169..d5ce80065ece 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -453,3 +453,4 @@
 448  common	process_mrelease	sys_process_mrelease		sys_process_mrelease
 449  common	futex_waitv		sys_futex_waitv			sys_futex_waitv
 450  common	set_mempolicy_home_node	sys_set_mempolicy_home_node	sys_set_mempolicy_home_node
+451  common	fchmodat4		sys_fchmodat4			sys_fchmodat4
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 2de85c977f54..697b914f5c33 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -453,3 +453,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index 4398cc6fb68d..79be56831f39 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -496,3 +496,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index 320480a8db4f..3ab07bd87ea9 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -455,3 +455,4 @@
 448	i386	process_mrelease	sys_process_mrelease
 449	i386	futex_waitv		sys_futex_waitv
 450	i386	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	i386	fchmodat4		sys_fchmodat4
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index c84d12608cd2..17047878293c 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -372,6 +372,7 @@
 448	common	process_mrelease	sys_process_mrelease
 449	common	futex_waitv		sys_futex_waitv
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
+451	common	fchmodat4		sys_fchmodat4
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 52c94ab5c205..ed47fd3b2293 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -421,3 +421,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index 45fa180cc56a..b7978b3ce3f1 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -886,8 +886,11 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
 #define __NR_set_mempolicy_home_node 450
 __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
 
+#define __NR_fchmodat4 451
+__SYSCALL(__NR_fchmodat4, sys_fchmodat4)
+
 #undef __NR_syscalls
-#define __NR_syscalls 451
+#define __NR_syscalls 452
 
 /*
  * 32 bit systems traditionally used different
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 4/5] tools headers UAPI: Sync files changed by new fchmodat4 syscall
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
                       ` (2 preceding siblings ...)
  2023-07-11 11:25     ` [PATCH v3 3/5] arch: Register fchmodat4, usually as syscall 451 Alexey Gladkov
@ 2023-07-11 11:25     ` Alexey Gladkov
  2023-07-11 11:25     ` [PATCH v3 5/5] selftests: add fchmodat4(2) selftest Alexey Gladkov
                       ` (3 subsequent siblings)
  7 siblings, 0 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 11:25 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, fweimer,
	geert, glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

From: Palmer Dabbelt <palmer@sifive.com>

That add support for this new syscall in tools such as 'perf trace'.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 tools/include/uapi/asm-generic/unistd.h             | 5 ++++-
 tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl | 1 +
 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl  | 1 +
 tools/perf/arch/s390/entry/syscalls/syscall.tbl     | 1 +
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl   | 1 +
 5 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index 45fa180cc56a..b7978b3ce3f1 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -886,8 +886,11 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
 #define __NR_set_mempolicy_home_node 450
 __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
 
+#define __NR_fchmodat4 451
+__SYSCALL(__NR_fchmodat4, sys_fchmodat4)
+
 #undef __NR_syscalls
-#define __NR_syscalls 451
+#define __NR_syscalls 452
 
 /*
  * 32 bit systems traditionally used different
diff --git a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
index 3f1886ad9d80..6356c0a6cda0 100644
--- a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
+++ b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
@@ -365,3 +365,4 @@
 448	n64	process_mrelease		sys_process_mrelease
 449	n64	futex_waitv			sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	n64	fchmodat4			sys_fchmodat4
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
index a0be127475b1..ee23866fa1c8 100644
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -537,3 +537,4 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450 	nospu	set_mempolicy_home_node		sys_set_mempolicy_home_node
+451	common	fchmodat4			sys_fchmodat4
diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
index b68f47541169..d5ce80065ece 100644
--- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
@@ -453,3 +453,4 @@
 448  common	process_mrelease	sys_process_mrelease		sys_process_mrelease
 449  common	futex_waitv		sys_futex_waitv			sys_futex_waitv
 450  common	set_mempolicy_home_node	sys_set_mempolicy_home_node	sys_set_mempolicy_home_node
+451  common	fchmodat4		sys_fchmodat4			sys_fchmodat4
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index c84d12608cd2..17047878293c 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -372,6 +372,7 @@
 448	common	process_mrelease	sys_process_mrelease
 449	common	futex_waitv		sys_futex_waitv
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
+451	common	fchmodat4		sys_fchmodat4
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v3 5/5] selftests: add fchmodat4(2) selftest
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
                       ` (3 preceding siblings ...)
  2023-07-11 11:25     ` [PATCH v3 4/5] tools headers UAPI: Sync files changed by new fchmodat4 syscall Alexey Gladkov
@ 2023-07-11 11:25     ` Alexey Gladkov
  2023-07-11 12:10       ` Florian Weimer
  2023-07-11 12:24     ` [PATCH v3 0/5] Add a new fchmodat4() syscall Florian Weimer
                       ` (2 subsequent siblings)
  7 siblings, 1 reply; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 11:25 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, fweimer,
	geert, glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, palmer, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

The test marks as skipped if a syscall with the AT_SYMLINK_NOFOLLOW flag
fails. This is because not all filesystems support changing the mode
bits of symlinks properly. These filesystems return an error but change
the mode bits:

newfstatat(4, "regfile", {st_mode=S_IFREG|0640, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(4, "symlink", {st_mode=S_IFLNK|0777, st_size=7, ...}, AT_SYMLINK_NOFOLLOW) = 0
syscall_0x1c3(0x4, 0x55fa1f244396, 0x180, 0x100, 0x55fa1f24438e, 0x34) = -1 EOPNOTSUPP (Operation not supported)
newfstatat(4, "regfile", {st_mode=S_IFREG|0640, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0

This happens with btrfs and xfs:

 $ /kernel/tools/testing/selftests/fchmodat4/fchmodat4_test
 TAP version 13
 1..1
 ok 1 # SKIP fchmodat4(symlink)
 # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0

 $ stat /tmp/ksft-fchmodat4.*/symlink
   File: /tmp/ksft-fchmodat4.3NCqlE/symlink -> regfile
   Size: 7               Blocks: 0          IO Block: 4096   symbolic link
 Device: 7,0     Inode: 133         Links: 1
 Access: (0600/lrw-------)  Uid: (    0/    root)   Gid: (    0/    root)

Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/fchmodat4/.gitignore  |   2 +
 tools/testing/selftests/fchmodat4/Makefile    |   6 +
 .../selftests/fchmodat4/fchmodat4_test.c      | 151 ++++++++++++++++++
 4 files changed, 160 insertions(+)
 create mode 100644 tools/testing/selftests/fchmodat4/.gitignore
 create mode 100644 tools/testing/selftests/fchmodat4/Makefile
 create mode 100644 tools/testing/selftests/fchmodat4/fchmodat4_test.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 90a62cf75008..fe61fa55412d 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -17,6 +17,7 @@ TARGETS += drivers/net/bonding
 TARGETS += drivers/net/team
 TARGETS += efivarfs
 TARGETS += exec
+TARGETS += fchmodat4
 TARGETS += filesystems
 TARGETS += filesystems/binderfs
 TARGETS += filesystems/epoll
diff --git a/tools/testing/selftests/fchmodat4/.gitignore b/tools/testing/selftests/fchmodat4/.gitignore
new file mode 100644
index 000000000000..82a4846cbc4b
--- /dev/null
+++ b/tools/testing/selftests/fchmodat4/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+/*_test
diff --git a/tools/testing/selftests/fchmodat4/Makefile b/tools/testing/selftests/fchmodat4/Makefile
new file mode 100644
index 000000000000..3d38a69c3c12
--- /dev/null
+++ b/tools/testing/selftests/fchmodat4/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+CFLAGS += -Wall -O2 -g -fsanitize=address -fsanitize=undefined
+TEST_GEN_PROGS := fchmodat4_test
+
+include ../lib.mk
diff --git a/tools/testing/selftests/fchmodat4/fchmodat4_test.c b/tools/testing/selftests/fchmodat4/fchmodat4_test.c
new file mode 100644
index 000000000000..50beb731d8ba
--- /dev/null
+++ b/tools/testing/selftests/fchmodat4/fchmodat4_test.c
@@ -0,0 +1,151 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#define _GNU_SOURCE
+#include <fcntl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <syscall.h>
+#include <unistd.h>
+
+#include "../kselftest.h"
+
+#ifndef __NR_fchmodat4
+	#if defined __alpha__
+		#define __NR_fchmodat4 561
+	#elif defined _MIPS_SIM
+		#if _MIPS_SIM == _MIPS_SIM_ABI32	/* o32 */
+			#define __NR_fchmodat4 (451 + 4000)
+		#endif
+		#if _MIPS_SIM == _MIPS_SIM_NABI32	/* n32 */
+			#define __NR_fchmodat4 (451 + 6000)
+		#endif
+		#if _MIPS_SIM == _MIPS_SIM_ABI64	/* n64 */
+			#define __NR_fchmodat4 (451 + 5000)
+		#endif
+	#elif defined __ia64__
+		#define __NR_fchmodat4 (451 + 1024)
+	#else
+		#define __NR_fchmodat4 451
+	#endif
+#endif
+
+int sys_fchmodat4(int dfd, const char *filename, mode_t mode, int flags)
+{
+	int ret = syscall(__NR_fchmodat4, dfd, filename, mode, flags);
+	return ret >= 0 ? ret : -errno;
+}
+
+int setup_testdir(void)
+{
+	int dfd, ret;
+	char dirname[] = "/tmp/ksft-fchmodat4.XXXXXX";
+
+	/* Make the top-level directory. */
+	if (!mkdtemp(dirname))
+		ksft_exit_fail_msg("setup_testdir: failed to create tmpdir\n");
+
+	dfd = open(dirname, O_PATH | O_DIRECTORY);
+	if (dfd < 0)
+		ksft_exit_fail_msg("setup_testdir: failed to open tmpdir\n");
+
+	ret = openat(dfd, "regfile", O_CREAT | O_WRONLY | O_TRUNC, 0644);
+	if (ret < 0)
+		ksft_exit_fail_msg("setup_testdir: failed to create file in tmpdir\n");
+	close(ret);
+
+	ret = symlinkat("regfile", dfd, "symlink");
+	if (ret < 0)
+		ksft_exit_fail_msg("setup_testdir: failed to create symlink in tmpdir\n");
+
+	return dfd;
+}
+
+int expect_mode(int dfd, const char *filename, mode_t expect_mode)
+{
+	struct stat st;
+	int ret = fstatat(dfd, filename, &st, AT_SYMLINK_NOFOLLOW);
+
+	if (ret)
+		ksft_exit_fail_msg("expect_mode: %s: fstatat failed\n", filename);
+
+	return (st.st_mode == expect_mode);
+}
+
+void test_regfile(void)
+{
+	int dfd, ret;
+
+	dfd = setup_testdir();
+
+	ret = sys_fchmodat4(dfd, "regfile", 0640, 0);
+
+	if (ret < 0)
+		ksft_exit_fail_msg("test_regfile: fchmodat4(noflag) failed\n");
+
+	if (!expect_mode(dfd, "regfile", 0100640))
+		ksft_exit_fail_msg("test_regfile: wrong file mode bits after fchmodat4\n");
+
+	ret = sys_fchmodat4(dfd, "regfile", 0600, AT_SYMLINK_NOFOLLOW);
+
+	if (ret < 0)
+		ksft_exit_fail_msg("test_regfile: fchmodat4(AT_SYMLINK_NOFOLLOW) failed\n");
+
+	if (!expect_mode(dfd, "regfile", 0100600))
+		ksft_exit_fail_msg("test_regfile: wrong file mode bits after fchmodat4 with nofollow\n");
+
+	ksft_test_result_pass("fchmodat4(regfile)\n");
+}
+
+void test_symlink(void)
+{
+	int dfd, ret;
+
+	dfd = setup_testdir();
+
+	ret = sys_fchmodat4(dfd, "symlink", 0640, 0);
+
+	if (ret < 0)
+		ksft_exit_fail_msg("test_symlink: fchmodat4(noflag) failed\n");
+
+	if (!expect_mode(dfd, "regfile", 0100640))
+		ksft_exit_fail_msg("test_symlink: wrong file mode bits after fchmodat4\n");
+
+	if (!expect_mode(dfd, "symlink", 0120777))
+		ksft_exit_fail_msg("test_symlink: wrong symlink mode bits after fchmodat4\n");
+
+	ret = sys_fchmodat4(dfd, "symlink", 0600, AT_SYMLINK_NOFOLLOW);
+
+	/*
+	 * On certain filesystems (xfs or btrfs), chmod operation fails. So we
+	 * first check the symlink target but if the operation fails we mark the
+	 * test as skipped.
+	 *
+	 * https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html
+	 */
+	if (ret == 0 && !expect_mode(dfd, "symlink", 0120600))
+		ksft_exit_fail_msg("test_symlink: wrong symlink mode bits after fchmodat4 with nofollow\n");
+
+	if (!expect_mode(dfd, "regfile", 0100640))
+		ksft_exit_fail_msg("test_symlink: wrong file mode bits after fchmodat4 with nofollow\n");
+
+	if (ret != 0)
+		ksft_test_result_skip("fchmodat4(symlink)\n");
+	else
+		ksft_test_result_pass("fchmodat4(symlink)\n");
+}
+
+#define NUM_TESTS 2
+
+int main(int argc, char **argv)
+{
+	ksft_print_header();
+	ksft_set_plan(NUM_TESTS);
+
+	test_regfile();
+	test_symlink();
+
+	if (ksft_get_fail_cnt() + ksft_get_error_cnt() > 0)
+		ksft_exit_fail();
+	else
+		ksft_exit_pass();
+}
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 3/5] arch: Register fchmodat4, usually as syscall 451
  2023-07-11 11:25     ` [PATCH v3 3/5] arch: Register fchmodat4, usually as syscall 451 Alexey Gladkov
@ 2023-07-11 11:31       ` Arnd Bergmann
  0 siblings, 0 replies; 59+ messages in thread
From: Arnd Bergmann @ 2023-07-11 11:31 UTC (permalink / raw)
  To: Alexey Gladkov, LKML, linux-api, linux-fsdevel, Alexander Viro
  Cc: Palmer Dabbelt, James E . J . Bottomley,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jens Axboe,
	Benjamin Herrenschmidt, Christian Borntraeger, Borislav Petkov,
	Catalin Marinas, christian, Rich Felker, David S . Miller,
	Deepa Dinamani, Helge Deller, David Howells, fenghua.yu,
	firoz.khan, Florian Weimer, Geert Uytterhoeven, glebfm, gor,
	hare, heiko.carstens, H. Peter Anvin, Ivan Kokshaysky, jhogan,
	Kim Phillips, ldv, linux-alpha, Linux-Arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, Russell King, linuxppc-dev, Andy Lutomirski,
	Matt Turner, Ingo Molnar, Michal Simek, Michael Ellerman,
	Namhyung Kim, paul.burton, Paul Mackerras, Peter Zijlstra, ralf,
	rth, schwidefsky, sparclinux, stefan, Thomas Gleixner, Tony Luck,
	tycho, Will Deacon, x86, Yoshinori Sato

On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> From: Palmer Dabbelt <palmer@sifive.com>
>
> This registers the new fchmodat4 syscall in most places as nuber 451,
> with alpha being the exception where it's 561.  I found all these sites
> by grepping for fspick, which I assume has found me everything.
>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>

In linux-6.5-rc1, number 451 is used for __NR_cachestat, the
next free one at the moment is 452.

>  arch/arm/tools/syscall.tbl                  | 1 +
>  arch/arm64/include/asm/unistd32.h           | 2 ++

Unfortunately, you still also need to change __NR_compat_syscalls
in arch/arm64/include/asm/unistd.h. Aside from these two issues,
your patch is the correct way to hook up a new syscall.

   Arnd

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 1/5] Non-functional cleanup of a "__user * filename"
  2023-07-11 11:25     ` [PATCH v3 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
@ 2023-07-11 11:32       ` Arnd Bergmann
  0 siblings, 0 replies; 59+ messages in thread
From: Arnd Bergmann @ 2023-07-11 11:32 UTC (permalink / raw)
  To: Alexey Gladkov, LKML, linux-api, linux-fsdevel, Alexander Viro
  Cc: Palmer Dabbelt, James E . J . Bottomley,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jens Axboe,
	Benjamin Herrenschmidt, Christian Borntraeger, Borislav Petkov,
	Catalin Marinas, christian, Rich Felker, David S . Miller,
	Deepa Dinamani, Helge Deller, David Howells, fenghua.yu,
	firoz.khan, Florian Weimer, Geert Uytterhoeven, glebfm, gor,
	hare, heiko.carstens, H. Peter Anvin, Ivan Kokshaysky, jhogan,
	Kim Phillips, ldv, linux-alpha, Linux-Arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, Russell King, linuxppc-dev, Andy Lutomirski,
	Matt Turner, Ingo Molnar, Michal Simek, Michael Ellerman,
	Namhyung Kim, paul.burton, Paul Mackerras, Peter Zijlstra, ralf,
	rth, schwidefsky, sparclinux, stefan, Thomas Gleixner, Tony Luck,
	tycho, Will Deacon, x86, Yoshinori Sato

On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> From: Palmer Dabbelt <palmer@sifive.com>
>
> The next patch defines a very similar interface, which I copied from
> this definition.  Since I'm touching it anyway I don't see any reason
> not to just go fix this one up.
>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 2/5] fs: Add fchmodat4()
  2023-07-11 11:25     ` [PATCH v3 2/5] fs: Add fchmodat4() Alexey Gladkov
@ 2023-07-11 11:42       ` Arnd Bergmann
  2023-07-11 11:52         ` Christian Brauner
  2023-07-11 12:28       ` Matthew Wilcox
  1 sibling, 1 reply; 59+ messages in thread
From: Arnd Bergmann @ 2023-07-11 11:42 UTC (permalink / raw)
  To: Alexey Gladkov, LKML, linux-api, linux-fsdevel, Alexander Viro
  Cc: Palmer Dabbelt, James E . J . Bottomley,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jens Axboe,
	Benjamin Herrenschmidt, Christian Borntraeger, Borislav Petkov,
	Catalin Marinas, christian, Rich Felker, David S . Miller,
	Deepa Dinamani, Helge Deller, David Howells, fenghua.yu,
	firoz.khan, Florian Weimer, Geert Uytterhoeven, glebfm, gor,
	hare, heiko.carstens, H. Peter Anvin, Ivan Kokshaysky, jhogan,
	Kim Phillips, ldv, linux-alpha, Linux-Arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, Russell King, linuxppc-dev, Andy Lutomirski,
	Matt Turner, Ingo Molnar, Michal Simek, Michael Ellerman,
	Namhyung Kim, paul.burton, Paul Mackerras, Peter Zijlstra, ralf,
	rth, schwidefsky, sparclinux, stefan, Thomas Gleixner, Tony Luck,
	tycho, Will Deacon, x86, Yoshinori Sato

On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> From: Palmer Dabbelt <palmer@sifive.com>
>
> On the userspace side fchmodat(3) is implemented as a wrapper
> function which implements the POSIX-specified interface. This
> interface differs from the underlying kernel system call, which does not
> have a flags argument. Most implementations require procfs [1][2].
>
> There doesn't appear to be a good userspace workaround for this issue
> but the implementation in the kernel is pretty straight-forward.
>
> The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> unlike existing fchmodat.
>
> [1] 
> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> [2] 
> https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>

I don't know the history of why we ended up with the different
interface, or whether this was done intentionally in the kernel
or if we want this syscall.

Assuming this is in fact needed, I double-checked that the
implementation looks correct to me and is portable to all the
architectures, without the need for a compat wrapper.

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 2/5] fs: Add fchmodat4()
  2023-07-11 11:42       ` Arnd Bergmann
@ 2023-07-11 11:52         ` Christian Brauner
  2023-07-11 12:51           ` Alexey Gladkov
  0 siblings, 1 reply; 59+ messages in thread
From: Christian Brauner @ 2023-07-11 11:52 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Alexey Gladkov, LKML, linux-api, linux-fsdevel, Alexander Viro,
	Palmer Dabbelt, James E . J . Bottomley,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jens Axboe,
	Benjamin Herrenschmidt, Christian Borntraeger, Borislav Petkov,
	Catalin Marinas, christian, Rich Felker, David S . Miller,
	Deepa Dinamani, Helge Deller, David Howells, fenghua.yu,
	firoz.khan, Florian Weimer, Geert Uytterhoeven, glebfm, gor,
	hare, heiko.carstens, H. Peter Anvin, Ivan Kokshaysky, jhogan,
	Kim Phillips, ldv, linux-alpha, Linux-Arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, Russell King, linuxppc-dev, Andy Lutomirski,
	Matt Turner, Ingo Molnar, Michal Simek, Michael Ellerman,
	Namhyung Kim, paul.burton, Paul Mackerras, Peter Zijlstra, ralf,
	rth, schwidefsky, sparclinux, stefan, Thomas Gleixner, Tony Luck,
	tycho, Will Deacon, x86, Yoshinori Sato

On Tue, Jul 11, 2023 at 01:42:19PM +0200, Arnd Bergmann wrote:
> On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> > From: Palmer Dabbelt <palmer@sifive.com>
> >
> > On the userspace side fchmodat(3) is implemented as a wrapper
> > function which implements the POSIX-specified interface. This
> > interface differs from the underlying kernel system call, which does not
> > have a flags argument. Most implementations require procfs [1][2].
> >
> > There doesn't appear to be a good userspace workaround for this issue
> > but the implementation in the kernel is pretty straight-forward.
> >
> > The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > unlike existing fchmodat.
> >
> > [1] 
> > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > [2] 
> > https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> >
> > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> 
> I don't know the history of why we ended up with the different
> interface, or whether this was done intentionally in the kernel
> or if we want this syscall.
> 
> Assuming this is in fact needed, I double-checked that the
> implementation looks correct to me and is portable to all the
> architectures, without the need for a compat wrapper.
> 
> Acked-by: Arnd Bergmann <arnd@arndb.de>

The system call itself is useful afaict. But please,

s/fchmodat4/fchmodat2/

With very few exceptions we don't version by argument number but by
revision and we should stick to one scheme:

openat()->openat2()
eventfd()->eventfd2()
clone()/clone2()->clone3()
dup()->dup2()->dup3() // coincides with nr of arguments
pipe()->pipe2() // coincides with nr of arguments
renameat()->renameat2()

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 5/5] selftests: add fchmodat4(2) selftest
  2023-07-11 11:25     ` [PATCH v3 5/5] selftests: add fchmodat4(2) selftest Alexey Gladkov
@ 2023-07-11 12:10       ` Florian Weimer
  2023-07-11 13:38         ` Alexey Gladkov
  0 siblings, 1 reply; 59+ messages in thread
From: Florian Weimer @ 2023-07-11 12:10 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, geert,
	glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, palmer, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

* Alexey Gladkov:

> The test marks as skipped if a syscall with the AT_SYMLINK_NOFOLLOW flag
> fails. This is because not all filesystems support changing the mode
> bits of symlinks properly. These filesystems return an error but change
> the mode bits:
>
> newfstatat(4, "regfile", {st_mode=S_IFREG|0640, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
> newfstatat(4, "symlink", {st_mode=S_IFLNK|0777, st_size=7, ...}, AT_SYMLINK_NOFOLLOW) = 0
> syscall_0x1c3(0x4, 0x55fa1f244396, 0x180, 0x100, 0x55fa1f24438e, 0x34) = -1 EOPNOTSUPP (Operation not supported)
> newfstatat(4, "regfile", {st_mode=S_IFREG|0640, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
>
> This happens with btrfs and xfs:
>
>  $ /kernel/tools/testing/selftests/fchmodat4/fchmodat4_test
>  TAP version 13
>  1..1
>  ok 1 # SKIP fchmodat4(symlink)
>  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0
>
>  $ stat /tmp/ksft-fchmodat4.*/symlink
>    File: /tmp/ksft-fchmodat4.3NCqlE/symlink -> regfile
>    Size: 7               Blocks: 0          IO Block: 4096   symbolic link
>  Device: 7,0     Inode: 133         Links: 1
>  Access: (0600/lrw-------)  Uid: (    0/    root)   Gid: (    0/    root)
>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>

This looks like a bug in those file systems?

As an extra test, “echo 3 > /proc/sys/vm/drop_caches” sometimes has
strange effects in such cases because the bits are not actually stored
on disk, only in the dentry cache.

Thanks,
Florian


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 0/5] Add a new fchmodat4() syscall
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
                       ` (4 preceding siblings ...)
  2023-07-11 11:25     ` [PATCH v3 5/5] selftests: add fchmodat4(2) selftest Alexey Gladkov
@ 2023-07-11 12:24     ` Florian Weimer
  2023-07-11 15:14       ` Christian Brauner
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
  2023-07-25 15:58     ` Add fchmodat2() - or add a more general syscall? David Howells
  7 siblings, 1 reply; 59+ messages in thread
From: Florian Weimer @ 2023-07-11 12:24 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, geert, glebfm, gor,
	hare, hpa, ink, jhogan, kim.phillips, ldv, linux-alpha,
	linux-arch, linux-arm-kernel, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paul.burton, paulus,
	peterz, ralf, rth, sparclinux, stefan, tglx, tony.luck, tycho,
	will, x86, ysato

* Alexey Gladkov:

> This patch set adds fchmodat4(), a new syscall. The actual
> implementation is super simple: essentially it's just the same as
> fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags.
> I've attempted to make this match "man 2 fchmodat" as closely as
> possible, which says EINVAL is returned for invalid flags (as opposed to
> ENOTSUPP, which is currently returned by glibc for AT_SYMLINK_NOFOLLOW).
> I have a sketch of a glibc patch that I haven't even compiled yet, but
> seems fairly straight-forward:
>
>     diff --git a/sysdeps/unix/sysv/linux/fchmodat.c b/sysdeps/unix/sysv/linux/fchmodat.c
>     index 6d9cbc1ce9e0..b1beab76d56c 100644
>     --- a/sysdeps/unix/sysv/linux/fchmodat.c
>     +++ b/sysdeps/unix/sysv/linux/fchmodat.c
>     @@ -29,12 +29,36 @@
>      int
>      fchmodat (int fd, const char *file, mode_t mode, int flag)
>      {
>     -  if (flag & ~AT_SYMLINK_NOFOLLOW)
>     -    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
>     -#ifndef __NR_lchmod		/* Linux so far has no lchmod syscall.  */
>     +  /* There are four paths through this code:
>     +      - The flags are zero.  In this case it's fine to call fchmodat.
>     +      - The flags are non-zero and glibc doesn't have access to
>     +	__NR_fchmodat4.  In this case all we can do is emulate the error codes
>     +	defined by the glibc interface from userspace.
>     +      - The flags are non-zero, glibc has __NR_fchmodat4, and the kernel has
>     +	fchmodat4.  This is the simplest case, as the fchmodat4 syscall exactly
>     +	matches glibc's library interface so it can be called directly.
>     +      - The flags are non-zero, glibc has __NR_fchmodat4, but the kernel does

If you define __NR_fchmodat4 on all architectures, we can use these
constants directly in glibc.  We no longer depend on the UAPI
definitions of those constants, to cut down the number of code variants,
and to make glibc's system call profile independent of the kernel header
version at build time.

Your version is based on 2.31, more recent versions have some reasonable
emulation for fchmodat based on /proc/self/fd.  I even wrote a comment
describing the same buggy behavior that you witnessed:

+      /* Some Linux versions with some file systems can actually
+        change symbolic link permissions via /proc, but this is not
+        intentional, and it gives inconsistent results (e.g., error
+        return despite mode change).  The expected behavior is that
+        symbolic link modes cannot be changed at all, and this check
+        enforces that.  */
+      if (S_ISLNK (st.st_mode))
+       {
+         __close_nocancel (pathfd);
+         __set_errno (EOPNOTSUPP);
+         return -1;
+       }

I think there was some kernel discussion about that behavior before, but
apparently, it hasn't led to fixes.

I wonder if it makes sense to add a similar error return to the system
call implementation?

>     +	not.  In this case we must respect the error codes defined by the glibc
>     +	interface instead of returning ENOSYS.
>     +    The intent here is to ensure that the kernel is called at most once per
>     +    library call, and that the error types defined by glibc are always
>     +    respected.  */
>     +
>     +#ifdef __NR_fchmodat4
>     +  long result;
>     +#endif
>     +
>     +  if (flag == 0)
>     +    return INLINE_SYSCALL (fchmodat, 3, fd, file, mode);
>     +
>     +#ifdef __NR_fchmodat4
>     +  result = INLINE_SYSCALL (fchmodat4, 4, fd, file, mode, flag);
>     +  if (result == 0 || errno != ENOSYS)
>     +    return result;
>     +#endif

The last if condition is the recommended approach, but in the past, it
broke container host compatibility pretty badly due to seccomp filters
that return EPERM instead of ENOSYS.  I guess we'll learn soon enough if
that's been fixed by now. 8-P

Thanks,
Florian


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 2/5] fs: Add fchmodat4()
  2023-07-11 11:25     ` [PATCH v3 2/5] fs: Add fchmodat4() Alexey Gladkov
  2023-07-11 11:42       ` Arnd Bergmann
@ 2023-07-11 12:28       ` Matthew Wilcox
  2023-07-11 12:49         ` Alexey Gladkov
  1 sibling, 1 reply; 59+ messages in thread
From: Matthew Wilcox @ 2023-07-11 12:28 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, fweimer,
	geert, glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

On Tue, Jul 11, 2023 at 01:25:43PM +0200, Alexey Gladkov wrote:
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat4(int dfd, const char __user *filename, umode_t mode, int lookup_flags)

This function can still be called do_fchmodat(); we don't need to
version internal functions.


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 2/5] fs: Add fchmodat4()
  2023-07-11 12:28       ` Matthew Wilcox
@ 2023-07-11 12:49         ` Alexey Gladkov
  0 siblings, 0 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 12:49 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, fweimer,
	geert, glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

On Tue, Jul 11, 2023 at 01:28:04PM +0100, Matthew Wilcox wrote:
> On Tue, Jul 11, 2023 at 01:25:43PM +0200, Alexey Gladkov wrote:
> > -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > +static int do_fchmodat4(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
> 
> This function can still be called do_fchmodat(); we don't need to
> version internal functions.

Yes. I tried not to change too much when adopting a patch. In the new
version, I will return the old name. Thanks.

-- 
Rgrds, legion


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 2/5] fs: Add fchmodat4()
  2023-07-11 11:52         ` Christian Brauner
@ 2023-07-11 12:51           ` Alexey Gladkov
  2023-07-11 14:01             ` Christian Brauner
  0 siblings, 1 reply; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 12:51 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Arnd Bergmann, LKML, linux-api, linux-fsdevel, Alexander Viro,
	Palmer Dabbelt, James E . J . Bottomley,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jens Axboe,
	Benjamin Herrenschmidt, Christian Borntraeger, Borislav Petkov,
	Catalin Marinas, christian, Rich Felker, David S . Miller,
	Deepa Dinamani, Helge Deller, David Howells, fenghua.yu,
	firoz.khan, Florian Weimer, Geert Uytterhoeven, glebfm, gor,
	hare, heiko.carstens, H. Peter Anvin, Ivan Kokshaysky, jhogan,
	Kim Phillips, ldv, linux-alpha, Linux-Arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, Russell King, linuxppc-dev, Andy Lutomirski,
	Matt Turner, Ingo Molnar, Michal Simek, Michael Ellerman,
	Namhyung Kim, paul.burton, Paul Mackerras, Peter Zijlstra, ralf,
	rth, schwidefsky, sparclinux, stefan, Thomas Gleixner, Tony Luck,
	tycho, Will Deacon, x86, Yoshinori Sato

On Tue, Jul 11, 2023 at 01:52:01PM +0200, Christian Brauner wrote:
> On Tue, Jul 11, 2023 at 01:42:19PM +0200, Arnd Bergmann wrote:
> > On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> > > From: Palmer Dabbelt <palmer@sifive.com>
> > >
> > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > function which implements the POSIX-specified interface. This
> > > interface differs from the underlying kernel system call, which does not
> > > have a flags argument. Most implementations require procfs [1][2].
> > >
> > > There doesn't appear to be a good userspace workaround for this issue
> > > but the implementation in the kernel is pretty straight-forward.
> > >
> > > The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > unlike existing fchmodat.
> > >
> > > [1] 
> > > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > [2] 
> > > https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > >
> > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > 
> > I don't know the history of why we ended up with the different
> > interface, or whether this was done intentionally in the kernel
> > or if we want this syscall.
> > 
> > Assuming this is in fact needed, I double-checked that the
> > implementation looks correct to me and is portable to all the
> > architectures, without the need for a compat wrapper.
> > 
> > Acked-by: Arnd Bergmann <arnd@arndb.de>
> 
> The system call itself is useful afaict. But please,
> 
> s/fchmodat4/fchmodat2/

Sure. I will.

> With very few exceptions we don't version by argument number but by
> revision and we should stick to one scheme:
> 
> openat()->openat2()
> eventfd()->eventfd2()
> clone()/clone2()->clone3()
> dup()->dup2()->dup3() // coincides with nr of arguments
> pipe()->pipe2() // coincides with nr of arguments
> renameat()->renameat2()
> 

-- 
Rgrds, legion


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 5/5] selftests: add fchmodat4(2) selftest
  2023-07-11 12:10       ` Florian Weimer
@ 2023-07-11 13:38         ` Alexey Gladkov
  0 siblings, 0 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 13:38 UTC (permalink / raw)
  To: Florian Weimer
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, firoz.khan, geert,
	glebfm, gor, hare, heiko.carstens, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, palmer, paul.burton, paulus, peterz, ralf, rth,
	schwidefsky, sparclinux, stefan, tglx, tony.luck, tycho, will,
	x86, ysato

On Tue, Jul 11, 2023 at 02:10:58PM +0200, Florian Weimer wrote:
> * Alexey Gladkov:
> 
> > The test marks as skipped if a syscall with the AT_SYMLINK_NOFOLLOW flag
> > fails. This is because not all filesystems support changing the mode
> > bits of symlinks properly. These filesystems return an error but change
> > the mode bits:
> >
> > newfstatat(4, "regfile", {st_mode=S_IFREG|0640, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
> > newfstatat(4, "symlink", {st_mode=S_IFLNK|0777, st_size=7, ...}, AT_SYMLINK_NOFOLLOW) = 0
> > syscall_0x1c3(0x4, 0x55fa1f244396, 0x180, 0x100, 0x55fa1f24438e, 0x34) = -1 EOPNOTSUPP (Operation not supported)
> > newfstatat(4, "regfile", {st_mode=S_IFREG|0640, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
> >
> > This happens with btrfs and xfs:
> >
> >  $ /kernel/tools/testing/selftests/fchmodat4/fchmodat4_test
> >  TAP version 13
> >  1..1
> >  ok 1 # SKIP fchmodat4(symlink)
> >  # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0
> >
> >  $ stat /tmp/ksft-fchmodat4.*/symlink
> >    File: /tmp/ksft-fchmodat4.3NCqlE/symlink -> regfile
> >    Size: 7               Blocks: 0          IO Block: 4096   symbolic link
> >  Device: 7,0     Inode: 133         Links: 1
> >  Access: (0600/lrw-------)  Uid: (    0/    root)   Gid: (    0/    root)
> >
> > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> 
> This looks like a bug in those file systems?

To me this looks like a bug. I'm fine if the operation ends with
EOPNOTSUPP, but in that case the mode bits shouldn't change.

> As an extra test, “echo 3 > /proc/sys/vm/drop_caches” sometimes has
> strange effects in such cases because the bits are not actually stored
> on disk, only in the dentry cache.

tmpfs
syscall_0x1c3(0xffffff9c, 0x7ffd58758574, 0, 0x100, 0x7f6cf18adc70, 0x7ffd58756ad8) = 0
+++ exited with 0 +++
l--------- 1 root root 1 Jul 11 16:36 /tmp/dir/link -> f
=== dropping caches ===
l--------- 1 root root 1 Jul 11 16:36 /tmp/dir/link -> f

ext4
syscall_0x1c3(0xffffff9c, 0x7ffedfdb4574, 0, 0x100, 0x7f7f40b45c70, 0x7ffedfdb3ae8) = -1 EOPNOTSUPP (Operation not supported)
+++ exited with 1 +++
l--------- 1 root root 1 Jul 11 16:36 /tmp/dir/link -> f
=== dropping caches ===
l--------- 1 root root 1 Jul 11 16:36 /tmp/dir/link -> f

xfs
syscall_0x1c3(0xffffff9c, 0x7ffcd03ce574, 0, 0x100, 0x7ff2f2980c70, 0x7ffcd03cdd38) = -1 EOPNOTSUPP (Operation not supported)
+++ exited with 1 +++
l--------- 1 root root 1 Jul 11 16:36 /tmp/dir/link -> f
=== dropping caches ===
l--------- 1 root root 1 Jul 11 16:36 /tmp/dir/link -> f

btrfs
syscall_0x1c3(0xffffff9c, 0x7fff13d2e574, 0, 0x100, 0x7f9b67f59c70, 0x7fff13d2ca88) = -1 EOPNOTSUPP (Operation not supported)
+++ exited with 1 +++
l--------- 1 root root 1 Jul 11 16:36 /tmp/dir/link -> f
=== dropping caches ===
l--------- 1 root root 1 Jul 11 16:36 /tmp/dir/link -> f

reiserfs
syscall_0x1c3(0xffffff9c, 0x7ffdf75af574, 0, 0x100, 0x7f7ad0634c70, 0x7ffdf75ae478) = 0
+++ exited with 0 +++
l--------- 1 root root 1 Jul 11 16:43 /tmp/dir/link -> f
=== dropping caches ===
l--------- 1 root root 1 Jul 11 16:43 /tmp/dir/link -> f

-- 
Rgrds, legion


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 2/5] fs: Add fchmodat4()
  2023-07-11 12:51           ` Alexey Gladkov
@ 2023-07-11 14:01             ` Christian Brauner
  2023-07-11 15:23               ` Alexey Gladkov
  0 siblings, 1 reply; 59+ messages in thread
From: Christian Brauner @ 2023-07-11 14:01 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: Arnd Bergmann, LKML, linux-api, linux-fsdevel, Alexander Viro,
	Palmer Dabbelt, James E . J . Bottomley,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jens Axboe,
	Benjamin Herrenschmidt, Christian Borntraeger, Borislav Petkov,
	Catalin Marinas, christian, Rich Felker, David S . Miller,
	Deepa Dinamani, Helge Deller, David Howells, fenghua.yu,
	firoz.khan, Florian Weimer, Geert Uytterhoeven, glebfm, gor,
	hare, heiko.carstens, H. Peter Anvin, Ivan Kokshaysky, jhogan,
	Kim Phillips, ldv, linux-alpha, Linux-Arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, Russell King, linuxppc-dev, Andy Lutomirski,
	Matt Turner, Ingo Molnar, Michal Simek, Michael Ellerman,
	Namhyung Kim, paul.burton, Paul Mackerras, Peter Zijlstra, ralf,
	rth, schwidefsky, sparclinux, stefan, Thomas Gleixner, Tony Luck,
	tycho, Will Deacon, x86, Yoshinori Sato

On Tue, Jul 11, 2023 at 02:51:01PM +0200, Alexey Gladkov wrote:
> On Tue, Jul 11, 2023 at 01:52:01PM +0200, Christian Brauner wrote:
> > On Tue, Jul 11, 2023 at 01:42:19PM +0200, Arnd Bergmann wrote:
> > > On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> > > > From: Palmer Dabbelt <palmer@sifive.com>
> > > >
> > > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > > function which implements the POSIX-specified interface. This
> > > > interface differs from the underlying kernel system call, which does not
> > > > have a flags argument. Most implementations require procfs [1][2].
> > > >
> > > > There doesn't appear to be a good userspace workaround for this issue
> > > > but the implementation in the kernel is pretty straight-forward.
> > > >
> > > > The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > > unlike existing fchmodat.
> > > >
> > > > [1] 
> > > > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > > [2] 
> > > > https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > > >
> > > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > 
> > > I don't know the history of why we ended up with the different
> > > interface, or whether this was done intentionally in the kernel
> > > or if we want this syscall.
> > > 
> > > Assuming this is in fact needed, I double-checked that the
> > > implementation looks correct to me and is portable to all the
> > > architectures, without the need for a compat wrapper.
> > > 
> > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > 
> > The system call itself is useful afaict. But please,
> > 
> > s/fchmodat4/fchmodat2/
> 
> Sure. I will.

Thanks. Can you also wire this up for every architecture, please?
I don't see that this has been done in this series.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 0/5] Add a new fchmodat4() syscall
  2023-07-11 12:24     ` [PATCH v3 0/5] Add a new fchmodat4() syscall Florian Weimer
@ 2023-07-11 15:14       ` Christian Brauner
  2023-07-25 11:05         ` Alexey Gladkov
  0 siblings, 1 reply; 59+ messages in thread
From: Christian Brauner @ 2023-07-11 15:14 UTC (permalink / raw)
  To: Florian Weimer
  Cc: Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, geert, glebfm, gor,
	hare, hpa, ink, jhogan, kim.phillips, ldv, linux-alpha,
	linux-arch, linux-arm-kernel, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paul.burton, paulus,
	peterz, ralf, rth, sparclinux, stefan, tglx, tony.luck, tycho,
	will, x86, ysato

On Tue, Jul 11, 2023 at 02:24:51PM +0200, Florian Weimer wrote:
> * Alexey Gladkov:
> 
> > This patch set adds fchmodat4(), a new syscall. The actual
> > implementation is super simple: essentially it's just the same as
> > fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags.
> > I've attempted to make this match "man 2 fchmodat" as closely as
> > possible, which says EINVAL is returned for invalid flags (as opposed to
> > ENOTSUPP, which is currently returned by glibc for AT_SYMLINK_NOFOLLOW).
> > I have a sketch of a glibc patch that I haven't even compiled yet, but
> > seems fairly straight-forward:
> >
> >     diff --git a/sysdeps/unix/sysv/linux/fchmodat.c b/sysdeps/unix/sysv/linux/fchmodat.c
> >     index 6d9cbc1ce9e0..b1beab76d56c 100644
> >     --- a/sysdeps/unix/sysv/linux/fchmodat.c
> >     +++ b/sysdeps/unix/sysv/linux/fchmodat.c
> >     @@ -29,12 +29,36 @@
> >      int
> >      fchmodat (int fd, const char *file, mode_t mode, int flag)
> >      {
> >     -  if (flag & ~AT_SYMLINK_NOFOLLOW)
> >     -    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
> >     -#ifndef __NR_lchmod		/* Linux so far has no lchmod syscall.  */
> >     +  /* There are four paths through this code:
> >     +      - The flags are zero.  In this case it's fine to call fchmodat.
> >     +      - The flags are non-zero and glibc doesn't have access to
> >     +	__NR_fchmodat4.  In this case all we can do is emulate the error codes
> >     +	defined by the glibc interface from userspace.
> >     +      - The flags are non-zero, glibc has __NR_fchmodat4, and the kernel has
> >     +	fchmodat4.  This is the simplest case, as the fchmodat4 syscall exactly
> >     +	matches glibc's library interface so it can be called directly.
> >     +      - The flags are non-zero, glibc has __NR_fchmodat4, but the kernel does
> 
> If you define __NR_fchmodat4 on all architectures, we can use these
> constants directly in glibc.  We no longer depend on the UAPI
> definitions of those constants, to cut down the number of code variants,
> and to make glibc's system call profile independent of the kernel header
> version at build time.
> 
> Your version is based on 2.31, more recent versions have some reasonable
> emulation for fchmodat based on /proc/self/fd.  I even wrote a comment
> describing the same buggy behavior that you witnessed:
> 
> +      /* Some Linux versions with some file systems can actually
> +        change symbolic link permissions via /proc, but this is not
> +        intentional, and it gives inconsistent results (e.g., error
> +        return despite mode change).  The expected behavior is that
> +        symbolic link modes cannot be changed at all, and this check
> +        enforces that.  */
> +      if (S_ISLNK (st.st_mode))
> +       {
> +         __close_nocancel (pathfd);
> +         __set_errno (EOPNOTSUPP);
> +         return -1;
> +       }
> 
> I think there was some kernel discussion about that behavior before, but
> apparently, it hasn't led to fixes.

I think I've explained this somewhere else a couple of months ago but
just in case you weren't on that thread or don't remember and apologies
if you should already know.

A lot of filesystem will happily update the mode of a symlink. The VFS
doesn't do anything to prevent this from happening. This is filesystem
specific.

The EOPNOTSUPP you're seeing very likely comes from POSIX ACLs.
Specifically it comes from filesystems that call posix_acl_chmod(),
e.g., btrfs via

        if (!err && attr->ia_valid & ATTR_MODE)
                err = posix_acl_chmod(idmap, dentry, inode->i_mode);

Most filesystems don't implement i_op->set_acl() for POSIX ACLs.
So posix_acl_chmod() will report EOPNOTSUPP. By the time
posix_acl_chmod() is called, most filesystems will have finished
updating the inode. POSIX ACLs also often aren't integrated into
transactions so a rollback wouldn't even be possible on some
filesystems.

Any filesystem that doesn't implement POSIX ACLs at all will obviously
never fail unless it blocks mode changes on symlinks. Or filesystems
that do have a way to rollback failures from posix_acl_chmod(), or
filesystems that do return an error on chmod() on symlinks such as 9p,
ntfs, ocfs2.

> 
> I wonder if it makes sense to add a similar error return to the system
> call implementation?

Hm, blocking symlink mode changes is pretty regression prone. And just
blocking it through one interface seems weird and makes things even more
inconsistent.

So two options I see:
(1) minimally invasive:
    Filesystems that do call posix_acl_chmod() on symlinks need to be
    changed to stop doing that.
(2) might hit us on the head invasive:
    Try and block symlink mode changes in chmod_common().

Thoughts?

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 2/5] fs: Add fchmodat4()
  2023-07-11 14:01             ` Christian Brauner
@ 2023-07-11 15:23               ` Alexey Gladkov
  0 siblings, 0 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 15:23 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Arnd Bergmann, LKML, linux-api, linux-fsdevel, Alexander Viro,
	Palmer Dabbelt, James E . J . Bottomley,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jens Axboe,
	Benjamin Herrenschmidt, Christian Borntraeger, Borislav Petkov,
	Catalin Marinas, christian, Rich Felker, David S . Miller,
	Deepa Dinamani, Helge Deller, David Howells, fenghua.yu,
	firoz.khan, Florian Weimer, Geert Uytterhoeven, glebfm, gor,
	hare, heiko.carstens, H. Peter Anvin, Ivan Kokshaysky, jhogan,
	Kim Phillips, ldv, linux-alpha, Linux-Arch, linux-arm-kernel,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, Russell King, linuxppc-dev, Andy Lutomirski,
	Matt Turner, Ingo Molnar, Michal Simek, Michael Ellerman,
	Namhyung Kim, paul.burton, Paul Mackerras, Peter Zijlstra, ralf,
	rth, schwidefsky, sparclinux, stefan, Thomas Gleixner, Tony Luck,
	tycho, Will Deacon, x86, Yoshinori Sato

On Tue, Jul 11, 2023 at 04:01:03PM +0200, Christian Brauner wrote:
> On Tue, Jul 11, 2023 at 02:51:01PM +0200, Alexey Gladkov wrote:
> > On Tue, Jul 11, 2023 at 01:52:01PM +0200, Christian Brauner wrote:
> > > On Tue, Jul 11, 2023 at 01:42:19PM +0200, Arnd Bergmann wrote:
> > > > On Tue, Jul 11, 2023, at 13:25, Alexey Gladkov wrote:
> > > > > From: Palmer Dabbelt <palmer@sifive.com>
> > > > >
> > > > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > > > function which implements the POSIX-specified interface. This
> > > > > interface differs from the underlying kernel system call, which does not
> > > > > have a flags argument. Most implementations require procfs [1][2].
> > > > >
> > > > > There doesn't appear to be a good userspace workaround for this issue
> > > > > but the implementation in the kernel is pretty straight-forward.
> > > > >
> > > > > The new fchmodat4() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > > > unlike existing fchmodat.
> > > > >
> > > > > [1] 
> > > > > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > > > [2] 
> > > > > https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > > > >
> > > > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > 
> > > > I don't know the history of why we ended up with the different
> > > > interface, or whether this was done intentionally in the kernel
> > > > or if we want this syscall.
> > > > 
> > > > Assuming this is in fact needed, I double-checked that the
> > > > implementation looks correct to me and is portable to all the
> > > > architectures, without the need for a compat wrapper.
> > > > 
> > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > 
> > > The system call itself is useful afaict. But please,
> > > 
> > > s/fchmodat4/fchmodat2/
> > 
> > Sure. I will.
> 
> Thanks. Can you also wire this up for every architecture, please?
> I don't see that this has been done in this series.

Sure. I have already added in all architectures as far as I can tell:

$ diff -s <(find arch/ -name '*.tbl' |sort -u) <(git grep -lw fchmodat2 arch/ |sort -u)
Files /dev/fd/63 and /dev/fd/62 are identical

-- 
Rgrds, legion


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v4 0/5] Add a new fchmodat2() syscall
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
                       ` (5 preceding siblings ...)
  2023-07-11 12:24     ` [PATCH v3 0/5] Add a new fchmodat4() syscall Florian Weimer
@ 2023-07-11 16:16     ` Alexey Gladkov
  2023-07-11 16:16       ` [PATCH v4 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
                         ` (6 more replies)
  2023-07-25 15:58     ` Add fchmodat2() - or add a more general syscall? David Howells
  7 siblings, 7 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 16:16 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato

In glibc, the fchmodat(3) function has a flags argument according to the
POSIX specification [1], but kernel syscalls has no such argument.
Therefore, libc implementations do workarounds using /proc. However,
this requires procfs to be mounted and accessible.

This patch set adds fchmodat2(), a new syscall. The syscall allows to
pass the AT_SYMLINK_NOFOLLOW flag to disable LOOKUP_FOLLOW. In all other
respects, this syscall is no different from fchmodat().

[1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/chmod.html

Changes since v3 [cover.1689074739.git.legion@kernel.org]:

* Rebased to master because a new syscall has appeared in master.
* Increased __NR_compat_syscalls as pointed out by Arnd Bergmann.
* Syscall renamed fchmodat4 -> fchmodat2 as suggested by Christian Brauner.
* Returned do_fchmodat4() the original name. We don't need to version
  internal functions.
* Fixed warnings found by checkpatch.pl.

Changes since v2 [20190717012719.5524-1-palmer@sifive.com]:

* Rebased to master.
* The lookup_flags passed to sys_fchmodat4 as suggested by Al Viro.
* Selftest added.

Changes since v1 [20190531191204.4044-1-palmer@sifive.com]:

* All architectures are now supported, which support squashed into a
  single patch.
* The do_fchmodat() helper function has been removed, in favor of directly
  calling do_fchmodat4().
* The patches are based on 5.2 instead of 5.1.

---

Alexey Gladkov (2):
  fs: Add fchmodat2()
  selftests: Add fchmodat2 selftest

Palmer Dabbelt (3):
  Non-functional cleanup of a "__user * filename"
  arch: Register fchmodat2, usually as syscall 452
  tools headers UAPI: Sync files changed by new fchmodat2 syscall

 arch/alpha/kernel/syscalls/syscall.tbl        |   1 +
 arch/arm/tools/syscall.tbl                    |   1 +
 arch/arm64/include/asm/unistd.h               |   2 +-
 arch/arm64/include/asm/unistd32.h             |   2 +
 arch/ia64/kernel/syscalls/syscall.tbl         |   1 +
 arch/m68k/kernel/syscalls/syscall.tbl         |   1 +
 arch/microblaze/kernel/syscalls/syscall.tbl   |   1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl     |   1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl     |   1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl     |   1 +
 arch/parisc/kernel/syscalls/syscall.tbl       |   1 +
 arch/powerpc/kernel/syscalls/syscall.tbl      |   1 +
 arch/s390/kernel/syscalls/syscall.tbl         |   1 +
 arch/sh/kernel/syscalls/syscall.tbl           |   1 +
 arch/sparc/kernel/syscalls/syscall.tbl        |   1 +
 arch/x86/entry/syscalls/syscall_32.tbl        |   1 +
 arch/x86/entry/syscalls/syscall_64.tbl        |   1 +
 arch/xtensa/kernel/syscalls/syscall.tbl       |   1 +
 fs/open.c                                     |  18 +-
 include/linux/syscalls.h                      |   4 +-
 include/uapi/asm-generic/unistd.h             |   5 +-
 tools/include/uapi/asm-generic/unistd.h       |   5 +-
 .../arch/mips/entry/syscalls/syscall_n64.tbl  |   2 +
 .../arch/powerpc/entry/syscalls/syscall.tbl   |   2 +
 .../perf/arch/s390/entry/syscalls/syscall.tbl |   2 +
 .../arch/x86/entry/syscalls/syscall_64.tbl    |   2 +
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/fchmodat2/.gitignore  |   2 +
 tools/testing/selftests/fchmodat2/Makefile    |   6 +
 .../selftests/fchmodat2/fchmodat2_test.c      | 162 ++++++++++++++++++
 30 files changed, 223 insertions(+), 8 deletions(-)
 create mode 100644 tools/testing/selftests/fchmodat2/.gitignore
 create mode 100644 tools/testing/selftests/fchmodat2/Makefile
 create mode 100644 tools/testing/selftests/fchmodat2/fchmodat2_test.c

-- 
2.33.8


^ permalink raw reply	[flat|nested] 59+ messages in thread

* [PATCH v4 1/5] Non-functional cleanup of a "__user * filename"
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
@ 2023-07-11 16:16       ` Alexey Gladkov
  2023-07-11 16:16       ` [PATCH v4 2/5] fs: Add fchmodat2() Alexey Gladkov
                         ` (5 subsequent siblings)
  6 siblings, 0 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 16:16 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato

From: Palmer Dabbelt <palmer@sifive.com>

The next patch defines a very similar interface, which I copied from
this definition.  Since I'm touching it anyway I don't see any reason
not to just go fix this one up.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
---
 include/linux/syscalls.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 03e3d0121d5e..584f404bf868 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -438,7 +438,7 @@ asmlinkage long sys_chdir(const char __user *filename);
 asmlinkage long sys_fchdir(unsigned int fd);
 asmlinkage long sys_chroot(const char __user *filename);
 asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
-asmlinkage long sys_fchmodat(int dfd, const char __user * filename,
+asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
 			     umode_t mode);
 asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
 			     gid_t group, int flag);
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
  2023-07-11 16:16       ` [PATCH v4 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
@ 2023-07-11 16:16       ` Alexey Gladkov
  2023-07-11 17:05         ` Christian Brauner
  2023-07-25 16:36         ` Aleksa Sarai
  2023-07-11 16:16       ` [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452 Alexey Gladkov
                         ` (4 subsequent siblings)
  6 siblings, 2 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 16:16 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

On the userspace side fchmodat(3) is implemented as a wrapper
function which implements the POSIX-specified interface. This
interface differs from the underlying kernel system call, which does not
have a flags argument. Most implementations require procfs [1][2].

There doesn't appear to be a good userspace workaround for this issue
but the implementation in the kernel is pretty straight-forward.

The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
unlike existing fchmodat.

[1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
[2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28

Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
---
 fs/open.c                | 18 ++++++++++++++----
 include/linux/syscalls.h |  2 ++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/fs/open.c b/fs/open.c
index 0c55c8e7f837..39a7939f0d00 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
 	return err;
 }
 
-static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
+static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
 {
 	struct path path;
 	int error;
-	unsigned int lookup_flags = LOOKUP_FOLLOW;
+
 retry:
 	error = user_path_at(dfd, filename, lookup_flags, &path);
 	if (!error) {
@@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
 	return error;
 }
 
+SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
+		umode_t, mode, int, flags)
+{
+	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
+		return -EINVAL;
+
+	return do_fchmodat(dfd, filename, mode,
+			flags & AT_SYMLINK_NOFOLLOW ? 0 : LOOKUP_FOLLOW);
+}
+
 SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename,
 		umode_t, mode)
 {
-	return do_fchmodat(dfd, filename, mode);
+	return do_fchmodat(dfd, filename, mode, LOOKUP_FOLLOW);
 }
 
 SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 {
-	return do_fchmodat(AT_FDCWD, filename, mode);
+	return do_fchmodat(AT_FDCWD, filename, mode, LOOKUP_FOLLOW);
 }
 
 /*
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index 584f404bf868..6e852279fbc3 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -440,6 +440,8 @@ asmlinkage long sys_chroot(const char __user *filename);
 asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
 asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
 			     umode_t mode);
+asmlinkage long sys_fchmodat2(int dfd, const char __user *filename,
+			     umode_t mode, int flags);
 asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
 			     gid_t group, int flag);
 asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
  2023-07-11 16:16       ` [PATCH v4 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
  2023-07-11 16:16       ` [PATCH v4 2/5] fs: Add fchmodat2() Alexey Gladkov
@ 2023-07-11 16:16       ` Alexey Gladkov
  2023-07-11 16:26         ` Arnd Bergmann
                           ` (2 more replies)
  2023-07-11 16:16       ` [PATCH v4 4/5] tools headers UAPI: Sync files changed by new fchmodat2 syscall Alexey Gladkov
                         ` (3 subsequent siblings)
  6 siblings, 3 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 16:16 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato

From: Palmer Dabbelt <palmer@sifive.com>

This registers the new fchmodat2 syscall in most places as nuber 452,
with alpha being the exception where it's 562.  I found all these sites
by grepping for fspick, which I assume has found me everything.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 arch/alpha/kernel/syscalls/syscall.tbl      | 1 +
 arch/arm/tools/syscall.tbl                  | 1 +
 arch/arm64/include/asm/unistd.h             | 2 +-
 arch/arm64/include/asm/unistd32.h           | 2 ++
 arch/ia64/kernel/syscalls/syscall.tbl       | 1 +
 arch/m68k/kernel/syscalls/syscall.tbl       | 1 +
 arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
 arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
 arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
 arch/parisc/kernel/syscalls/syscall.tbl     | 1 +
 arch/powerpc/kernel/syscalls/syscall.tbl    | 1 +
 arch/s390/kernel/syscalls/syscall.tbl       | 1 +
 arch/sh/kernel/syscalls/syscall.tbl         | 1 +
 arch/sparc/kernel/syscalls/syscall.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_32.tbl      | 1 +
 arch/x86/entry/syscalls/syscall_64.tbl      | 1 +
 arch/xtensa/kernel/syscalls/syscall.tbl     | 1 +
 include/uapi/asm-generic/unistd.h           | 5 ++++-
 19 files changed, 23 insertions(+), 2 deletions(-)

diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
index 1f13995d00d7..ad37569d0507 100644
--- a/arch/alpha/kernel/syscalls/syscall.tbl
+++ b/arch/alpha/kernel/syscalls/syscall.tbl
@@ -491,3 +491,4 @@
 559	common  futex_waitv                     sys_futex_waitv
 560	common	set_mempolicy_home_node		sys_ni_syscall
 561	common	cachestat			sys_cachestat
+562	common	fchmodat2			sys_fchmodat2
diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
index 8ebed8a13874..c572d6c3dee0 100644
--- a/arch/arm/tools/syscall.tbl
+++ b/arch/arm/tools/syscall.tbl
@@ -465,3 +465,4 @@
 449	common	futex_waitv			sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
index 64a514f90131..bd77253b62e0 100644
--- a/arch/arm64/include/asm/unistd.h
+++ b/arch/arm64/include/asm/unistd.h
@@ -39,7 +39,7 @@
 #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
 #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
 
-#define __NR_compat_syscalls		452
+#define __NR_compat_syscalls		453
 #endif
 
 #define __ARCH_WANT_SYS_CLONE
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
index d952a28463e0..78b68311ec81 100644
--- a/arch/arm64/include/asm/unistd32.h
+++ b/arch/arm64/include/asm/unistd32.h
@@ -909,6 +909,8 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
 __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
 #define __NR_cachestat 451
 __SYSCALL(__NR_cachestat, sys_cachestat)
+#define __NR_fchmodat2 452
+__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
 
 /*
  * Please add new compat syscalls above this comment and update
diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
index f8c74ffeeefb..83d8609aec03 100644
--- a/arch/ia64/kernel/syscalls/syscall.tbl
+++ b/arch/ia64/kernel/syscalls/syscall.tbl
@@ -372,3 +372,4 @@
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
index 4f504783371f..259ceb125367 100644
--- a/arch/m68k/kernel/syscalls/syscall.tbl
+++ b/arch/m68k/kernel/syscalls/syscall.tbl
@@ -451,3 +451,4 @@
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
index 858d22bf275c..a3798c2637fd 100644
--- a/arch/microblaze/kernel/syscalls/syscall.tbl
+++ b/arch/microblaze/kernel/syscalls/syscall.tbl
@@ -457,3 +457,4 @@
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
index 1976317d4e8b..152034b8e0a0 100644
--- a/arch/mips/kernel/syscalls/syscall_n32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
@@ -390,3 +390,4 @@
 449	n32	futex_waitv			sys_futex_waitv
 450	n32	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	n32	cachestat			sys_cachestat
+452	n32	fchmodat2			sys_fchmodat2
diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
index cfda2511badf..cb5e757f6621 100644
--- a/arch/mips/kernel/syscalls/syscall_n64.tbl
+++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
@@ -366,3 +366,4 @@
 449	n64	futex_waitv			sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	n64	cachestat			sys_cachestat
+452	n64	fchmodat2			sys_fchmodat2
diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
index 7692234c3768..1a646813afdc 100644
--- a/arch/mips/kernel/syscalls/syscall_o32.tbl
+++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
@@ -439,3 +439,4 @@
 449	o32	futex_waitv			sys_futex_waitv
 450	o32	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	o32	cachestat			sys_cachestat
+452	o32	fchmodat2			sys_fchmodat2
diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
index a0a9145b6dd4..e97c175b56f9 100644
--- a/arch/parisc/kernel/syscalls/syscall.tbl
+++ b/arch/parisc/kernel/syscalls/syscall.tbl
@@ -450,3 +450,4 @@
 449	common	futex_waitv			sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 8c0b08b7a80e..20e50586e8a2 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -538,3 +538,4 @@
 449	common  futex_waitv                     sys_futex_waitv
 450 	nospu	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
index a6935af2235c..0122cc156952 100644
--- a/arch/s390/kernel/syscalls/syscall.tbl
+++ b/arch/s390/kernel/syscalls/syscall.tbl
@@ -454,3 +454,4 @@
 449  common	futex_waitv		sys_futex_waitv			sys_futex_waitv
 450  common	set_mempolicy_home_node	sys_set_mempolicy_home_node	sys_set_mempolicy_home_node
 451  common	cachestat		sys_cachestat			sys_cachestat
+452  common	fchmodat2		sys_fchmodat2			sys_fchmodat2
diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
index 97377e8c5025..e90d585c4d3e 100644
--- a/arch/sh/kernel/syscalls/syscall.tbl
+++ b/arch/sh/kernel/syscalls/syscall.tbl
@@ -454,3 +454,4 @@
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
index faa835f3c54a..4ed06c71c43f 100644
--- a/arch/sparc/kernel/syscalls/syscall.tbl
+++ b/arch/sparc/kernel/syscalls/syscall.tbl
@@ -497,3 +497,4 @@
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
index bc0a3c941b35..2d0b1bd866ea 100644
--- a/arch/x86/entry/syscalls/syscall_32.tbl
+++ b/arch/x86/entry/syscalls/syscall_32.tbl
@@ -456,3 +456,4 @@
 449	i386	futex_waitv		sys_futex_waitv
 450	i386	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	i386	cachestat		sys_cachestat
+452	i386	fchmodat2		sys_fchmodat2
diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
index 227538b0ce80..814768249eae 100644
--- a/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/arch/x86/entry/syscalls/syscall_64.tbl
@@ -373,6 +373,7 @@
 449	common	futex_waitv		sys_futex_waitv
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
 451	common	cachestat		sys_cachestat
+452	common	fchmodat2		sys_fchmodat2
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
index 2b69c3c035b6..fc1a4f3c81d9 100644
--- a/arch/xtensa/kernel/syscalls/syscall.tbl
+++ b/arch/xtensa/kernel/syscalls/syscall.tbl
@@ -422,3 +422,4 @@
 449	common  futex_waitv                     sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
 451	common	cachestat			sys_cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
index fd6c1cb585db..abe087c53b4b 100644
--- a/include/uapi/asm-generic/unistd.h
+++ b/include/uapi/asm-generic/unistd.h
@@ -820,8 +820,11 @@ __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
 #define __NR_cachestat 451
 __SYSCALL(__NR_cachestat, sys_cachestat)
 
+#define __NR_fchmodat2 452
+__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
+
 #undef __NR_syscalls
-#define __NR_syscalls 452
+#define __NR_syscalls 453
 
 /*
  * 32 bit systems traditionally used different
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 4/5] tools headers UAPI: Sync files changed by new fchmodat2 syscall
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
                         ` (2 preceding siblings ...)
  2023-07-11 16:16       ` [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452 Alexey Gladkov
@ 2023-07-11 16:16       ` Alexey Gladkov
  2023-07-11 17:19         ` Namhyung Kim
  2023-07-11 16:16       ` [PATCH v4 5/5] selftests: Add fchmodat2 selftest Alexey Gladkov
                         ` (2 subsequent siblings)
  6 siblings, 1 reply; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 16:16 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato

From: Palmer Dabbelt <palmer@sifive.com>

That add support for this new syscall in tools such as 'perf trace'.

Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 tools/include/uapi/asm-generic/unistd.h             | 5 ++++-
 tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl | 2 ++
 tools/perf/arch/powerpc/entry/syscalls/syscall.tbl  | 2 ++
 tools/perf/arch/s390/entry/syscalls/syscall.tbl     | 2 ++
 tools/perf/arch/x86/entry/syscalls/syscall_64.tbl   | 2 ++
 5 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
index dd7d8e10f16d..76b5922b0d39 100644
--- a/tools/include/uapi/asm-generic/unistd.h
+++ b/tools/include/uapi/asm-generic/unistd.h
@@ -817,8 +817,11 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
 #define __NR_set_mempolicy_home_node 450
 __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
 
+#define __NR_fchmodat2 452
+__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
+
 #undef __NR_syscalls
-#define __NR_syscalls 451
+#define __NR_syscalls 453
 
 /*
  * 32 bit systems traditionally used different
diff --git a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
index 3f1886ad9d80..434728af4eaa 100644
--- a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
+++ b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
@@ -365,3 +365,5 @@
 448	n64	process_mrelease		sys_process_mrelease
 449	n64	futex_waitv			sys_futex_waitv
 450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
+# 451 reserved for cachestat
+452	n64	fchmodat2			sys_fchmodat2
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
index a0be127475b1..6b70b6705bd7 100644
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -537,3 +537,5 @@
 448	common	process_mrelease		sys_process_mrelease
 449	common  futex_waitv                     sys_futex_waitv
 450 	nospu	set_mempolicy_home_node		sys_set_mempolicy_home_node
+# 451 reserved for cachestat
+452	common	fchmodat2			sys_fchmodat2
diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
index b68f47541169..0ed90c9535b0 100644
--- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
@@ -453,3 +453,5 @@
 448  common	process_mrelease	sys_process_mrelease		sys_process_mrelease
 449  common	futex_waitv		sys_futex_waitv			sys_futex_waitv
 450  common	set_mempolicy_home_node	sys_set_mempolicy_home_node	sys_set_mempolicy_home_node
+# 451 reserved for cachestat
+452  common	fchmodat2		sys_fchmodat2			sys_fchmodat2
diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
index c84d12608cd2..a008724a1f48 100644
--- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
+++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
@@ -372,6 +372,8 @@
 448	common	process_mrelease	sys_process_mrelease
 449	common	futex_waitv		sys_futex_waitv
 450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
+# 451 reserved for cachestat
+452	common	fchmodat2		sys_fchmodat2
 
 #
 # Due to a historical design error, certain syscalls are numbered differently
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* [PATCH v4 5/5] selftests: Add fchmodat2 selftest
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
                         ` (3 preceding siblings ...)
  2023-07-11 16:16       ` [PATCH v4 4/5] tools headers UAPI: Sync files changed by new fchmodat2 syscall Alexey Gladkov
@ 2023-07-11 16:16       ` Alexey Gladkov
  2023-07-11 17:36       ` (subset) [PATCH v4 0/5] Add a new fchmodat2() syscall Christian Brauner
  2023-07-12  2:42       ` Rich Felker
  6 siblings, 0 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 16:16 UTC (permalink / raw)
  To: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro
  Cc: James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato

The test marks as skipped if a syscall with the AT_SYMLINK_NOFOLLOW flag
fails. This is because not all filesystems support changing the mode
bits of symlinks properly. These filesystems return an error but change
the mode bits:

newfstatat(4, "regfile", {st_mode=S_IFREG|0640, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(4, "symlink", {st_mode=S_IFLNK|0777, st_size=7, ...}, AT_SYMLINK_NOFOLLOW) = 0
syscall_0x1c3(0x4, 0x55fa1f244396, 0x180, 0x100, 0x55fa1f24438e, 0x34) = -1 EOPNOTSUPP (Operation not supported)
newfstatat(4, "regfile", {st_mode=S_IFREG|0640, st_size=0, ...}, AT_SYMLINK_NOFOLLOW) = 0

This happens with btrfs and xfs:

 $ tools/testing/selftests/fchmodat2/fchmodat2_test
 TAP version 13
 1..1
 ok 1 # SKIP fchmodat2(symlink)
 # Totals: pass:0 fail:0 xfail:0 xpass:0 skip:1 error:0

 $ stat /tmp/ksft-fchmodat2.*/symlink
   File: /tmp/ksft-fchmodat2.3NCqlE/symlink -> regfile
   Size: 7               Blocks: 0          IO Block: 4096   symbolic link
 Device: 7,0     Inode: 133         Links: 1
 Access: (0600/lrw-------)  Uid: (    0/    root)   Gid: (    0/    root)

Signed-off-by: Alexey Gladkov <legion@kernel.org>
---
 tools/testing/selftests/Makefile              |   1 +
 tools/testing/selftests/fchmodat2/.gitignore  |   2 +
 tools/testing/selftests/fchmodat2/Makefile    |   6 +
 .../selftests/fchmodat2/fchmodat2_test.c      | 162 ++++++++++++++++++
 4 files changed, 171 insertions(+)
 create mode 100644 tools/testing/selftests/fchmodat2/.gitignore
 create mode 100644 tools/testing/selftests/fchmodat2/Makefile
 create mode 100644 tools/testing/selftests/fchmodat2/fchmodat2_test.c

diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 666b56f22a41..8dca8acdb671 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -18,6 +18,7 @@ TARGETS += drivers/net/bonding
 TARGETS += drivers/net/team
 TARGETS += efivarfs
 TARGETS += exec
+TARGETS += fchmodat2
 TARGETS += filesystems
 TARGETS += filesystems/binderfs
 TARGETS += filesystems/epoll
diff --git a/tools/testing/selftests/fchmodat2/.gitignore b/tools/testing/selftests/fchmodat2/.gitignore
new file mode 100644
index 000000000000..82a4846cbc4b
--- /dev/null
+++ b/tools/testing/selftests/fchmodat2/.gitignore
@@ -0,0 +1,2 @@
+# SPDX-License-Identifier: GPL-2.0-only
+/*_test
diff --git a/tools/testing/selftests/fchmodat2/Makefile b/tools/testing/selftests/fchmodat2/Makefile
new file mode 100644
index 000000000000..45b519eab851
--- /dev/null
+++ b/tools/testing/selftests/fchmodat2/Makefile
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0-or-later
+
+CFLAGS += -Wall -O2 -g -fsanitize=address -fsanitize=undefined
+TEST_GEN_PROGS := fchmodat2_test
+
+include ../lib.mk
diff --git a/tools/testing/selftests/fchmodat2/fchmodat2_test.c b/tools/testing/selftests/fchmodat2/fchmodat2_test.c
new file mode 100644
index 000000000000..2d98eb215bc6
--- /dev/null
+++ b/tools/testing/selftests/fchmodat2/fchmodat2_test.c
@@ -0,0 +1,162 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+#define _GNU_SOURCE
+#include <fcntl.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <syscall.h>
+#include <unistd.h>
+
+#include "../kselftest.h"
+
+#ifndef __NR_fchmodat2
+	#if defined __alpha__
+		#define __NR_fchmodat2 562
+	#elif defined _MIPS_SIM
+		#if _MIPS_SIM == _MIPS_SIM_ABI32	/* o32 */
+			#define __NR_fchmodat2 (452 + 4000)
+		#endif
+		#if _MIPS_SIM == _MIPS_SIM_NABI32	/* n32 */
+			#define __NR_fchmodat2 (452 + 6000)
+		#endif
+		#if _MIPS_SIM == _MIPS_SIM_ABI64	/* n64 */
+			#define __NR_fchmodat2 (452 + 5000)
+		#endif
+	#elif defined __ia64__
+		#define __NR_fchmodat2 (452 + 1024)
+	#else
+		#define __NR_fchmodat2 452
+	#endif
+#endif
+
+int sys_fchmodat2(int dfd, const char *filename, mode_t mode, int flags)
+{
+	int ret = syscall(__NR_fchmodat2, dfd, filename, mode, flags);
+
+	return ret >= 0 ? ret : -errno;
+}
+
+int setup_testdir(void)
+{
+	int dfd, ret;
+	char dirname[] = "/tmp/ksft-fchmodat2.XXXXXX";
+
+	/* Make the top-level directory. */
+	if (!mkdtemp(dirname))
+		ksft_exit_fail_msg("%s: failed to create tmpdir\n", __func__);
+
+	dfd = open(dirname, O_PATH | O_DIRECTORY);
+	if (dfd < 0)
+		ksft_exit_fail_msg("%s: failed to open tmpdir\n", __func__);
+
+	ret = openat(dfd, "regfile", O_CREAT | O_WRONLY | O_TRUNC, 0644);
+	if (ret < 0)
+		ksft_exit_fail_msg("%s: failed to create file in tmpdir\n",
+				__func__);
+	close(ret);
+
+	ret = symlinkat("regfile", dfd, "symlink");
+	if (ret < 0)
+		ksft_exit_fail_msg("%s: failed to create symlink in tmpdir\n",
+				__func__);
+
+	return dfd;
+}
+
+int expect_mode(int dfd, const char *filename, mode_t expect_mode)
+{
+	struct stat st;
+	int ret = fstatat(dfd, filename, &st, AT_SYMLINK_NOFOLLOW);
+
+	if (ret)
+		ksft_exit_fail_msg("%s: %s: fstatat failed\n",
+				__func__, filename);
+
+	return (st.st_mode == expect_mode);
+}
+
+void test_regfile(void)
+{
+	int dfd, ret;
+
+	dfd = setup_testdir();
+
+	ret = sys_fchmodat2(dfd, "regfile", 0640, 0);
+
+	if (ret < 0)
+		ksft_exit_fail_msg("%s: fchmodat2(noflag) failed\n", __func__);
+
+	if (!expect_mode(dfd, "regfile", 0100640))
+		ksft_exit_fail_msg("%s: wrong file mode bits after fchmodat2\n",
+				__func__);
+
+	ret = sys_fchmodat2(dfd, "regfile", 0600, AT_SYMLINK_NOFOLLOW);
+
+	if (ret < 0)
+		ksft_exit_fail_msg("%s: fchmodat2(AT_SYMLINK_NOFOLLOW) failed\n",
+				__func__);
+
+	if (!expect_mode(dfd, "regfile", 0100600))
+		ksft_exit_fail_msg("%s: wrong file mode bits after fchmodat2 with nofollow\n",
+				__func__);
+
+	ksft_test_result_pass("fchmodat2(regfile)\n");
+}
+
+void test_symlink(void)
+{
+	int dfd, ret;
+
+	dfd = setup_testdir();
+
+	ret = sys_fchmodat2(dfd, "symlink", 0640, 0);
+
+	if (ret < 0)
+		ksft_exit_fail_msg("%s: fchmodat2(noflag) failed\n", __func__);
+
+	if (!expect_mode(dfd, "regfile", 0100640))
+		ksft_exit_fail_msg("%s: wrong file mode bits after fchmodat2\n",
+				__func__);
+
+	if (!expect_mode(dfd, "symlink", 0120777))
+		ksft_exit_fail_msg("%s: wrong symlink mode bits after fchmodat2\n",
+				__func__);
+
+	ret = sys_fchmodat2(dfd, "symlink", 0600, AT_SYMLINK_NOFOLLOW);
+
+	/*
+	 * On certain filesystems (xfs or btrfs), chmod operation fails. So we
+	 * first check the symlink target but if the operation fails we mark the
+	 * test as skipped.
+	 *
+	 * https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html
+	 */
+	if (ret == 0 && !expect_mode(dfd, "symlink", 0120600))
+		ksft_exit_fail_msg("%s: wrong symlink mode bits after fchmodat2 with nofollow\n",
+				__func__);
+
+	if (!expect_mode(dfd, "regfile", 0100640))
+		ksft_exit_fail_msg("%s: wrong file mode bits after fchmodat2 with nofollow\n",
+				__func__);
+
+	if (ret != 0)
+		ksft_test_result_skip("fchmodat2(symlink)\n");
+	else
+		ksft_test_result_pass("fchmodat2(symlink)\n");
+}
+
+#define NUM_TESTS 2
+
+int main(int argc, char **argv)
+{
+	ksft_print_header();
+	ksft_set_plan(NUM_TESTS);
+
+	test_regfile();
+	test_symlink();
+
+	if (ksft_get_fail_cnt() + ksft_get_error_cnt() > 0)
+		ksft_exit_fail();
+	else
+		ksft_exit_pass();
+}
-- 
2.33.8


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452
  2023-07-11 16:16       ` [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452 Alexey Gladkov
@ 2023-07-11 16:26         ` Arnd Bergmann
  2023-07-25  7:16         ` Geert Uytterhoeven
  2023-07-25 16:43         ` Aleksa Sarai
  2 siblings, 0 replies; 59+ messages in thread
From: Arnd Bergmann @ 2023-07-11 16:26 UTC (permalink / raw)
  To: Alexey Gladkov, LKML, linux-api, linux-fsdevel, Alexander Viro
  Cc: Palmer Dabbelt, James E . J . Bottomley,
	Arnaldo Carvalho de Melo, Alexander Shishkin, Jens Axboe,
	Benjamin Herrenschmidt, Christian Borntraeger, Borislav Petkov,
	Catalin Marinas, christian, Rich Felker, David S . Miller,
	Deepa Dinamani, Helge Deller, David Howells, fenghua.yu,
	Florian Weimer, Geert Uytterhoeven, glebfm, gor, hare,
	H. Peter Anvin, Ivan Kokshaysky, jhogan, Kim Phillips, ldv,
	linux-alpha, Linux-Arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, Russell King, linuxppc-dev,
	Andy Lutomirski, Matt Turner, Ingo Molnar, Michal Simek,
	Michael Ellerman, Namhyung Kim, Paul Mackerras, Peter Zijlstra,
	ralf, sparclinux, stefan, Thomas Gleixner, Tony Luck, tycho,
	Will Deacon, x86, Yoshinori Sato

On Tue, Jul 11, 2023, at 18:16, Alexey Gladkov wrote:
> From: Palmer Dabbelt <palmer@sifive.com>
>
> This registers the new fchmodat2 syscall in most places as nuber 452,
> with alpha being the exception where it's 562.  I found all these sites
> by grepping for fspick, which I assume has found me everything.
>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-11 16:16       ` [PATCH v4 2/5] fs: Add fchmodat2() Alexey Gladkov
@ 2023-07-11 17:05         ` Christian Brauner
  2023-07-25 16:36         ` Aleksa Sarai
  1 sibling, 0 replies; 59+ messages in thread
From: Christian Brauner @ 2023-07-11 17:05 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

On Tue, Jul 11, 2023 at 06:16:04PM +0200, Alexey Gladkov wrote:
> On the userspace side fchmodat(3) is implemented as a wrapper
> function which implements the POSIX-specified interface. This
> interface differs from the underlying kernel system call, which does not
> have a flags argument. Most implementations require procfs [1][2].
> 
> There doesn't appear to be a good userspace workaround for this issue
> but the implementation in the kernel is pretty straight-forward.
> 
> The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> unlike existing fchmodat.
> 
> [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> 
> Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  fs/open.c                | 18 ++++++++++++++----
>  include/linux/syscalls.h |  2 ++
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/open.c b/fs/open.c
> index 0c55c8e7f837..39a7939f0d00 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
>  	return err;
>  }
>  
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)

Should all be unsigned instead of int here for flags. We also had a
documentation update to that effect but smh never sent it.
user_path_at() itself takes an unsigned as well.

I'll fix that up though.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 4/5] tools headers UAPI: Sync files changed by new fchmodat2 syscall
  2023-07-11 16:16       ` [PATCH v4 4/5] tools headers UAPI: Sync files changed by new fchmodat2 syscall Alexey Gladkov
@ 2023-07-11 17:19         ` Namhyung Kim
  2023-07-11 17:23           ` Alexey Gladkov
  0 siblings, 1 reply; 59+ messages in thread
From: Namhyung Kim @ 2023-07-11 17:19 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, paulus, peterz, ralf, sparclinux,
	stefan, tglx, tony.luck, tycho, will, x86, ysato

Hello,

On Tue, Jul 11, 2023 at 9:18 AM Alexey Gladkov <legion@kernel.org> wrote:
>
> From: Palmer Dabbelt <palmer@sifive.com>
>
> That add support for this new syscall in tools such as 'perf trace'.
>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>
> ---
>  tools/include/uapi/asm-generic/unistd.h             | 5 ++++-
>  tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl | 2 ++
>  tools/perf/arch/powerpc/entry/syscalls/syscall.tbl  | 2 ++
>  tools/perf/arch/s390/entry/syscalls/syscall.tbl     | 2 ++
>  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl   | 2 ++
>  5 files changed, 12 insertions(+), 1 deletion(-)

It'd be nice if you route this patch separately through the
perf tools tree.  We can add this after the kernel change
is accepted.

Thanks,
Namhyung


>
> diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
> index dd7d8e10f16d..76b5922b0d39 100644
> --- a/tools/include/uapi/asm-generic/unistd.h
> +++ b/tools/include/uapi/asm-generic/unistd.h
> @@ -817,8 +817,11 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
>  #define __NR_set_mempolicy_home_node 450
>  __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
>
> +#define __NR_fchmodat2 452
> +__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
> +
>  #undef __NR_syscalls
> -#define __NR_syscalls 451
> +#define __NR_syscalls 453
>
>  /*
>   * 32 bit systems traditionally used different
> diff --git a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
> index 3f1886ad9d80..434728af4eaa 100644
> --- a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
> +++ b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
> @@ -365,3 +365,5 @@
>  448    n64     process_mrelease                sys_process_mrelease
>  449    n64     futex_waitv                     sys_futex_waitv
>  450    common  set_mempolicy_home_node         sys_set_mempolicy_home_node
> +# 451 reserved for cachestat
> +452    n64     fchmodat2                       sys_fchmodat2
> diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
> index a0be127475b1..6b70b6705bd7 100644
> --- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
> +++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
> @@ -537,3 +537,5 @@
>  448    common  process_mrelease                sys_process_mrelease
>  449    common  futex_waitv                     sys_futex_waitv
>  450    nospu   set_mempolicy_home_node         sys_set_mempolicy_home_node
> +# 451 reserved for cachestat
> +452    common  fchmodat2                       sys_fchmodat2
> diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
> index b68f47541169..0ed90c9535b0 100644
> --- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl
> +++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
> @@ -453,3 +453,5 @@
>  448  common    process_mrelease        sys_process_mrelease            sys_process_mrelease
>  449  common    futex_waitv             sys_futex_waitv                 sys_futex_waitv
>  450  common    set_mempolicy_home_node sys_set_mempolicy_home_node     sys_set_mempolicy_home_node
> +# 451 reserved for cachestat
> +452  common    fchmodat2               sys_fchmodat2                   sys_fchmodat2
> diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
> index c84d12608cd2..a008724a1f48 100644
> --- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -372,6 +372,8 @@
>  448    common  process_mrelease        sys_process_mrelease
>  449    common  futex_waitv             sys_futex_waitv
>  450    common  set_mempolicy_home_node sys_set_mempolicy_home_node
> +# 451 reserved for cachestat
> +452    common  fchmodat2               sys_fchmodat2
>
>  #
>  # Due to a historical design error, certain syscalls are numbered differently
> --
> 2.33.8
>

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 4/5] tools headers UAPI: Sync files changed by new fchmodat2 syscall
  2023-07-11 17:19         ` Namhyung Kim
@ 2023-07-11 17:23           ` Alexey Gladkov
  0 siblings, 0 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-11 17:23 UTC (permalink / raw)
  To: Namhyung Kim
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, paulus, peterz, ralf, sparclinux,
	stefan, tglx, tony.luck, tycho, will, x86, ysato

On Tue, Jul 11, 2023 at 10:19:35AM -0700, Namhyung Kim wrote:
> Hello,
> 
> On Tue, Jul 11, 2023 at 9:18 AM Alexey Gladkov <legion@kernel.org> wrote:
> >
> > From: Palmer Dabbelt <palmer@sifive.com>
> >
> > That add support for this new syscall in tools such as 'perf trace'.
> >
> > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > ---
> >  tools/include/uapi/asm-generic/unistd.h             | 5 ++++-
> >  tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl | 2 ++
> >  tools/perf/arch/powerpc/entry/syscalls/syscall.tbl  | 2 ++
> >  tools/perf/arch/s390/entry/syscalls/syscall.tbl     | 2 ++
> >  tools/perf/arch/x86/entry/syscalls/syscall_64.tbl   | 2 ++
> >  5 files changed, 12 insertions(+), 1 deletion(-)
> 
> It'd be nice if you route this patch separately through the
> perf tools tree.  We can add this after the kernel change
> is accepted.

Sure. No problem.

> >
> > diff --git a/tools/include/uapi/asm-generic/unistd.h b/tools/include/uapi/asm-generic/unistd.h
> > index dd7d8e10f16d..76b5922b0d39 100644
> > --- a/tools/include/uapi/asm-generic/unistd.h
> > +++ b/tools/include/uapi/asm-generic/unistd.h
> > @@ -817,8 +817,11 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
> >  #define __NR_set_mempolicy_home_node 450
> >  __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
> >
> > +#define __NR_fchmodat2 452
> > +__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
> > +
> >  #undef __NR_syscalls
> > -#define __NR_syscalls 451
> > +#define __NR_syscalls 453
> >
> >  /*
> >   * 32 bit systems traditionally used different
> > diff --git a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
> > index 3f1886ad9d80..434728af4eaa 100644
> > --- a/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
> > +++ b/tools/perf/arch/mips/entry/syscalls/syscall_n64.tbl
> > @@ -365,3 +365,5 @@
> >  448    n64     process_mrelease                sys_process_mrelease
> >  449    n64     futex_waitv                     sys_futex_waitv
> >  450    common  set_mempolicy_home_node         sys_set_mempolicy_home_node
> > +# 451 reserved for cachestat
> > +452    n64     fchmodat2                       sys_fchmodat2
> > diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
> > index a0be127475b1..6b70b6705bd7 100644
> > --- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
> > +++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
> > @@ -537,3 +537,5 @@
> >  448    common  process_mrelease                sys_process_mrelease
> >  449    common  futex_waitv                     sys_futex_waitv
> >  450    nospu   set_mempolicy_home_node         sys_set_mempolicy_home_node
> > +# 451 reserved for cachestat
> > +452    common  fchmodat2                       sys_fchmodat2
> > diff --git a/tools/perf/arch/s390/entry/syscalls/syscall.tbl b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
> > index b68f47541169..0ed90c9535b0 100644
> > --- a/tools/perf/arch/s390/entry/syscalls/syscall.tbl
> > +++ b/tools/perf/arch/s390/entry/syscalls/syscall.tbl
> > @@ -453,3 +453,5 @@
> >  448  common    process_mrelease        sys_process_mrelease            sys_process_mrelease
> >  449  common    futex_waitv             sys_futex_waitv                 sys_futex_waitv
> >  450  common    set_mempolicy_home_node sys_set_mempolicy_home_node     sys_set_mempolicy_home_node
> > +# 451 reserved for cachestat
> > +452  common    fchmodat2               sys_fchmodat2                   sys_fchmodat2
> > diff --git a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
> > index c84d12608cd2..a008724a1f48 100644
> > --- a/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
> > +++ b/tools/perf/arch/x86/entry/syscalls/syscall_64.tbl
> > @@ -372,6 +372,8 @@
> >  448    common  process_mrelease        sys_process_mrelease
> >  449    common  futex_waitv             sys_futex_waitv
> >  450    common  set_mempolicy_home_node sys_set_mempolicy_home_node
> > +# 451 reserved for cachestat
> > +452    common  fchmodat2               sys_fchmodat2
> >
> >  #
> >  # Due to a historical design error, certain syscalls are numbered differently
> > --
> > 2.33.8
> >
> 

-- 
Rgrds, legion


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: (subset) [PATCH v4 0/5] Add a new fchmodat2() syscall
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
                         ` (4 preceding siblings ...)
  2023-07-11 16:16       ` [PATCH v4 5/5] selftests: Add fchmodat2 selftest Alexey Gladkov
@ 2023-07-11 17:36       ` Christian Brauner
  2023-07-12  2:42       ` Rich Felker
  6 siblings, 0 replies; 59+ messages in thread
From: Christian Brauner @ 2023-07-11 17:36 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: Christian Brauner, James.Bottomley, acme, alexander.shishkin,
	axboe, benh, bp, catalin.marinas, dalias, davem, deepa.kernel,
	deller, dhowells, fenghua.yu, fweimer, geert, glebfm, gor, hare,
	hpa, ink, jhogan, kim.phillips, ldv, linux-alpha, linux-arch,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, peterz, ralf, sparclinux, stefan, tglx, tony.luck,
	will, x86, ysato, Christian Borntraeger, Paul Mackerras,
	Tycho Andersen, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro

On Tue, 11 Jul 2023 18:16:02 +0200, Alexey Gladkov wrote:
> In glibc, the fchmodat(3) function has a flags argument according to the
> POSIX specification [1], but kernel syscalls has no such argument.
> Therefore, libc implementations do workarounds using /proc. However,
> this requires procfs to be mounted and accessible.
> 
> This patch set adds fchmodat2(), a new syscall. The syscall allows to
> pass the AT_SYMLINK_NOFOLLOW flag to disable LOOKUP_FOLLOW. In all other
> respects, this syscall is no different from fchmodat().
> 
> [...]

Tools updates usually go separately.
Flags argument ported to unsigned int; otherwise unchanged.

---

Applied to the master branch of the vfs/vfs.git tree.
Patches in the master branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: master

[1/5] Non-functional cleanup of a "__user * filename"
      https://git.kernel.org/vfs/vfs/c/0f05a6af6b7e
[2/5] fs: Add fchmodat2()
      https://git.kernel.org/vfs/vfs/c/8d593559ec09
[3/5] arch: Register fchmodat2, usually as syscall 452
      https://git.kernel.org/vfs/vfs/c/2ee63b04f206
[5/5] selftests: Add fchmodat2 selftest
      https://git.kernel.org/vfs/vfs/c/f175b92081ec

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 0/5] Add a new fchmodat2() syscall
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
                         ` (5 preceding siblings ...)
  2023-07-11 17:36       ` (subset) [PATCH v4 0/5] Add a new fchmodat2() syscall Christian Brauner
@ 2023-07-12  2:42       ` Rich Felker
  6 siblings, 0 replies; 59+ messages in thread
From: Rich Felker @ 2023-07-12  2:42 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, davem, deepa.kernel,
	deller, dhowells, fenghua.yu, fweimer, geert, glebfm, gor, hare,
	hpa, ink, jhogan, kim.phillips, ldv, linux-alpha, linux-arch,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato

On Tue, Jul 11, 2023 at 06:16:02PM +0200, Alexey Gladkov wrote:
> In glibc, the fchmodat(3) function has a flags argument according to the
> POSIX specification [1], but kernel syscalls has no such argument.
> Therefore, libc implementations do workarounds using /proc. However,
> this requires procfs to be mounted and accessible.
> 
> This patch set adds fchmodat2(), a new syscall. The syscall allows to
> pass the AT_SYMLINK_NOFOLLOW flag to disable LOOKUP_FOLLOW. In all other
> respects, this syscall is no different from fchmodat().
> 
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/functions/chmod.html
> 
> Changes since v3 [cover.1689074739.git.legion@kernel.org]:
> 
> * Rebased to master because a new syscall has appeared in master.
> * Increased __NR_compat_syscalls as pointed out by Arnd Bergmann.
> * Syscall renamed fchmodat4 -> fchmodat2 as suggested by Christian Brauner.
> * Returned do_fchmodat4() the original name. We don't need to version
>   internal functions.
> * Fixed warnings found by checkpatch.pl.
> 
> Changes since v2 [20190717012719.5524-1-palmer@sifive.com]:
> 
> * Rebased to master.
> * The lookup_flags passed to sys_fchmodat4 as suggested by Al Viro.
> * Selftest added.
> 
> Changes since v1 [20190531191204.4044-1-palmer@sifive.com]:
> 
> * All architectures are now supported, which support squashed into a
>   single patch.
> * The do_fchmodat() helper function has been removed, in favor of directly
>   calling do_fchmodat4().
> * The patches are based on 5.2 instead of 5.1.

It's good to see this moving forward. I originally proposed this in a
patch submitted in 2020. I suspect implementation details have changed
since then, but it might make sense to look back at that discussion if
nobody has done so yet (apologies if this was already done and I
missed it) to make sure nothing is overlooked -- I remember there were
some subtleties with what fs backends might try to do with chmod on
symlinks. My proposed commit message also documented a lot of the
history of the issue that might be useful to have as context.

https://lore.kernel.org/all/20200910170256.GK3265@brightrain.aerifal.cx/T/

Rich

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452
  2023-07-11 16:16       ` [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452 Alexey Gladkov
  2023-07-11 16:26         ` Arnd Bergmann
@ 2023-07-25  7:16         ` Geert Uytterhoeven
  2023-07-25 16:43         ` Aleksa Sarai
  2 siblings, 0 replies; 59+ messages in thread
From: Geert Uytterhoeven @ 2023-07-25  7:16 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, glebfm, gor,
	hare, hpa, ink, jhogan, kim.phillips, ldv, linux-alpha,
	linux-arch, linux-ia64, linux-m68k, linux-mips, linux-parisc,
	linux-s390, linux-sh, linux, linuxppc-dev, luto, mattst88, mingo,
	monstr, mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan,
	tglx, tony.luck, tycho, will, x86, ysato

On Tue, Jul 11, 2023 at 6:25 PM Alexey Gladkov <legion@kernel.org> wrote:
> From: Palmer Dabbelt <palmer@sifive.com>
>
> This registers the new fchmodat2 syscall in most places as nuber 452,
> with alpha being the exception where it's 562.  I found all these sites
> by grepping for fspick, which I assume has found me everything.
>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>

>  arch/m68k/kernel/syscalls/syscall.tbl       | 1 +

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 0/5] Add a new fchmodat4() syscall
  2023-07-11 15:14       ` Christian Brauner
@ 2023-07-25 11:05         ` Alexey Gladkov
  2023-07-25 12:05           ` Christian Brauner
  0 siblings, 1 reply; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-25 11:05 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Florian Weimer, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, geert, glebfm, gor,
	hare, hpa, ink, jhogan, kim.phillips, ldv, linux-alpha,
	linux-arch, linux-arm-kernel, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paul.burton, paulus,
	peterz, ralf, rth, sparclinux, stefan, tglx, tony.luck, tycho,
	will, x86, ysato

On Tue, Jul 11, 2023 at 05:14:24PM +0200, Christian Brauner wrote:
> On Tue, Jul 11, 2023 at 02:24:51PM +0200, Florian Weimer wrote:
> > * Alexey Gladkov:
> > 
> > > This patch set adds fchmodat4(), a new syscall. The actual
> > > implementation is super simple: essentially it's just the same as
> > > fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags.
> > > I've attempted to make this match "man 2 fchmodat" as closely as
> > > possible, which says EINVAL is returned for invalid flags (as opposed to
> > > ENOTSUPP, which is currently returned by glibc for AT_SYMLINK_NOFOLLOW).
> > > I have a sketch of a glibc patch that I haven't even compiled yet, but
> > > seems fairly straight-forward:
> > >
> > >     diff --git a/sysdeps/unix/sysv/linux/fchmodat.c b/sysdeps/unix/sysv/linux/fchmodat.c
> > >     index 6d9cbc1ce9e0..b1beab76d56c 100644
> > >     --- a/sysdeps/unix/sysv/linux/fchmodat.c
> > >     +++ b/sysdeps/unix/sysv/linux/fchmodat.c
> > >     @@ -29,12 +29,36 @@
> > >      int
> > >      fchmodat (int fd, const char *file, mode_t mode, int flag)
> > >      {
> > >     -  if (flag & ~AT_SYMLINK_NOFOLLOW)
> > >     -    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
> > >     -#ifndef __NR_lchmod		/* Linux so far has no lchmod syscall.  */
> > >     +  /* There are four paths through this code:
> > >     +      - The flags are zero.  In this case it's fine to call fchmodat.
> > >     +      - The flags are non-zero and glibc doesn't have access to
> > >     +	__NR_fchmodat4.  In this case all we can do is emulate the error codes
> > >     +	defined by the glibc interface from userspace.
> > >     +      - The flags are non-zero, glibc has __NR_fchmodat4, and the kernel has
> > >     +	fchmodat4.  This is the simplest case, as the fchmodat4 syscall exactly
> > >     +	matches glibc's library interface so it can be called directly.
> > >     +      - The flags are non-zero, glibc has __NR_fchmodat4, but the kernel does
> > 
> > If you define __NR_fchmodat4 on all architectures, we can use these
> > constants directly in glibc.  We no longer depend on the UAPI
> > definitions of those constants, to cut down the number of code variants,
> > and to make glibc's system call profile independent of the kernel header
> > version at build time.
> > 
> > Your version is based on 2.31, more recent versions have some reasonable
> > emulation for fchmodat based on /proc/self/fd.  I even wrote a comment
> > describing the same buggy behavior that you witnessed:
> > 
> > +      /* Some Linux versions with some file systems can actually
> > +        change symbolic link permissions via /proc, but this is not
> > +        intentional, and it gives inconsistent results (e.g., error
> > +        return despite mode change).  The expected behavior is that
> > +        symbolic link modes cannot be changed at all, and this check
> > +        enforces that.  */
> > +      if (S_ISLNK (st.st_mode))
> > +       {
> > +         __close_nocancel (pathfd);
> > +         __set_errno (EOPNOTSUPP);
> > +         return -1;
> > +       }
> > 
> > I think there was some kernel discussion about that behavior before, but
> > apparently, it hasn't led to fixes.
> 
> I think I've explained this somewhere else a couple of months ago but
> just in case you weren't on that thread or don't remember and apologies
> if you should already know.
> 
> A lot of filesystem will happily update the mode of a symlink. The VFS
> doesn't do anything to prevent this from happening. This is filesystem
> specific.
> 
> The EOPNOTSUPP you're seeing very likely comes from POSIX ACLs.
> Specifically it comes from filesystems that call posix_acl_chmod(),
> e.g., btrfs via
> 
>         if (!err && attr->ia_valid & ATTR_MODE)
>                 err = posix_acl_chmod(idmap, dentry, inode->i_mode);
> 
> Most filesystems don't implement i_op->set_acl() for POSIX ACLs.
> So posix_acl_chmod() will report EOPNOTSUPP. By the time
> posix_acl_chmod() is called, most filesystems will have finished
> updating the inode. POSIX ACLs also often aren't integrated into
> transactions so a rollback wouldn't even be possible on some
> filesystems.
> 
> Any filesystem that doesn't implement POSIX ACLs at all will obviously
> never fail unless it blocks mode changes on symlinks. Or filesystems
> that do have a way to rollback failures from posix_acl_chmod(), or
> filesystems that do return an error on chmod() on symlinks such as 9p,
> ntfs, ocfs2.
> 
> > 
> > I wonder if it makes sense to add a similar error return to the system
> > call implementation?
> 
> Hm, blocking symlink mode changes is pretty regression prone. And just
> blocking it through one interface seems weird and makes things even more
> inconsistent.
> 
> So two options I see:
> (1) minimally invasive:
>     Filesystems that do call posix_acl_chmod() on symlinks need to be
>     changed to stop doing that.
> (2) might hit us on the head invasive:
>     Try and block symlink mode changes in chmod_common().
> 
> Thoughts?
> 

We have third option. We can choose not to call chmod_common and return an
error right away:

diff --git a/fs/open.c b/fs/open.c
index 39a7939f0d00..86a427a2a083 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -679,7 +679,9 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int l
 retry:
        error = user_path_at(dfd, filename, lookup_flags, &path);
        if (!error) {
-               error = chmod_common(&path, mode);
+               error = -EOPNOTSUPP;
+               if (!(flags & AT_SYMLINK_NOFOLLOW) || !S_ISLNK(path.dentry->d_inode->i_mode))
+                       error = chmod_common(&path, mode);
                path_put(&path);
                if (retry_estale(error, lookup_flags)) {
                        lookup_flags |= LOOKUP_REVAL;

It doesn't seem to be invasive.

-- 
Rgrds, legion


^ permalink raw reply related	[flat|nested] 59+ messages in thread

* Re: [PATCH v3 0/5] Add a new fchmodat4() syscall
  2023-07-25 11:05         ` Alexey Gladkov
@ 2023-07-25 12:05           ` Christian Brauner
  0 siblings, 0 replies; 59+ messages in thread
From: Christian Brauner @ 2023-07-25 12:05 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: Florian Weimer, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, geert, glebfm, gor,
	hare, hpa, ink, jhogan, kim.phillips, ldv, linux-alpha,
	linux-arch, linux-arm-kernel, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paul.burton, paulus,
	peterz, ralf, rth, sparclinux, stefan, tglx, tony.luck, tycho,
	will, x86, ysato

On Tue, Jul 25, 2023 at 01:05:40PM +0200, Alexey Gladkov wrote:
> On Tue, Jul 11, 2023 at 05:14:24PM +0200, Christian Brauner wrote:
> > On Tue, Jul 11, 2023 at 02:24:51PM +0200, Florian Weimer wrote:
> > > * Alexey Gladkov:
> > > 
> > > > This patch set adds fchmodat4(), a new syscall. The actual
> > > > implementation is super simple: essentially it's just the same as
> > > > fchmodat(), but LOOKUP_FOLLOW is conditionally set based on the flags.
> > > > I've attempted to make this match "man 2 fchmodat" as closely as
> > > > possible, which says EINVAL is returned for invalid flags (as opposed to
> > > > ENOTSUPP, which is currently returned by glibc for AT_SYMLINK_NOFOLLOW).
> > > > I have a sketch of a glibc patch that I haven't even compiled yet, but
> > > > seems fairly straight-forward:
> > > >
> > > >     diff --git a/sysdeps/unix/sysv/linux/fchmodat.c b/sysdeps/unix/sysv/linux/fchmodat.c
> > > >     index 6d9cbc1ce9e0..b1beab76d56c 100644
> > > >     --- a/sysdeps/unix/sysv/linux/fchmodat.c
> > > >     +++ b/sysdeps/unix/sysv/linux/fchmodat.c
> > > >     @@ -29,12 +29,36 @@
> > > >      int
> > > >      fchmodat (int fd, const char *file, mode_t mode, int flag)
> > > >      {
> > > >     -  if (flag & ~AT_SYMLINK_NOFOLLOW)
> > > >     -    return INLINE_SYSCALL_ERROR_RETURN_VALUE (EINVAL);
> > > >     -#ifndef __NR_lchmod		/* Linux so far has no lchmod syscall.  */
> > > >     +  /* There are four paths through this code:
> > > >     +      - The flags are zero.  In this case it's fine to call fchmodat.
> > > >     +      - The flags are non-zero and glibc doesn't have access to
> > > >     +	__NR_fchmodat4.  In this case all we can do is emulate the error codes
> > > >     +	defined by the glibc interface from userspace.
> > > >     +      - The flags are non-zero, glibc has __NR_fchmodat4, and the kernel has
> > > >     +	fchmodat4.  This is the simplest case, as the fchmodat4 syscall exactly
> > > >     +	matches glibc's library interface so it can be called directly.
> > > >     +      - The flags are non-zero, glibc has __NR_fchmodat4, but the kernel does
> > > 
> > > If you define __NR_fchmodat4 on all architectures, we can use these
> > > constants directly in glibc.  We no longer depend on the UAPI
> > > definitions of those constants, to cut down the number of code variants,
> > > and to make glibc's system call profile independent of the kernel header
> > > version at build time.
> > > 
> > > Your version is based on 2.31, more recent versions have some reasonable
> > > emulation for fchmodat based on /proc/self/fd.  I even wrote a comment
> > > describing the same buggy behavior that you witnessed:
> > > 
> > > +      /* Some Linux versions with some file systems can actually
> > > +        change symbolic link permissions via /proc, but this is not
> > > +        intentional, and it gives inconsistent results (e.g., error
> > > +        return despite mode change).  The expected behavior is that
> > > +        symbolic link modes cannot be changed at all, and this check
> > > +        enforces that.  */
> > > +      if (S_ISLNK (st.st_mode))
> > > +       {
> > > +         __close_nocancel (pathfd);
> > > +         __set_errno (EOPNOTSUPP);
> > > +         return -1;
> > > +       }
> > > 
> > > I think there was some kernel discussion about that behavior before, but
> > > apparently, it hasn't led to fixes.
> > 
> > I think I've explained this somewhere else a couple of months ago but
> > just in case you weren't on that thread or don't remember and apologies
> > if you should already know.
> > 
> > A lot of filesystem will happily update the mode of a symlink. The VFS
> > doesn't do anything to prevent this from happening. This is filesystem
> > specific.
> > 
> > The EOPNOTSUPP you're seeing very likely comes from POSIX ACLs.
> > Specifically it comes from filesystems that call posix_acl_chmod(),
> > e.g., btrfs via
> > 
> >         if (!err && attr->ia_valid & ATTR_MODE)
> >                 err = posix_acl_chmod(idmap, dentry, inode->i_mode);
> > 
> > Most filesystems don't implement i_op->set_acl() for POSIX ACLs.
> > So posix_acl_chmod() will report EOPNOTSUPP. By the time
> > posix_acl_chmod() is called, most filesystems will have finished
> > updating the inode. POSIX ACLs also often aren't integrated into
> > transactions so a rollback wouldn't even be possible on some
> > filesystems.
> > 
> > Any filesystem that doesn't implement POSIX ACLs at all will obviously
> > never fail unless it blocks mode changes on symlinks. Or filesystems
> > that do have a way to rollback failures from posix_acl_chmod(), or
> > filesystems that do return an error on chmod() on symlinks such as 9p,
> > ntfs, ocfs2.
> > 
> > > 
> > > I wonder if it makes sense to add a similar error return to the system
> > > call implementation?
> > 
> > Hm, blocking symlink mode changes is pretty regression prone. And just
> > blocking it through one interface seems weird and makes things even more
> > inconsistent.
> > 
> > So two options I see:
> > (1) minimally invasive:
> >     Filesystems that do call posix_acl_chmod() on symlinks need to be
> >     changed to stop doing that.
> > (2) might hit us on the head invasive:
> >     Try and block symlink mode changes in chmod_common().
> > 
> > Thoughts?
> > 
> 
> We have third option. We can choose not to call chmod_common and return an
> error right away:
> 
> diff --git a/fs/open.c b/fs/open.c
> index 39a7939f0d00..86a427a2a083 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -679,7 +679,9 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int l
>  retry:
>         error = user_path_at(dfd, filename, lookup_flags, &path);
>         if (!error) {
> -               error = chmod_common(&path, mode);
> +               error = -EOPNOTSUPP;
> +               if (!(flags & AT_SYMLINK_NOFOLLOW) || !S_ISLNK(path.dentry->d_inode->i_mode))
> +                       error = chmod_common(&path, mode);
>                 path_put(&path);
>                 if (retry_estale(error, lookup_flags)) {
>                         lookup_flags |= LOOKUP_REVAL;
> 
> It doesn't seem to be invasive.

https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?id=77b652535528770217186589d97261847f15f862

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Add fchmodat2() - or add a more general syscall?
  2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
                       ` (6 preceding siblings ...)
  2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
@ 2023-07-25 15:58     ` David Howells
  2023-07-25 16:10       ` Florian Weimer
                         ` (3 more replies)
  7 siblings, 4 replies; 59+ messages in thread
From: David Howells @ 2023-07-25 15:58 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: dhowells, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, fenghua.yu, fweimer, geert, glebfm, gor,
	hare, hpa, ink, jhogan, kim.phillips, ldv, linux-alpha,
	linux-arch, linux-ia64, linux-m68k, linux-mips, linux-parisc,
	linux-s390, linux-sh, linux, linuxppc-dev, luto, mattst88, mingo,
	monstr, mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan,
	tglx, tony.luck, tycho, will, x86, ysato, LKML, Arnd Bergmann,
	linux-api, linux-fsdevel, viro

Rather than adding a fchmodat2() syscall, should we add a "set_file_attrs()"
syscall that takes a mask and allows you to set a bunch of stuff all in one
go?  Basically, an interface to notify_change() in the kernel that would allow
several stats to be set atomically.  This might be of particular interest to
network filesystems.

David


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Add fchmodat2() - or add a more general syscall?
  2023-07-25 15:58     ` Add fchmodat2() - or add a more general syscall? David Howells
@ 2023-07-25 16:10       ` Florian Weimer
  2023-07-25 16:50       ` Aleksa Sarai
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 59+ messages in thread
From: Florian Weimer @ 2023-07-25 16:10 UTC (permalink / raw)
  To: David Howells
  Cc: Alexey Gladkov, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, fenghua.yu, geert, glebfm, gor, hare, hpa,
	ink, jhogan, kim.phillips, ldv, linux-alpha, linux-arch,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, LKML, Arnd Bergmann,
	linux-api, linux-fsdevel, viro

* David Howells:

> Rather than adding a fchmodat2() syscall, should we add a "set_file_attrs()"
> syscall that takes a mask and allows you to set a bunch of stuff all in one
> go?  Basically, an interface to notify_change() in the kernel that would allow
> several stats to be set atomically.  This might be of particular interest to
> network filesystems.

Do you mean atomically as in compare-and-swap (update only if old values
match), or just a way to update multiple file attributes with a single
system call?

Thanks,
Florian


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-11 16:16       ` [PATCH v4 2/5] fs: Add fchmodat2() Alexey Gladkov
  2023-07-11 17:05         ` Christian Brauner
@ 2023-07-25 16:36         ` Aleksa Sarai
  2023-07-26 13:45           ` Alexey Gladkov
  2023-07-27  9:01           ` David Laight
  1 sibling, 2 replies; 59+ messages in thread
From: Aleksa Sarai @ 2023-07-25 16:36 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

[-- Attachment #1: Type: text/plain, Size: 3951 bytes --]

On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> On the userspace side fchmodat(3) is implemented as a wrapper
> function which implements the POSIX-specified interface. This
> interface differs from the underlying kernel system call, which does not
> have a flags argument. Most implementations require procfs [1][2].
> 
> There doesn't appear to be a good userspace workaround for this issue
> but the implementation in the kernel is pretty straight-forward.
> 
> The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> unlike existing fchmodat.
> 
> [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> 
> Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>
> Acked-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  fs/open.c                | 18 ++++++++++++++----
>  include/linux/syscalls.h |  2 ++
>  2 files changed, 16 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/open.c b/fs/open.c
> index 0c55c8e7f837..39a7939f0d00 100644
> --- a/fs/open.c
> +++ b/fs/open.c
> @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
>  	return err;
>  }
>  
> -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)

I think it'd be much neater to do the conversion of AT_ flags here and
pass 0 as a flags argument for all of the wrappers (this is how most of
the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).

>  {
>  	struct path path;
>  	int error;
> -	unsigned int lookup_flags = LOOKUP_FOLLOW;
> +
>  retry:
>  	error = user_path_at(dfd, filename, lookup_flags, &path);
>  	if (!error) {
> @@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
>  	return error;
>  }
>  
> +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> +		umode_t, mode, int, flags)
> +{
> +	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
> +		return -EINVAL;

We almost certainly want to support AT_EMPTY_PATH at the same time.
Otherwise userspace will still need to go through /proc when trying to
chmod a file handle they have.

> +
> +	return do_fchmodat(dfd, filename, mode,
> +			flags & AT_SYMLINK_NOFOLLOW ? 0 : LOOKUP_FOLLOW);
> +}
> +
>  SYSCALL_DEFINE3(fchmodat, int, dfd, const char __user *, filename,
>  		umode_t, mode)
>  {
> -	return do_fchmodat(dfd, filename, mode);
> +	return do_fchmodat(dfd, filename, mode, LOOKUP_FOLLOW);
>  }
>  
>  SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
>  {
> -	return do_fchmodat(AT_FDCWD, filename, mode);
> +	return do_fchmodat(AT_FDCWD, filename, mode, LOOKUP_FOLLOW);
>  }
>  
>  /*
> diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
> index 584f404bf868..6e852279fbc3 100644
> --- a/include/linux/syscalls.h
> +++ b/include/linux/syscalls.h
> @@ -440,6 +440,8 @@ asmlinkage long sys_chroot(const char __user *filename);
>  asmlinkage long sys_fchmod(unsigned int fd, umode_t mode);
>  asmlinkage long sys_fchmodat(int dfd, const char __user *filename,
>  			     umode_t mode);
> +asmlinkage long sys_fchmodat2(int dfd, const char __user *filename,
> +			     umode_t mode, int flags);
>  asmlinkage long sys_fchownat(int dfd, const char __user *filename, uid_t user,
>  			     gid_t group, int flag);
>  asmlinkage long sys_fchown(unsigned int fd, uid_t user, gid_t group);
> -- 
> 2.33.8
> 

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452
  2023-07-11 16:16       ` [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452 Alexey Gladkov
  2023-07-11 16:26         ` Arnd Bergmann
  2023-07-25  7:16         ` Geert Uytterhoeven
@ 2023-07-25 16:43         ` Aleksa Sarai
  2023-07-27 10:37           ` Christian Brauner
  2 siblings, 1 reply; 59+ messages in thread
From: Aleksa Sarai @ 2023-07-25 16:43 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato

[-- Attachment #1: Type: text/plain, Size: 10980 bytes --]

On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> From: Palmer Dabbelt <palmer@sifive.com>
> 
> This registers the new fchmodat2 syscall in most places as nuber 452,
> with alpha being the exception where it's 562.  I found all these sites
> by grepping for fspick, which I assume has found me everything.

Shouldn't this patch be squashed with the patch that adds the syscall?
At least, that's how I've usually seen it done...

> Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> Signed-off-by: Alexey Gladkov <legion@kernel.org>
> ---
>  arch/alpha/kernel/syscalls/syscall.tbl      | 1 +
>  arch/arm/tools/syscall.tbl                  | 1 +
>  arch/arm64/include/asm/unistd.h             | 2 +-
>  arch/arm64/include/asm/unistd32.h           | 2 ++
>  arch/ia64/kernel/syscalls/syscall.tbl       | 1 +
>  arch/m68k/kernel/syscalls/syscall.tbl       | 1 +
>  arch/microblaze/kernel/syscalls/syscall.tbl | 1 +
>  arch/mips/kernel/syscalls/syscall_n32.tbl   | 1 +
>  arch/mips/kernel/syscalls/syscall_n64.tbl   | 1 +
>  arch/mips/kernel/syscalls/syscall_o32.tbl   | 1 +
>  arch/parisc/kernel/syscalls/syscall.tbl     | 1 +
>  arch/powerpc/kernel/syscalls/syscall.tbl    | 1 +
>  arch/s390/kernel/syscalls/syscall.tbl       | 1 +
>  arch/sh/kernel/syscalls/syscall.tbl         | 1 +
>  arch/sparc/kernel/syscalls/syscall.tbl      | 1 +
>  arch/x86/entry/syscalls/syscall_32.tbl      | 1 +
>  arch/x86/entry/syscalls/syscall_64.tbl      | 1 +
>  arch/xtensa/kernel/syscalls/syscall.tbl     | 1 +
>  include/uapi/asm-generic/unistd.h           | 5 ++++-
>  19 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/syscalls/syscall.tbl
> index 1f13995d00d7..ad37569d0507 100644
> --- a/arch/alpha/kernel/syscalls/syscall.tbl
> +++ b/arch/alpha/kernel/syscalls/syscall.tbl
> @@ -491,3 +491,4 @@
>  559	common  futex_waitv                     sys_futex_waitv
>  560	common	set_mempolicy_home_node		sys_ni_syscall
>  561	common	cachestat			sys_cachestat
> +562	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl
> index 8ebed8a13874..c572d6c3dee0 100644
> --- a/arch/arm/tools/syscall.tbl
> +++ b/arch/arm/tools/syscall.tbl
> @@ -465,3 +465,4 @@
>  449	common	futex_waitv			sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
> index 64a514f90131..bd77253b62e0 100644
> --- a/arch/arm64/include/asm/unistd.h
> +++ b/arch/arm64/include/asm/unistd.h
> @@ -39,7 +39,7 @@
>  #define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE + 5)
>  #define __ARM_NR_COMPAT_END		(__ARM_NR_COMPAT_BASE + 0x800)
>  
> -#define __NR_compat_syscalls		452
> +#define __NR_compat_syscalls		453
>  #endif
>  
>  #define __ARCH_WANT_SYS_CLONE
> diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
> index d952a28463e0..78b68311ec81 100644
> --- a/arch/arm64/include/asm/unistd32.h
> +++ b/arch/arm64/include/asm/unistd32.h
> @@ -909,6 +909,8 @@ __SYSCALL(__NR_futex_waitv, sys_futex_waitv)
>  __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
>  #define __NR_cachestat 451
>  __SYSCALL(__NR_cachestat, sys_cachestat)
> +#define __NR_fchmodat2 452
> +__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
>  
>  /*
>   * Please add new compat syscalls above this comment and update
> diff --git a/arch/ia64/kernel/syscalls/syscall.tbl b/arch/ia64/kernel/syscalls/syscall.tbl
> index f8c74ffeeefb..83d8609aec03 100644
> --- a/arch/ia64/kernel/syscalls/syscall.tbl
> +++ b/arch/ia64/kernel/syscalls/syscall.tbl
> @@ -372,3 +372,4 @@
>  449	common  futex_waitv                     sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/m68k/kernel/syscalls/syscall.tbl b/arch/m68k/kernel/syscalls/syscall.tbl
> index 4f504783371f..259ceb125367 100644
> --- a/arch/m68k/kernel/syscalls/syscall.tbl
> +++ b/arch/m68k/kernel/syscalls/syscall.tbl
> @@ -451,3 +451,4 @@
>  449	common  futex_waitv                     sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/kernel/syscalls/syscall.tbl
> index 858d22bf275c..a3798c2637fd 100644
> --- a/arch/microblaze/kernel/syscalls/syscall.tbl
> +++ b/arch/microblaze/kernel/syscalls/syscall.tbl
> @@ -457,3 +457,4 @@
>  449	common  futex_waitv                     sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/syscalls/syscall_n32.tbl
> index 1976317d4e8b..152034b8e0a0 100644
> --- a/arch/mips/kernel/syscalls/syscall_n32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl
> @@ -390,3 +390,4 @@
>  449	n32	futex_waitv			sys_futex_waitv
>  450	n32	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	n32	cachestat			sys_cachestat
> +452	n32	fchmodat2			sys_fchmodat2
> diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/syscalls/syscall_n64.tbl
> index cfda2511badf..cb5e757f6621 100644
> --- a/arch/mips/kernel/syscalls/syscall_n64.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl
> @@ -366,3 +366,4 @@
>  449	n64	futex_waitv			sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	n64	cachestat			sys_cachestat
> +452	n64	fchmodat2			sys_fchmodat2
> diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/syscalls/syscall_o32.tbl
> index 7692234c3768..1a646813afdc 100644
> --- a/arch/mips/kernel/syscalls/syscall_o32.tbl
> +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl
> @@ -439,3 +439,4 @@
>  449	o32	futex_waitv			sys_futex_waitv
>  450	o32	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	o32	cachestat			sys_cachestat
> +452	o32	fchmodat2			sys_fchmodat2
> diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/syscalls/syscall.tbl
> index a0a9145b6dd4..e97c175b56f9 100644
> --- a/arch/parisc/kernel/syscalls/syscall.tbl
> +++ b/arch/parisc/kernel/syscalls/syscall.tbl
> @@ -450,3 +450,4 @@
>  449	common	futex_waitv			sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
> index 8c0b08b7a80e..20e50586e8a2 100644
> --- a/arch/powerpc/kernel/syscalls/syscall.tbl
> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl
> @@ -538,3 +538,4 @@
>  449	common  futex_waitv                     sys_futex_waitv
>  450 	nospu	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/syscalls/syscall.tbl
> index a6935af2235c..0122cc156952 100644
> --- a/arch/s390/kernel/syscalls/syscall.tbl
> +++ b/arch/s390/kernel/syscalls/syscall.tbl
> @@ -454,3 +454,4 @@
>  449  common	futex_waitv		sys_futex_waitv			sys_futex_waitv
>  450  common	set_mempolicy_home_node	sys_set_mempolicy_home_node	sys_set_mempolicy_home_node
>  451  common	cachestat		sys_cachestat			sys_cachestat
> +452  common	fchmodat2		sys_fchmodat2			sys_fchmodat2
> diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/syscall.tbl
> index 97377e8c5025..e90d585c4d3e 100644
> --- a/arch/sh/kernel/syscalls/syscall.tbl
> +++ b/arch/sh/kernel/syscalls/syscall.tbl
> @@ -454,3 +454,4 @@
>  449	common  futex_waitv                     sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/syscalls/syscall.tbl
> index faa835f3c54a..4ed06c71c43f 100644
> --- a/arch/sparc/kernel/syscalls/syscall.tbl
> +++ b/arch/sparc/kernel/syscalls/syscall.tbl
> @@ -497,3 +497,4 @@
>  449	common  futex_waitv                     sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscalls/syscall_32.tbl
> index bc0a3c941b35..2d0b1bd866ea 100644
> --- a/arch/x86/entry/syscalls/syscall_32.tbl
> +++ b/arch/x86/entry/syscalls/syscall_32.tbl
> @@ -456,3 +456,4 @@
>  449	i386	futex_waitv		sys_futex_waitv
>  450	i386	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	i386	cachestat		sys_cachestat
> +452	i386	fchmodat2		sys_fchmodat2
> diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscalls/syscall_64.tbl
> index 227538b0ce80..814768249eae 100644
> --- a/arch/x86/entry/syscalls/syscall_64.tbl
> +++ b/arch/x86/entry/syscalls/syscall_64.tbl
> @@ -373,6 +373,7 @@
>  449	common	futex_waitv		sys_futex_waitv
>  450	common	set_mempolicy_home_node	sys_set_mempolicy_home_node
>  451	common	cachestat		sys_cachestat
> +452	common	fchmodat2		sys_fchmodat2
>  
>  #
>  # Due to a historical design error, certain syscalls are numbered differently
> diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/syscalls/syscall.tbl
> index 2b69c3c035b6..fc1a4f3c81d9 100644
> --- a/arch/xtensa/kernel/syscalls/syscall.tbl
> +++ b/arch/xtensa/kernel/syscalls/syscall.tbl
> @@ -422,3 +422,4 @@
>  449	common  futex_waitv                     sys_futex_waitv
>  450	common	set_mempolicy_home_node		sys_set_mempolicy_home_node
>  451	common	cachestat			sys_cachestat
> +452	common	fchmodat2			sys_fchmodat2
> diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/unistd.h
> index fd6c1cb585db..abe087c53b4b 100644
> --- a/include/uapi/asm-generic/unistd.h
> +++ b/include/uapi/asm-generic/unistd.h
> @@ -820,8 +820,11 @@ __SYSCALL(__NR_set_mempolicy_home_node, sys_set_mempolicy_home_node)
>  #define __NR_cachestat 451
>  __SYSCALL(__NR_cachestat, sys_cachestat)
>  
> +#define __NR_fchmodat2 452
> +__SYSCALL(__NR_fchmodat2, sys_fchmodat2)
> +
>  #undef __NR_syscalls
> -#define __NR_syscalls 452
> +#define __NR_syscalls 453
>  
>  /*
>   * 32 bit systems traditionally used different
> -- 
> 2.33.8
> 

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Add fchmodat2() - or add a more general syscall?
  2023-07-25 15:58     ` Add fchmodat2() - or add a more general syscall? David Howells
  2023-07-25 16:10       ` Florian Weimer
@ 2023-07-25 16:50       ` Aleksa Sarai
  2023-07-25 18:39       ` David Howells
  2023-07-27  3:57       ` Eric Biggers
  3 siblings, 0 replies; 59+ messages in thread
From: Aleksa Sarai @ 2023-07-25 16:50 UTC (permalink / raw)
  To: David Howells
  Cc: Alexey Gladkov, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, fenghua.yu, fweimer, geert, glebfm, gor,
	hare, hpa, ink, jhogan, kim.phillips, ldv, linux-alpha,
	linux-arch, linux-ia64, linux-m68k, linux-mips, linux-parisc,
	linux-s390, linux-sh, linux, linuxppc-dev, luto, mattst88, mingo,
	monstr, mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan,
	tglx, tony.luck, tycho, will, x86, ysato, LKML, Arnd Bergmann,
	linux-api, linux-fsdevel, viro

[-- Attachment #1: Type: text/plain, Size: 888 bytes --]

On 2023-07-25, David Howells <dhowells@redhat.com> wrote:
> Rather than adding a fchmodat2() syscall, should we add a "set_file_attrs()"
> syscall that takes a mask and allows you to set a bunch of stuff all in one
> go?  Basically, an interface to notify_change() in the kernel that would allow
> several stats to be set atomically.  This might be of particular interest to
> network filesystems.

Presumably looking something like statx(2) (except hopefully with
extensible structs this time :P)? I think that could also be useful, but
given this is a fairly straight-forward syscall addition (and it also
would resolve the AT_EMPTY_PATH issue for chmod as well as simplify the
glibc wrapper), I think it makes sense to take this and we can do
set_statx(2) separately?

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Add fchmodat2() - or add a more general syscall?
  2023-07-25 15:58     ` Add fchmodat2() - or add a more general syscall? David Howells
  2023-07-25 16:10       ` Florian Weimer
  2023-07-25 16:50       ` Aleksa Sarai
@ 2023-07-25 18:39       ` David Howells
  2023-07-25 18:44         ` Rich Felker
  2023-07-26 13:30         ` Christian Brauner
  2023-07-27  3:57       ` Eric Biggers
  3 siblings, 2 replies; 59+ messages in thread
From: David Howells @ 2023-07-25 18:39 UTC (permalink / raw)
  To: Florian Weimer
  Cc: dhowells, Alexey Gladkov, James.Bottomley, acme,
	alexander.shishkin, axboe, benh, borntraeger, bp,
	catalin.marinas, christian, dalias, davem, deepa.kernel, deller,
	fenghua.yu, geert, glebfm, gor, hare, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-ia64,
	linux-m68k, linux-mips, linux-parisc, linux-s390, linux-sh,
	linux, linuxppc-dev, luto, mattst88, mingo, monstr, mpe,
	namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, LKML, Arnd Bergmann,
	linux-api, linux-fsdevel, viro

Florian Weimer <fweimer@redhat.com> wrote:

> > Rather than adding a fchmodat2() syscall, should we add a
> > "set_file_attrs()" syscall that takes a mask and allows you to set a bunch
> > of stuff all in one go?  Basically, an interface to notify_change() in the
> > kernel that would allow several stats to be set atomically.  This might be
> > of particular interest to network filesystems.
> 
> Do you mean atomically as in compare-and-swap (update only if old values
> match), or just a way to update multiple file attributes with a single
> system call?

I was thinking more in terms of the latter.  AFAIK, there aren't any network
filesystems support a CAS interface on file attributes like that.  To be able
to do a CAS operation, we'd need to pass in the old values as well as the new.

Another thing we could look at is doing "create_and_set_attrs()", possibly
allowing it to take a list of xattrs also.

David


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Add fchmodat2() - or add a more general syscall?
  2023-07-25 18:39       ` David Howells
@ 2023-07-25 18:44         ` Rich Felker
  2023-07-26 13:30         ` Christian Brauner
  1 sibling, 0 replies; 59+ messages in thread
From: Rich Felker @ 2023-07-25 18:44 UTC (permalink / raw)
  To: David Howells
  Cc: Florian Weimer, Alexey Gladkov, James.Bottomley, acme,
	alexander.shishkin, axboe, benh, borntraeger, bp,
	catalin.marinas, christian, davem, deepa.kernel, deller,
	fenghua.yu, geert, glebfm, gor, hare, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-ia64,
	linux-m68k, linux-mips, linux-parisc, linux-s390, linux-sh,
	linux, linuxppc-dev, luto, mattst88, mingo, monstr, mpe,
	namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, LKML, Arnd Bergmann,
	linux-api, linux-fsdevel, viro

On Tue, Jul 25, 2023 at 07:39:51PM +0100, David Howells wrote:
> Florian Weimer <fweimer@redhat.com> wrote:
> 
> > > Rather than adding a fchmodat2() syscall, should we add a
> > > "set_file_attrs()" syscall that takes a mask and allows you to set a bunch
> > > of stuff all in one go?  Basically, an interface to notify_change() in the
> > > kernel that would allow several stats to be set atomically.  This might be
> > > of particular interest to network filesystems.
> > 
> > Do you mean atomically as in compare-and-swap (update only if old values
> > match), or just a way to update multiple file attributes with a single
> > system call?
> 
> I was thinking more in terms of the latter.  AFAIK, there aren't any network
> filesystems support a CAS interface on file attributes like that.  To be able
> to do a CAS operation, we'd need to pass in the old values as well as the new.
> 
> Another thing we could look at is doing "create_and_set_attrs()", possibly
> allowing it to take a list of xattrs also.

Can we please not let " hey let's invent a new interface to do
something that will be hard for underlying filesystems to even provide
and that nothing needs because there's no standard API to do it" be
the enemy of "fixing a known problem implementing an existing standard
API that just requires a simple, clearly-scoped syscall to do it"?

Rich

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Add fchmodat2() - or add a more general syscall?
  2023-07-25 18:39       ` David Howells
  2023-07-25 18:44         ` Rich Felker
@ 2023-07-26 13:30         ` Christian Brauner
  1 sibling, 0 replies; 59+ messages in thread
From: Christian Brauner @ 2023-07-26 13:30 UTC (permalink / raw)
  To: David Howells
  Cc: Florian Weimer, Alexey Gladkov, James.Bottomley, acme,
	alexander.shishkin, axboe, benh, borntraeger, bp,
	catalin.marinas, christian, dalias, davem, deepa.kernel, deller,
	fenghua.yu, geert, glebfm, gor, hare, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-ia64,
	linux-m68k, linux-mips, linux-parisc, linux-s390, linux-sh,
	linux, linuxppc-dev, luto, mattst88, mingo, monstr, mpe,
	namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, LKML, Arnd Bergmann,
	linux-api, linux-fsdevel, viro

On Tue, Jul 25, 2023 at 07:39:51PM +0100, David Howells wrote:
> Florian Weimer <fweimer@redhat.com> wrote:
> 
> > > Rather than adding a fchmodat2() syscall, should we add a
> > > "set_file_attrs()" syscall that takes a mask and allows you to set a bunch
> > > of stuff all in one go?  Basically, an interface to notify_change() in the

That system call would likely be blocked in seccomp sandboxes completely
as seccomp cannot filter structs. I don't consider this an argument to
block new good functionality in general as that would mean arbitrarily
limiting us but it is something to keep in mind. If there's additional
benefit other than just being able to set mutliple values at once then
yeah might be something to discuss.

> > > kernel that would allow several stats to be set atomically.  This might be
> > > of particular interest to network filesystems.
> > 
> > Do you mean atomically as in compare-and-swap (update only if old values
> > match), or just a way to update multiple file attributes with a single
> > system call?
> 
> I was thinking more in terms of the latter.  AFAIK, there aren't any network
> filesystems support a CAS interface on file attributes like that.  To be able
> to do a CAS operation, we'd need to pass in the old values as well as the new.
> 
> Another thing we could look at is doing "create_and_set_attrs()", possibly
> allowing it to take a list of xattrs also.

That would likely require variable sized pointers in a struct which is
something we really try to stay away from. I also think it's not a good
idea to lump xattrs toegether with generic file attributes. They should
remain a separate api imho.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-25 16:36         ` Aleksa Sarai
@ 2023-07-26 13:45           ` Alexey Gladkov
  2023-07-27 10:26             ` Christian Brauner
  2023-07-27 17:12             ` Aleksa Sarai
  2023-07-27  9:01           ` David Laight
  1 sibling, 2 replies; 59+ messages in thread
From: Alexey Gladkov @ 2023-07-26 13:45 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

On Wed, Jul 26, 2023 at 02:36:25AM +1000, Aleksa Sarai wrote:
> On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> > On the userspace side fchmodat(3) is implemented as a wrapper
> > function which implements the POSIX-specified interface. This
> > interface differs from the underlying kernel system call, which does not
> > have a flags argument. Most implementations require procfs [1][2].
> > 
> > There doesn't appear to be a good userspace workaround for this issue
> > but the implementation in the kernel is pretty straight-forward.
> > 
> > The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > unlike existing fchmodat.
> > 
> > [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > 
> > Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > ---
> >  fs/open.c                | 18 ++++++++++++++----
> >  include/linux/syscalls.h |  2 ++
> >  2 files changed, 16 insertions(+), 4 deletions(-)
> > 
> > diff --git a/fs/open.c b/fs/open.c
> > index 0c55c8e7f837..39a7939f0d00 100644
> > --- a/fs/open.c
> > +++ b/fs/open.c
> > @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> >  	return err;
> >  }
> >  
> > -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
> 
> I think it'd be much neater to do the conversion of AT_ flags here and
> pass 0 as a flags argument for all of the wrappers (this is how most of
> the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).

I just addressed the Al Viro's suggestion.

https://lore.kernel.org/lkml/20190717014802.GS17978@ZenIV.linux.org.uk/

> >  {
> >  	struct path path;
> >  	int error;
> > -	unsigned int lookup_flags = LOOKUP_FOLLOW;
> > +
> >  retry:
> >  	error = user_path_at(dfd, filename, lookup_flags, &path);
> >  	if (!error) {
> > @@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> >  	return error;
> >  }
> >  
> > +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> > +		umode_t, mode, int, flags)
> > +{
> > +	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
> > +		return -EINVAL;
> 
> We almost certainly want to support AT_EMPTY_PATH at the same time.
> Otherwise userspace will still need to go through /proc when trying to
> chmod a file handle they have.

I'm not sure I understand. Can you explain what you mean?

-- 
Rgrds, legion


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Add fchmodat2() - or add a more general syscall?
  2023-07-25 15:58     ` Add fchmodat2() - or add a more general syscall? David Howells
                         ` (2 preceding siblings ...)
  2023-07-25 18:39       ` David Howells
@ 2023-07-27  3:57       ` Eric Biggers
  2023-07-27 10:27         ` Christian Brauner
  3 siblings, 1 reply; 59+ messages in thread
From: Eric Biggers @ 2023-07-27  3:57 UTC (permalink / raw)
  To: David Howells
  Cc: Alexey Gladkov, James.Bottomley, acme, alexander.shishkin, axboe,
	benh, borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, fenghua.yu, fweimer, geert, glebfm, gor,
	hare, hpa, ink, jhogan, kim.phillips, ldv, linux-alpha,
	linux-arch, linux-ia64, linux-m68k, linux-mips, linux-parisc,
	linux-s390, linux-sh, linux, linuxppc-dev, luto, mattst88, mingo,
	monstr, mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan,
	tglx, tony.luck, tycho, will, x86, ysato, LKML, Arnd Bergmann,
	linux-api, linux-fsdevel, viro

On Tue, Jul 25, 2023 at 04:58:34PM +0100, David Howells wrote:
> Rather than adding a fchmodat2() syscall, should we add a "set_file_attrs()"
> syscall that takes a mask and allows you to set a bunch of stuff all in one
> go?  Basically, an interface to notify_change() in the kernel that would allow
> several stats to be set atomically.  This might be of particular interest to
> network filesystems.
> 
> David
> 

fchmodat2() is a simple addition that fits well with the existing syscalls.
It fixes an oversight in fchmodat().

IMO we should just add fchmodat2(), and not get sidetracked by trying to add
some super-generalized syscall instead.  That can always be done later.

- Eric

^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-25 16:36         ` Aleksa Sarai
  2023-07-26 13:45           ` Alexey Gladkov
@ 2023-07-27  9:01           ` David Laight
  2023-07-27 16:28             ` Andreas Schwab
  2023-07-27 16:31             ` dalias
  1 sibling, 2 replies; 59+ messages in thread
From: David Laight @ 2023-07-27  9:01 UTC (permalink / raw)
  To: 'Aleksa Sarai', Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

From: Aleksa Sarai
> Sent: 25 July 2023 17:36
...
> We almost certainly want to support AT_EMPTY_PATH at the same time.
> Otherwise userspace will still need to go through /proc when trying to
> chmod a file handle they have.

That can't be allowed.

Just because a process has a file open and write access to
the directory that contains it doesn't mean they are allowed
to change the file permissions.

They also need directory search access from a directory
they have open through to the containing directory.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-26 13:45           ` Alexey Gladkov
@ 2023-07-27 10:26             ` Christian Brauner
  2023-07-27 17:12             ` Aleksa Sarai
  1 sibling, 0 replies; 59+ messages in thread
From: Christian Brauner @ 2023-07-27 10:26 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: Aleksa Sarai, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

> > I think it'd be much neater to do the conversion of AT_ flags here and
> > pass 0 as a flags argument for all of the wrappers (this is how most of
> > the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).

I've fixed that up in-tree.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: Add fchmodat2() - or add a more general syscall?
  2023-07-27  3:57       ` Eric Biggers
@ 2023-07-27 10:27         ` Christian Brauner
  0 siblings, 0 replies; 59+ messages in thread
From: Christian Brauner @ 2023-07-27 10:27 UTC (permalink / raw)
  To: Eric Biggers
  Cc: David Howells, Alexey Gladkov, James.Bottomley, acme,
	alexander.shishkin, axboe, benh, borntraeger, bp,
	catalin.marinas, christian, dalias, davem, deepa.kernel, deller,
	fenghua.yu, fweimer, geert, glebfm, gor, hare, hpa, ink, jhogan,
	kim.phillips, ldv, linux-alpha, linux-arch, linux-ia64,
	linux-m68k, linux-mips, linux-parisc, linux-s390, linux-sh,
	linux, linuxppc-dev, luto, mattst88, mingo, monstr, mpe,
	namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, LKML, Arnd Bergmann,
	linux-api, linux-fsdevel, viro

On Wed, Jul 26, 2023 at 08:57:10PM -0700, Eric Biggers wrote:
> On Tue, Jul 25, 2023 at 04:58:34PM +0100, David Howells wrote:
> > Rather than adding a fchmodat2() syscall, should we add a "set_file_attrs()"
> > syscall that takes a mask and allows you to set a bunch of stuff all in one
> > go?  Basically, an interface to notify_change() in the kernel that would allow
> > several stats to be set atomically.  This might be of particular interest to
> > network filesystems.
> > 
> > David
> > 
> 
> fchmodat2() is a simple addition that fits well with the existing syscalls.
> It fixes an oversight in fchmodat().
> 
> IMO we should just add fchmodat2(), and not get sidetracked by trying to add
> some super-generalized syscall instead.  That can always be done later.

Agreed.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452
  2023-07-25 16:43         ` Aleksa Sarai
@ 2023-07-27 10:37           ` Christian Brauner
  2023-07-27 17:42             ` Aleksa Sarai
  0 siblings, 1 reply; 59+ messages in thread
From: Christian Brauner @ 2023-07-27 10:37 UTC (permalink / raw)
  To: Aleksa Sarai
  Cc: Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin,
	axboe, benh, borntraeger, bp, catalin.marinas, christian, dalias,
	davem, deepa.kernel, deller, dhowells, fenghua.yu, fweimer,
	geert, glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato

On Wed, Jul 26, 2023 at 02:43:41AM +1000, Aleksa Sarai wrote:
> On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> > From: Palmer Dabbelt <palmer@sifive.com>
> > 
> > This registers the new fchmodat2 syscall in most places as nuber 452,
> > with alpha being the exception where it's 562.  I found all these sites
> > by grepping for fspick, which I assume has found me everything.
> 
> Shouldn't this patch be squashed with the patch that adds the syscall?
> At least, that's how I've usually seen it done...

Depends. Iirc, someone said they'd prefer for doing it in one patch
in some circumstances on some system call we added years ago. But otoh,
having the syscall wiring done separately makes it easy for arch
maintainers to ack only the wiring up part. Both ways are valid imho.
(cachestat() did it for x86 and then all the others separately. So
really it seems a bit all over the place depending on the scenario.)

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-27  9:01           ` David Laight
@ 2023-07-27 16:28             ` Andreas Schwab
  2023-07-27 17:02               ` Christian Brauner
  2023-07-27 16:31             ` dalias
  1 sibling, 1 reply; 59+ messages in thread
From: Andreas Schwab @ 2023-07-27 16:28 UTC (permalink / raw)
  To: David Laight
  Cc: 'Aleksa Sarai',
	Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

On Jul 27 2023, David Laight wrote:

> From: Aleksa Sarai
>> Sent: 25 July 2023 17:36
> ...
>> We almost certainly want to support AT_EMPTY_PATH at the same time.
>> Otherwise userspace will still need to go through /proc when trying to
>> chmod a file handle they have.
>
> That can't be allowed.

IIUC, fchmodat2(fd, "", m, AT_EMPTY_PATH) is equivalent to fchmod(fd,
m).  With that, new architectures only need to implement the fchmodat2
syscall to cover all chmod variants.

-- 
Andreas Schwab, schwab@linux-m68k.org
GPG Key fingerprint = 7578 EB47 D4E5 4D69 2510  2552 DF73 E780 A9DA AEC1
"And now for something completely different."

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-27  9:01           ` David Laight
  2023-07-27 16:28             ` Andreas Schwab
@ 2023-07-27 16:31             ` dalias
  1 sibling, 0 replies; 59+ messages in thread
From: dalias @ 2023-07-27 16:31 UTC (permalink / raw)
  To: David Laight
  Cc: 'Aleksa Sarai',
	Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, davem, deepa.kernel,
	deller, dhowells, fenghua.yu, fweimer, geert, glebfm, gor, hare,
	hpa, ink, jhogan, kim.phillips, ldv, linux-alpha, linux-arch,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, Palmer Dabbelt

On Thu, Jul 27, 2023 at 09:01:06AM +0000, David Laight wrote:
> From: Aleksa Sarai
> > Sent: 25 July 2023 17:36
> ....
> > We almost certainly want to support AT_EMPTY_PATH at the same time.
> > Otherwise userspace will still need to go through /proc when trying to
> > chmod a file handle they have.
> 
> That can't be allowed.
> 
> Just because a process has a file open and write access to
> the directory that contains it doesn't mean they are allowed
> to change the file permissions.
> 
> They also need directory search access from a directory
> they have open through to the containing directory.

Am I missing something? How is this different from fchmod?

Rich

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-27 16:28             ` Andreas Schwab
@ 2023-07-27 17:02               ` Christian Brauner
  2023-07-27 17:13                 ` dalias
  0 siblings, 1 reply; 59+ messages in thread
From: Christian Brauner @ 2023-07-27 17:02 UTC (permalink / raw)
  To: Andreas Schwab
  Cc: David Laight, 'Aleksa Sarai',
	Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

On Thu, Jul 27, 2023 at 06:28:53PM +0200, Andreas Schwab wrote:
> On Jul 27 2023, David Laight wrote:
> 
> > From: Aleksa Sarai
> >> Sent: 25 July 2023 17:36
> > ...
> >> We almost certainly want to support AT_EMPTY_PATH at the same time.
> >> Otherwise userspace will still need to go through /proc when trying to
> >> chmod a file handle they have.
> >
> > That can't be allowed.
> 
> IIUC, fchmodat2(fd, "", m, AT_EMPTY_PATH) is equivalent to fchmod(fd,
> m).  With that, new architectures only need to implement the fchmodat2
> syscall to cover all chmod variants.

There's a difference though as fchmod() doesn't work with O_PATH file
descriptors while AT_EMPTY_PATH does. Similar to how fchown() doesn't
work with O_PATH file descriptors.

However, we do allow AT_EMPTY_PATH with fchownat() so there's no reason
to not allow it for fchmodat2().

But it's a bit of a shame that O_PATH looks less and less like O_PATH.
It came from can-do-barely-anything to can-do-quite-a-lot-of-things over
the years.

In any case, AT_EMPTY_PATH for fchmodat2() can be an additional patch on
top.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-26 13:45           ` Alexey Gladkov
  2023-07-27 10:26             ` Christian Brauner
@ 2023-07-27 17:12             ` Aleksa Sarai
  2023-07-27 17:39               ` Aleksa Sarai
  1 sibling, 1 reply; 59+ messages in thread
From: Aleksa Sarai @ 2023-07-27 17:12 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

[-- Attachment #1: Type: text/plain, Size: 4000 bytes --]

On 2023-07-26, Alexey Gladkov <legion@kernel.org> wrote:
> On Wed, Jul 26, 2023 at 02:36:25AM +1000, Aleksa Sarai wrote:
> > On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > function which implements the POSIX-specified interface. This
> > > interface differs from the underlying kernel system call, which does not
> > > have a flags argument. Most implementations require procfs [1][2].
> > > 
> > > There doesn't appear to be a good userspace workaround for this issue
> > > but the implementation in the kernel is pretty straight-forward.
> > > 
> > > The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > unlike existing fchmodat.
> > > 
> > > [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > > 
> > > Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > ---
> > >  fs/open.c                | 18 ++++++++++++++----
> > >  include/linux/syscalls.h |  2 ++
> > >  2 files changed, 16 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/fs/open.c b/fs/open.c
> > > index 0c55c8e7f837..39a7939f0d00 100644
> > > --- a/fs/open.c
> > > +++ b/fs/open.c
> > > @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> > >  	return err;
> > >  }
> > >  
> > > -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > > +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
> > 
> > I think it'd be much neater to do the conversion of AT_ flags here and
> > pass 0 as a flags argument for all of the wrappers (this is how most of
> > the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).
> 
> I just addressed the Al Viro's suggestion.
> 
> https://lore.kernel.org/lkml/20190717014802.GS17978@ZenIV.linux.org.uk/

I think Al misspoke, because he also said "pass it 0 as an extra
argument", but you actually have to pass LOOKUP_FOLLOW from the
wrappers. If you look at how faccessat2 and faccessat are implemented,
it follows the behaviour I described.

> > >  {
> > >  	struct path path;
> > >  	int error;
> > > -	unsigned int lookup_flags = LOOKUP_FOLLOW;
> > > +
> > >  retry:
> > >  	error = user_path_at(dfd, filename, lookup_flags, &path);
> > >  	if (!error) {
> > > @@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > >  	return error;
> > >  }
> > >  
> > > +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> > > +		umode_t, mode, int, flags)
> > > +{
> > > +	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
> > > +		return -EINVAL;
> > 
> > We almost certainly want to support AT_EMPTY_PATH at the same time.
> > Otherwise userspace will still need to go through /proc when trying to
> > chmod a file handle they have.
> 
> I'm not sure I understand. Can you explain what you mean?

You should add support for AT_EMPTY_PATH (LOOKUP_EMPTY) as well as
AT_SYMLINK_NOFOLLOW. It would only require something like:

	unsigned int lookup_flags = LOOKUP_FOLLOW;

	if (flags & ~(AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW))
		return -EINVAL;

	if (flags & AT_EMPTY_PATH)
		lookup_flags |= LOOKUP_EMPTY;
	if (flags & AT_SYMLINK_NOFOLLOW)
		lookup_flags &= ~LOOKUP_FOLLOW;

	/* ... */

This would be effectively equivalent to fchmod(fd, mode). (I was wrong
when I said this wasn't already possible -- I forgot about fchmod(2).)

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-27 17:02               ` Christian Brauner
@ 2023-07-27 17:13                 ` dalias
  2023-07-27 17:36                   ` Christian Brauner
  0 siblings, 1 reply; 59+ messages in thread
From: dalias @ 2023-07-27 17:13 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Andreas Schwab, David Laight, 'Aleksa Sarai',
	Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, davem, deepa.kernel,
	deller, dhowells, fenghua.yu, fweimer, geert, glebfm, gor, hare,
	hpa, ink, jhogan, kim.phillips, ldv, linux-alpha, linux-arch,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, Palmer Dabbelt

On Thu, Jul 27, 2023 at 07:02:53PM +0200, Christian Brauner wrote:
> On Thu, Jul 27, 2023 at 06:28:53PM +0200, Andreas Schwab wrote:
> > On Jul 27 2023, David Laight wrote:
> > 
> > > From: Aleksa Sarai
> > >> Sent: 25 July 2023 17:36
> > > ...
> > >> We almost certainly want to support AT_EMPTY_PATH at the same time.
> > >> Otherwise userspace will still need to go through /proc when trying to
> > >> chmod a file handle they have.
> > >
> > > That can't be allowed.
> > 
> > IIUC, fchmodat2(fd, "", m, AT_EMPTY_PATH) is equivalent to fchmod(fd,
> > m).  With that, new architectures only need to implement the fchmodat2
> > syscall to cover all chmod variants.
> 
> There's a difference though as fchmod() doesn't work with O_PATH file
> descriptors while AT_EMPTY_PATH does. Similar to how fchown() doesn't
> work with O_PATH file descriptors.
> 
> However, we do allow AT_EMPTY_PATH with fchownat() so there's no reason
> to not allow it for fchmodat2().
> 
> But it's a bit of a shame that O_PATH looks less and less like O_PATH.
> It came from can-do-barely-anything to can-do-quite-a-lot-of-things over
> the years.
> 
> In any case, AT_EMPTY_PATH for fchmodat2() can be an additional patch on
> top.

From a standpoint of implementing O_SEARCH/O_EXEC using it, I don't
see any reason fchown/fchmod should not work on O_PATH file
descriptors. And indeed when you have procfs available to emulate them
via procfs, it already does. So I don't see this as unwanted
functionality or an access control regression. I see it as things
behaving as expected.

Semantically, O_PATH is a reference to the inode, not to the dirent.
So there is no reason you should not be able to do things that need
permission to the inode (changing permissions on it) rather than to
the dirent (renaming/moving).

Rich

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-27 17:13                 ` dalias
@ 2023-07-27 17:36                   ` Christian Brauner
  0 siblings, 0 replies; 59+ messages in thread
From: Christian Brauner @ 2023-07-27 17:36 UTC (permalink / raw)
  To: dalias
  Cc: Andreas Schwab, David Laight, 'Aleksa Sarai',
	Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, davem, deepa.kernel,
	deller, dhowells, fenghua.yu, fweimer, geert, glebfm, gor, hare,
	hpa, ink, jhogan, kim.phillips, ldv, linux-alpha, linux-arch,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, Palmer Dabbelt

On Thu, Jul 27, 2023 at 01:13:37PM -0400, dalias@libc.org wrote:
> On Thu, Jul 27, 2023 at 07:02:53PM +0200, Christian Brauner wrote:
> > On Thu, Jul 27, 2023 at 06:28:53PM +0200, Andreas Schwab wrote:
> > > On Jul 27 2023, David Laight wrote:
> > > 
> > > > From: Aleksa Sarai
> > > >> Sent: 25 July 2023 17:36
> > > > ...
> > > >> We almost certainly want to support AT_EMPTY_PATH at the same time.
> > > >> Otherwise userspace will still need to go through /proc when trying to
> > > >> chmod a file handle they have.
> > > >
> > > > That can't be allowed.
> > > 
> > > IIUC, fchmodat2(fd, "", m, AT_EMPTY_PATH) is equivalent to fchmod(fd,
> > > m).  With that, new architectures only need to implement the fchmodat2
> > > syscall to cover all chmod variants.
> > 
> > There's a difference though as fchmod() doesn't work with O_PATH file
> > descriptors while AT_EMPTY_PATH does. Similar to how fchown() doesn't
> > work with O_PATH file descriptors.
> > 
> > However, we do allow AT_EMPTY_PATH with fchownat() so there's no reason
> > to not allow it for fchmodat2().
> > 
> > But it's a bit of a shame that O_PATH looks less and less like O_PATH.
> > It came from can-do-barely-anything to can-do-quite-a-lot-of-things over
> > the years.
> > 
> > In any case, AT_EMPTY_PATH for fchmodat2() can be an additional patch on
> > top.
> 
> From a standpoint of implementing O_SEARCH/O_EXEC using it, I don't
> see any reason fchown/fchmod should not work on O_PATH file
> descriptors. And indeed when you have procfs available to emulate them
> via procfs, it already does. So I don't see this as unwanted

I'm really not talking about the fact that proc is a giant loophole for
basically everyhing related to O_PATH and reopening fds.

I'm saying that both fchmod() and fchown() don't work on O_PATH fds.
They explicitly reject them.

AT_EMPTY_PATH and therefore O_PATH for fchmodat2() is fine given that we
do it for fchownat() already.

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-27 17:12             ` Aleksa Sarai
@ 2023-07-27 17:39               ` Aleksa Sarai
  2023-07-28  8:43                 ` David Laight
  0 siblings, 1 reply; 59+ messages in thread
From: Aleksa Sarai @ 2023-07-27 17:39 UTC (permalink / raw)
  To: Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

[-- Attachment #1: Type: text/plain, Size: 5175 bytes --]

On 2023-07-28, Aleksa Sarai <cyphar@cyphar.com> wrote:
> On 2023-07-26, Alexey Gladkov <legion@kernel.org> wrote:
> > On Wed, Jul 26, 2023 at 02:36:25AM +1000, Aleksa Sarai wrote:
> > > On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> > > > On the userspace side fchmodat(3) is implemented as a wrapper
> > > > function which implements the POSIX-specified interface. This
> > > > interface differs from the underlying kernel system call, which does not
> > > > have a flags argument. Most implementations require procfs [1][2].
> > > > 
> > > > There doesn't appear to be a good userspace workaround for this issue
> > > > but the implementation in the kernel is pretty straight-forward.
> > > > 
> > > > The new fchmodat2() syscall allows to pass the AT_SYMLINK_NOFOLLOW flag,
> > > > unlike existing fchmodat.
> > > > 
> > > > [1] https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/fchmodat.c;h=17eca54051ee28ba1ec3f9aed170a62630959143;hb=a492b1e5ef7ab50c6fdd4e4e9879ea5569ab0a6c#l35
> > > > [2] https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c?id=718f363bc2067b6487900eddc9180c84e7739f80#n28
> > > > 
> > > > Co-developed-by: Palmer Dabbelt <palmer@sifive.com>
> > > > Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
> > > > Signed-off-by: Alexey Gladkov <legion@kernel.org>
> > > > Acked-by: Arnd Bergmann <arnd@arndb.de>
> > > > ---
> > > >  fs/open.c                | 18 ++++++++++++++----
> > > >  include/linux/syscalls.h |  2 ++
> > > >  2 files changed, 16 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/fs/open.c b/fs/open.c
> > > > index 0c55c8e7f837..39a7939f0d00 100644
> > > > --- a/fs/open.c
> > > > +++ b/fs/open.c
> > > > @@ -671,11 +671,11 @@ SYSCALL_DEFINE2(fchmod, unsigned int, fd, umode_t, mode)
> > > >  	return err;
> > > >  }
> > > >  
> > > > -static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > > > +static int do_fchmodat(int dfd, const char __user *filename, umode_t mode, int lookup_flags)
> > > 
> > > I think it'd be much neater to do the conversion of AT_ flags here and
> > > pass 0 as a flags argument for all of the wrappers (this is how most of
> > > the other xyz(), fxyz(), fxyzat() syscall wrappers are done IIRC).
> > 
> > I just addressed the Al Viro's suggestion.
> > 
> > https://lore.kernel.org/lkml/20190717014802.GS17978@ZenIV.linux.org.uk/
> 
> I think Al misspoke, because he also said "pass it 0 as an extra
> argument", but you actually have to pass LOOKUP_FOLLOW from the
> wrappers. If you look at how faccessat2 and faccessat are implemented,
> it follows the behaviour I described.
> 
> > > >  {
> > > >  	struct path path;
> > > >  	int error;
> > > > -	unsigned int lookup_flags = LOOKUP_FOLLOW;
> > > > +
> > > >  retry:
> > > >  	error = user_path_at(dfd, filename, lookup_flags, &path);
> > > >  	if (!error) {
> > > > @@ -689,15 +689,25 @@ static int do_fchmodat(int dfd, const char __user *filename, umode_t mode)
> > > >  	return error;
> > > >  }
> > > >  
> > > > +SYSCALL_DEFINE4(fchmodat2, int, dfd, const char __user *, filename,
> > > > +		umode_t, mode, int, flags)
> > > > +{
> > > > +	if (unlikely(flags & ~AT_SYMLINK_NOFOLLOW))
> > > > +		return -EINVAL;
> > > 
> > > We almost certainly want to support AT_EMPTY_PATH at the same time.
> > > Otherwise userspace will still need to go through /proc when trying to
> > > chmod a file handle they have.
> > 
> > I'm not sure I understand. Can you explain what you mean?
> 
> You should add support for AT_EMPTY_PATH (LOOKUP_EMPTY) as well as
> AT_SYMLINK_NOFOLLOW. It would only require something like:
> 
> 	unsigned int lookup_flags = LOOKUP_FOLLOW;
> 
> 	if (flags & ~(AT_EMPTY_PATH | AT_SYMLINK_NOFOLLOW))
> 		return -EINVAL;
> 
> 	if (flags & AT_EMPTY_PATH)
> 		lookup_flags |= LOOKUP_EMPTY;
> 	if (flags & AT_SYMLINK_NOFOLLOW)
> 		lookup_flags &= ~LOOKUP_FOLLOW;
> 
> 	/* ... */
> 
> This would be effectively equivalent to fchmod(fd, mode). (I was wrong
> when I said this wasn't already possible -- I forgot about fchmod(2).)

... with the exception (as Christian mentioned) of O_PATH descriptors.
However, there are two counter-points to this:

 * fchownat(AT_EMPTY_PATH) exists but fchown() doesn't work on O_PATH
   descriptors *by design* (according to open(2)).
 * chmod(/proc/self/fd/$n) works on O_PATH descriptors, meaning this
   behaviour is already allowed and all that AT_EMPTY_PATH would do is
   allow programs to avoid depending on procfs for this.

FWIW, I agree with Christian that these behaviours are not ideal (and
I'm working on a series that might allow for these things to be properly
blocked in the future) but there's also the consistency argument -- I
don't think fchownat() is much safer to allow in this way than
fchmodat() and (again) this behaviour is already possible through
procfs.

Ultimately, we can always add AT_EMPTY_PATH later. It just seemed like
an obvious omission to me that would be easy to resolve.

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452
  2023-07-27 10:37           ` Christian Brauner
@ 2023-07-27 17:42             ` Aleksa Sarai
  0 siblings, 0 replies; 59+ messages in thread
From: Aleksa Sarai @ 2023-07-27 17:42 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, Palmer Dabbelt, James.Bottomley, acme, alexander.shishkin,
	axboe, benh, borntraeger, bp, catalin.marinas, christian, dalias,
	davem, deepa.kernel, deller, dhowells, fenghua.yu, fweimer,
	geert, glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato

[-- Attachment #1: Type: text/plain, Size: 1162 bytes --]

On 2023-07-27, Christian Brauner <brauner@kernel.org> wrote:
> On Wed, Jul 26, 2023 at 02:43:41AM +1000, Aleksa Sarai wrote:
> > On 2023-07-11, Alexey Gladkov <legion@kernel.org> wrote:
> > > From: Palmer Dabbelt <palmer@sifive.com>
> > > 
> > > This registers the new fchmodat2 syscall in most places as nuber 452,
> > > with alpha being the exception where it's 562.  I found all these sites
> > > by grepping for fspick, which I assume has found me everything.
> > 
> > Shouldn't this patch be squashed with the patch that adds the syscall?
> > At least, that's how I've usually seen it done...
> 
> Depends. Iirc, someone said they'd prefer for doing it in one patch
> in some circumstances on some system call we added years ago. But otoh,
> having the syscall wiring done separately makes it easy for arch
> maintainers to ack only the wiring up part. Both ways are valid imho.
> (cachestat() did it for x86 and then all the others separately. So
> really it seems a bit all over the place depending on the scenario.)

Fair enough!

-- 
Aleksa Sarai
Senior Software Engineer (Containers)
SUSE Linux GmbH
<https://www.cyphar.com/>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 59+ messages in thread

* RE: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-27 17:39               ` Aleksa Sarai
@ 2023-07-28  8:43                 ` David Laight
  2023-07-28 18:42                   ` dalias
  0 siblings, 1 reply; 59+ messages in thread
From: David Laight @ 2023-07-28  8:43 UTC (permalink / raw)
  To: 'Aleksa Sarai', Alexey Gladkov
  Cc: LKML, Arnd Bergmann, linux-api, linux-fsdevel, viro,
	James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, dalias, davem,
	deepa.kernel, deller, dhowells, fenghua.yu, fweimer, geert,
	glebfm, gor, hare, hpa, ink, jhogan, kim.phillips, ldv,
	linux-alpha, linux-arch, linux-ia64, linux-m68k, linux-mips,
	linux-parisc, linux-s390, linux-sh, linux, linuxppc-dev, luto,
	mattst88, mingo, monstr, mpe, namhyung, paulus, peterz, ralf,
	sparclinux, stefan, tglx, tony.luck, tycho, will, x86, ysato,
	Palmer Dabbelt

...
> FWIW, I agree with Christian that these behaviours are not ideal (and
> I'm working on a series that might allow for these things to be properly
> blocked in the future) but there's also the consistency argument -- I
> don't think fchownat() is much safer to allow in this way than
> fchmodat() and (again) this behaviour is already possible through
> procfs.

If the 'through procfs' involves readlink("/proc/self/fd/n") and
accessing through the returned path then the permission checks
are different.
Using the returned path requires search permissions on all the
directories.

This is all fine for xxxat() functions where a real open
directory fd is supplied.
But other cases definitely need a lot of thought to ensure
they don't let programs acquire permissions they aren't
supposed to have.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 59+ messages in thread

* Re: [PATCH v4 2/5] fs: Add fchmodat2()
  2023-07-28  8:43                 ` David Laight
@ 2023-07-28 18:42                   ` dalias
  0 siblings, 0 replies; 59+ messages in thread
From: dalias @ 2023-07-28 18:42 UTC (permalink / raw)
  To: David Laight
  Cc: 'Aleksa Sarai',
	Alexey Gladkov, LKML, Arnd Bergmann, linux-api, linux-fsdevel,
	viro, James.Bottomley, acme, alexander.shishkin, axboe, benh,
	borntraeger, bp, catalin.marinas, christian, davem, deepa.kernel,
	deller, dhowells, fenghua.yu, fweimer, geert, glebfm, gor, hare,
	hpa, ink, jhogan, kim.phillips, ldv, linux-alpha, linux-arch,
	linux-ia64, linux-m68k, linux-mips, linux-parisc, linux-s390,
	linux-sh, linux, linuxppc-dev, luto, mattst88, mingo, monstr,
	mpe, namhyung, paulus, peterz, ralf, sparclinux, stefan, tglx,
	tony.luck, tycho, will, x86, ysato, Palmer Dabbelt

On Fri, Jul 28, 2023 at 08:43:58AM +0000, David Laight wrote:
> ....
> > FWIW, I agree with Christian that these behaviours are not ideal (and
> > I'm working on a series that might allow for these things to be properly
> > blocked in the future) but there's also the consistency argument -- I
> > don't think fchownat() is much safer to allow in this way than
> > fchmodat() and (again) this behaviour is already possible through
> > procfs.
> 
> If the 'through procfs' involves readlink("/proc/self/fd/n") and
> accessing through the returned path then the permission checks
> are different.
> Using the returned path requires search permissions on all the
> directories.

That's *not* how "through procfs" works. The "magic symlinks" in
/proc/*/fd are not actual symlinks that get dereferenced to the
contents they readlink() to, but special-type objects that dereference
directly to the underlying file associated with the open file
description.

Rich

^ permalink raw reply	[flat|nested] 59+ messages in thread

end of thread, other threads:[~2023-07-28 18:42 UTC | newest]

Thread overview: 59+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20190717012719.5524-1-palmer@sifive.com>
2020-06-09 13:52 ` Add a new fchmodat4() syscall, v2 Florian Weimer
2023-07-11 11:25   ` [PATCH v3 0/5] Add a new fchmodat4() syscall Alexey Gladkov
2023-07-11 11:25     ` [PATCH v3 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
2023-07-11 11:32       ` Arnd Bergmann
2023-07-11 11:25     ` [PATCH v3 2/5] fs: Add fchmodat4() Alexey Gladkov
2023-07-11 11:42       ` Arnd Bergmann
2023-07-11 11:52         ` Christian Brauner
2023-07-11 12:51           ` Alexey Gladkov
2023-07-11 14:01             ` Christian Brauner
2023-07-11 15:23               ` Alexey Gladkov
2023-07-11 12:28       ` Matthew Wilcox
2023-07-11 12:49         ` Alexey Gladkov
2023-07-11 11:25     ` [PATCH v3 3/5] arch: Register fchmodat4, usually as syscall 451 Alexey Gladkov
2023-07-11 11:31       ` Arnd Bergmann
2023-07-11 11:25     ` [PATCH v3 4/5] tools headers UAPI: Sync files changed by new fchmodat4 syscall Alexey Gladkov
2023-07-11 11:25     ` [PATCH v3 5/5] selftests: add fchmodat4(2) selftest Alexey Gladkov
2023-07-11 12:10       ` Florian Weimer
2023-07-11 13:38         ` Alexey Gladkov
2023-07-11 12:24     ` [PATCH v3 0/5] Add a new fchmodat4() syscall Florian Weimer
2023-07-11 15:14       ` Christian Brauner
2023-07-25 11:05         ` Alexey Gladkov
2023-07-25 12:05           ` Christian Brauner
2023-07-11 16:16     ` [PATCH v4 0/5] Add a new fchmodat2() syscall Alexey Gladkov
2023-07-11 16:16       ` [PATCH v4 1/5] Non-functional cleanup of a "__user * filename" Alexey Gladkov
2023-07-11 16:16       ` [PATCH v4 2/5] fs: Add fchmodat2() Alexey Gladkov
2023-07-11 17:05         ` Christian Brauner
2023-07-25 16:36         ` Aleksa Sarai
2023-07-26 13:45           ` Alexey Gladkov
2023-07-27 10:26             ` Christian Brauner
2023-07-27 17:12             ` Aleksa Sarai
2023-07-27 17:39               ` Aleksa Sarai
2023-07-28  8:43                 ` David Laight
2023-07-28 18:42                   ` dalias
2023-07-27  9:01           ` David Laight
2023-07-27 16:28             ` Andreas Schwab
2023-07-27 17:02               ` Christian Brauner
2023-07-27 17:13                 ` dalias
2023-07-27 17:36                   ` Christian Brauner
2023-07-27 16:31             ` dalias
2023-07-11 16:16       ` [PATCH v4 3/5] arch: Register fchmodat2, usually as syscall 452 Alexey Gladkov
2023-07-11 16:26         ` Arnd Bergmann
2023-07-25  7:16         ` Geert Uytterhoeven
2023-07-25 16:43         ` Aleksa Sarai
2023-07-27 10:37           ` Christian Brauner
2023-07-27 17:42             ` Aleksa Sarai
2023-07-11 16:16       ` [PATCH v4 4/5] tools headers UAPI: Sync files changed by new fchmodat2 syscall Alexey Gladkov
2023-07-11 17:19         ` Namhyung Kim
2023-07-11 17:23           ` Alexey Gladkov
2023-07-11 16:16       ` [PATCH v4 5/5] selftests: Add fchmodat2 selftest Alexey Gladkov
2023-07-11 17:36       ` (subset) [PATCH v4 0/5] Add a new fchmodat2() syscall Christian Brauner
2023-07-12  2:42       ` Rich Felker
2023-07-25 15:58     ` Add fchmodat2() - or add a more general syscall? David Howells
2023-07-25 16:10       ` Florian Weimer
2023-07-25 16:50       ` Aleksa Sarai
2023-07-25 18:39       ` David Howells
2023-07-25 18:44         ` Rich Felker
2023-07-26 13:30         ` Christian Brauner
2023-07-27  3:57       ` Eric Biggers
2023-07-27 10:27         ` Christian Brauner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).