linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
@ 2020-08-31 15:32 Rich Felker
  2020-08-31 15:46 ` Jann Horn
  2024-01-19 14:33 ` Christian Brauner
  0 siblings, 2 replies; 7+ messages in thread
From: Rich Felker @ 2020-08-31 15:32 UTC (permalink / raw)
  To: Jann Horn
  Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro, Pavel Begunkov

The pwrite function, originally defined by POSIX (thus the "p"), is
defined to ignore O_APPEND and write at the offset passed as its
argument. However, historically Linux honored O_APPEND if set and
ignored the offset. This cannot be changed due to stability policy,
but is documented in the man page as a bug.

Now that there's a pwritev2 syscall providing a superset of the pwrite
functionality that has a flags argument, the conforming behavior can
be offered to userspace via a new flag. Since pwritev2 checks flag
validity (in kiocb_set_rw_flags) and reports unknown ones with
EOPNOTSUPP, callers will not get wrong behavior on old kernels that
don't support the new flag; the error is reported and the caller can
decide how to handle it.

Signed-off-by: Rich Felker <dalias@libc.org>
---

Changes in v2: I've added a check to ensure that RWF_NOAPPEND does not
override O_APPEND for S_APPEND (chattr +a) inodes, and fixed conflicts
with 1752f0adea98ef85, which optimized kiocb_set_rw_flags to work with
a local copy of flags. Unfortunately the same optimization does not
work for RWF_NOAPPEND since it needs to remove flags from the original
set at function entry.

If desired, I could further change this so that kiocb_flags is
initialized to ki->ki_flags, with assignment-back in place of |= at
the end of the function. This would allow the same local variable
pattern in the RWF_NOAPPEND code path, which might be more elegant,
but I'm not sure if the emitted code would improve or get worse.


 include/linux/fs.h      | 7 +++++++
 include/uapi/linux/fs.h | 5 ++++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 7519ae003a08..924e17ac8e7e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3321,6 +3321,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		return 0;
 	if (unlikely(flags & ~RWF_SUPPORTED))
 		return -EOPNOTSUPP;
+	if (unlikely((flags & RWF_APPEND) && (flags & RWF_NOAPPEND)))
+		return -EINVAL;
 
 	if (flags & RWF_NOWAIT) {
 		if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
@@ -3335,6 +3337,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
 		kiocb_flags |= IOCB_APPEND;
+	if ((flags & RWF_NOAPPEND) && (ki->ki_flags & IOCB_APPEND)) {
+		if (IS_APPEND(file_inode(ki->ki_filp)))
+			return -EPERM;
+		ki->ki_flags &= ~IOCB_APPEND;
+	}
 
 	ki->ki_flags |= kiocb_flags;
 	return 0;
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index f44eb0a04afd..d5e54e0742cf 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -300,8 +300,11 @@ typedef int __bitwise __kernel_rwf_t;
 /* per-IO O_APPEND */
 #define RWF_APPEND	((__force __kernel_rwf_t)0x00000010)
 
+/* per-IO negation of O_APPEND */
+#define RWF_NOAPPEND	((__force __kernel_rwf_t)0x00000020)
+
 /* mask of flags supported by the kernel */
 #define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
-			 RWF_APPEND)
+			 RWF_APPEND | RWF_NOAPPEND)
 
 #endif /* _UAPI_LINUX_FS_H */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31 15:32 [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker
@ 2020-08-31 15:46 ` Jann Horn
  2020-08-31 17:05   ` Jens Axboe
  2024-01-19 14:33 ` Christian Brauner
  1 sibling, 1 reply; 7+ messages in thread
From: Jann Horn @ 2020-08-31 15:46 UTC (permalink / raw)
  To: Rich Felker, Alexander Viro, Jens Axboe
  Cc: linux-fsdevel, kernel list, Linux API, Pavel Begunkov

On Mon, Aug 31, 2020 at 5:32 PM Rich Felker <dalias@libc.org> wrote:
> The pwrite function, originally defined by POSIX (thus the "p"), is
> defined to ignore O_APPEND and write at the offset passed as its
> argument. However, historically Linux honored O_APPEND if set and
> ignored the offset. This cannot be changed due to stability policy,
> but is documented in the man page as a bug.
>
> Now that there's a pwritev2 syscall providing a superset of the pwrite
> functionality that has a flags argument, the conforming behavior can
> be offered to userspace via a new flag. Since pwritev2 checks flag
> validity (in kiocb_set_rw_flags) and reports unknown ones with
> EOPNOTSUPP, callers will not get wrong behavior on old kernels that
> don't support the new flag; the error is reported and the caller can
> decide how to handle it.
>
> Signed-off-by: Rich Felker <dalias@libc.org>

Reviewed-by: Jann Horn <jannh@google.com>

Note that if this lands, Michael Kerrisk will probably be happy if you
send a corresponding patch for the manpage man2/readv.2.

Btw, I'm not really sure whose tree this should go through - VFS is
normally Al Viro's turf, but it looks like the most recent
modifications to this function have gone through Jens Axboe's tree?

> ---
>
> Changes in v2: I've added a check to ensure that RWF_NOAPPEND does not
> override O_APPEND for S_APPEND (chattr +a) inodes, and fixed conflicts
> with 1752f0adea98ef85, which optimized kiocb_set_rw_flags to work with
> a local copy of flags. Unfortunately the same optimization does not
> work for RWF_NOAPPEND since it needs to remove flags from the original
> set at function entry.
>
> If desired, I could further change this so that kiocb_flags is
> initialized to ki->ki_flags, with assignment-back in place of |= at
> the end of the function. This would allow the same local variable
> pattern in the RWF_NOAPPEND code path, which might be more elegant,
> but I'm not sure if the emitted code would improve or get worse.
>
>
>  include/linux/fs.h      | 7 +++++++
>  include/uapi/linux/fs.h | 5 ++++-
>  2 files changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index 7519ae003a08..924e17ac8e7e 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -3321,6 +3321,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>                 return 0;
>         if (unlikely(flags & ~RWF_SUPPORTED))
>                 return -EOPNOTSUPP;
> +       if (unlikely((flags & RWF_APPEND) && (flags & RWF_NOAPPEND)))
> +               return -EINVAL;
>
>         if (flags & RWF_NOWAIT) {
>                 if (!(ki->ki_filp->f_mode & FMODE_NOWAIT))
> @@ -3335,6 +3337,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
>                 kiocb_flags |= (IOCB_DSYNC | IOCB_SYNC);
>         if (flags & RWF_APPEND)
>                 kiocb_flags |= IOCB_APPEND;
> +       if ((flags & RWF_NOAPPEND) && (ki->ki_flags & IOCB_APPEND)) {
> +               if (IS_APPEND(file_inode(ki->ki_filp)))
> +                       return -EPERM;
> +               ki->ki_flags &= ~IOCB_APPEND;
> +       }
>
>         ki->ki_flags |= kiocb_flags;
>         return 0;
> diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
> index f44eb0a04afd..d5e54e0742cf 100644
> --- a/include/uapi/linux/fs.h
> +++ b/include/uapi/linux/fs.h
> @@ -300,8 +300,11 @@ typedef int __bitwise __kernel_rwf_t;
>  /* per-IO O_APPEND */
>  #define RWF_APPEND     ((__force __kernel_rwf_t)0x00000010)
>
> +/* per-IO negation of O_APPEND */
> +#define RWF_NOAPPEND   ((__force __kernel_rwf_t)0x00000020)
> +
>  /* mask of flags supported by the kernel */
>  #define RWF_SUPPORTED  (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
> -                        RWF_APPEND)
> +                        RWF_APPEND | RWF_NOAPPEND)
>
>  #endif /* _UAPI_LINUX_FS_H */
> --
> 2.21.0
>

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31 15:46 ` Jann Horn
@ 2020-08-31 17:05   ` Jens Axboe
  2024-01-18 15:57     ` Rich Felker
  0 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2020-08-31 17:05 UTC (permalink / raw)
  To: Jann Horn, Rich Felker, Alexander Viro
  Cc: linux-fsdevel, kernel list, Linux API, Pavel Begunkov

On 8/31/20 9:46 AM, Jann Horn wrote:
> On Mon, Aug 31, 2020 at 5:32 PM Rich Felker <dalias@libc.org> wrote:
>> The pwrite function, originally defined by POSIX (thus the "p"), is
>> defined to ignore O_APPEND and write at the offset passed as its
>> argument. However, historically Linux honored O_APPEND if set and
>> ignored the offset. This cannot be changed due to stability policy,
>> but is documented in the man page as a bug.
>>
>> Now that there's a pwritev2 syscall providing a superset of the pwrite
>> functionality that has a flags argument, the conforming behavior can
>> be offered to userspace via a new flag. Since pwritev2 checks flag
>> validity (in kiocb_set_rw_flags) and reports unknown ones with
>> EOPNOTSUPP, callers will not get wrong behavior on old kernels that
>> don't support the new flag; the error is reported and the caller can
>> decide how to handle it.
>>
>> Signed-off-by: Rich Felker <dalias@libc.org>
> 
> Reviewed-by: Jann Horn <jannh@google.com>
> 
> Note that if this lands, Michael Kerrisk will probably be happy if you
> send a corresponding patch for the manpage man2/readv.2.
> 
> Btw, I'm not really sure whose tree this should go through - VFS is
> normally Al Viro's turf, but it looks like the most recent
> modifications to this function have gone through Jens Axboe's tree?

Should probably go through Al's tree, I've only carried them when
they've been associated with io_uring in some shape or form.

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31 17:05   ` Jens Axboe
@ 2024-01-18 15:57     ` Rich Felker
  2024-01-18 16:02       ` Jens Axboe
  0 siblings, 1 reply; 7+ messages in thread
From: Rich Felker @ 2024-01-18 15:57 UTC (permalink / raw)
  To: Jens Axboe
  Cc: Jann Horn, Alexander Viro, linux-fsdevel, kernel list, Linux API,
	Pavel Begunkov

On Mon, Aug 31, 2020 at 11:05:34AM -0600, Jens Axboe wrote:
> On 8/31/20 9:46 AM, Jann Horn wrote:
> > On Mon, Aug 31, 2020 at 5:32 PM Rich Felker <dalias@libc.org> wrote:
> >> The pwrite function, originally defined by POSIX (thus the "p"), is
> >> defined to ignore O_APPEND and write at the offset passed as its
> >> argument. However, historically Linux honored O_APPEND if set and
> >> ignored the offset. This cannot be changed due to stability policy,
> >> but is documented in the man page as a bug.
> >>
> >> Now that there's a pwritev2 syscall providing a superset of the pwrite
> >> functionality that has a flags argument, the conforming behavior can
> >> be offered to userspace via a new flag. Since pwritev2 checks flag
> >> validity (in kiocb_set_rw_flags) and reports unknown ones with
> >> EOPNOTSUPP, callers will not get wrong behavior on old kernels that
> >> don't support the new flag; the error is reported and the caller can
> >> decide how to handle it.
> >>
> >> Signed-off-by: Rich Felker <dalias@libc.org>
> > 
> > Reviewed-by: Jann Horn <jannh@google.com>
> > 
> > Note that if this lands, Michael Kerrisk will probably be happy if you
> > send a corresponding patch for the manpage man2/readv.2.
> > 
> > Btw, I'm not really sure whose tree this should go through - VFS is
> > normally Al Viro's turf, but it looks like the most recent
> > modifications to this function have gone through Jens Axboe's tree?
> 
> Should probably go through Al's tree, I've only carried them when
> they've been associated with io_uring in some shape or form.

This appears to have slipped through the cracks. Do I need to send an
updated rebase of it? Were there any objections to it I missed?

Rich

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
  2024-01-18 15:57     ` Rich Felker
@ 2024-01-18 16:02       ` Jens Axboe
  0 siblings, 0 replies; 7+ messages in thread
From: Jens Axboe @ 2024-01-18 16:02 UTC (permalink / raw)
  To: Rich Felker
  Cc: Jann Horn, Alexander Viro, linux-fsdevel, kernel list, Linux API,
	Pavel Begunkov, Christian Brauner

On 1/18/24 8:57 AM, Rich Felker wrote:
> On Mon, Aug 31, 2020 at 11:05:34AM -0600, Jens Axboe wrote:
>> On 8/31/20 9:46 AM, Jann Horn wrote:
>>> On Mon, Aug 31, 2020 at 5:32 PM Rich Felker <dalias@libc.org> wrote:
>>>> The pwrite function, originally defined by POSIX (thus the "p"), is
>>>> defined to ignore O_APPEND and write at the offset passed as its
>>>> argument. However, historically Linux honored O_APPEND if set and
>>>> ignored the offset. This cannot be changed due to stability policy,
>>>> but is documented in the man page as a bug.
>>>>
>>>> Now that there's a pwritev2 syscall providing a superset of the pwrite
>>>> functionality that has a flags argument, the conforming behavior can
>>>> be offered to userspace via a new flag. Since pwritev2 checks flag
>>>> validity (in kiocb_set_rw_flags) and reports unknown ones with
>>>> EOPNOTSUPP, callers will not get wrong behavior on old kernels that
>>>> don't support the new flag; the error is reported and the caller can
>>>> decide how to handle it.
>>>>
>>>> Signed-off-by: Rich Felker <dalias@libc.org>
>>>
>>> Reviewed-by: Jann Horn <jannh@google.com>
>>>
>>> Note that if this lands, Michael Kerrisk will probably be happy if you
>>> send a corresponding patch for the manpage man2/readv.2.
>>>
>>> Btw, I'm not really sure whose tree this should go through - VFS is
>>> normally Al Viro's turf, but it looks like the most recent
>>> modifications to this function have gone through Jens Axboe's tree?
>>
>> Should probably go through Al's tree, I've only carried them when
>> they've been associated with io_uring in some shape or form.
> 
> This appears to have slipped through the cracks. Do I need to send an
> updated rebase of it? Were there any objections to it I missed?

Let's add Christian.

-- 
Jens Axboe



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
  2020-08-31 15:32 [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker
  2020-08-31 15:46 ` Jann Horn
@ 2024-01-19 14:33 ` Christian Brauner
  2024-01-19 19:10   ` Rich Felker
  1 sibling, 1 reply; 7+ messages in thread
From: Christian Brauner @ 2024-01-19 14:33 UTC (permalink / raw)
  To: Rich Felker, Jens Axboe
  Cc: Christian Brauner, linux-fsdevel, kernel list, Linux API,
	Alexander Viro, Pavel Begunkov, Jann Horn

On Mon, 31 Aug 2020 11:32:08 -0400, Rich Felker wrote:
> The pwrite function, originally defined by POSIX (thus the "p"), is
> defined to ignore O_APPEND and write at the offset passed as its
> argument. However, historically Linux honored O_APPEND if set and
> ignored the offset. This cannot be changed due to stability policy,
> but is documented in the man page as a bug.
> 
> Now that there's a pwritev2 syscall providing a superset of the pwrite
> functionality that has a flags argument, the conforming behavior can
> be offered to userspace via a new flag. Since pwritev2 checks flag
> validity (in kiocb_set_rw_flags) and reports unknown ones with
> EOPNOTSUPP, callers will not get wrong behavior on old kernels that
> don't support the new flag; the error is reported and the caller can
> decide how to handle it.
> 
> [...]

The RWF_* and IOCB_* flags were
aligned so they could be set in one operation. So there was a merge
conflict when applying. I've resolved it. Please take a look and make
sure that it's all correct.

---

Applied to the vfs.misc branch of the vfs/vfs.git tree.
Patches in the vfs.misc branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc

[1/1] vfs: add RWF_NOAPPEND flag for pwritev2
      https://git.kernel.org/vfs/vfs/c/31081ab305a1

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2
  2024-01-19 14:33 ` Christian Brauner
@ 2024-01-19 19:10   ` Rich Felker
  0 siblings, 0 replies; 7+ messages in thread
From: Rich Felker @ 2024-01-19 19:10 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Jens Axboe, linux-fsdevel, kernel list, Linux API,
	Alexander Viro, Pavel Begunkov, Jann Horn

On Fri, Jan 19, 2024 at 03:33:32PM +0100, Christian Brauner wrote:
> On Mon, 31 Aug 2020 11:32:08 -0400, Rich Felker wrote:
> > The pwrite function, originally defined by POSIX (thus the "p"), is
> > defined to ignore O_APPEND and write at the offset passed as its
> > argument. However, historically Linux honored O_APPEND if set and
> > ignored the offset. This cannot be changed due to stability policy,
> > but is documented in the man page as a bug.
> > 
> > Now that there's a pwritev2 syscall providing a superset of the pwrite
> > functionality that has a flags argument, the conforming behavior can
> > be offered to userspace via a new flag. Since pwritev2 checks flag
> > validity (in kiocb_set_rw_flags) and reports unknown ones with
> > EOPNOTSUPP, callers will not get wrong behavior on old kernels that
> > don't support the new flag; the error is reported and the caller can
> > decide how to handle it.
> > 
> > [...]
> 
> The RWF_* and IOCB_* flags were
> aligned so they could be set in one operation. So there was a merge
> conflict when applying. I've resolved it. Please take a look and make
> sure that it's all correct.
> 
> ---
> 
> Applied to the vfs.misc branch of the vfs/vfs.git tree.
> Patches in the vfs.misc branch should appear in linux-next soon.
> 
> Please report any outstanding bugs that were missed during review in a
> new review to the original patch series allowing us to drop it.
> 
> It's encouraged to provide Acked-bys and Reviewed-bys even though the
> patch has now been applied. If possible patch trailers will be updated.
> 
> Note that commit hashes shown below are subject to change due to rebase,
> trailer updates or similar. If in doubt, please check the listed branch.
> 
> tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
> branch: vfs.misc
> 
> [1/1] vfs: add RWF_NOAPPEND flag for pwritev2
>       https://git.kernel.org/vfs/vfs/c/31081ab305a1

LGTM. Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2024-01-19 19:10 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-08-31 15:32 [PATCH v2] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker
2020-08-31 15:46 ` Jann Horn
2020-08-31 17:05   ` Jens Axboe
2024-01-18 15:57     ` Rich Felker
2024-01-18 16:02       ` Jens Axboe
2024-01-19 14:33 ` Christian Brauner
2024-01-19 19:10   ` Rich Felker

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).