* [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 @ 2020-08-29 2:00 Rich Felker 2020-08-30 15:05 ` Jann Horn 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2020-08-29 2:00 UTC (permalink / raw) To: linux-fsdevel, linux-kernel, linux-api; +Cc: Alexander Viro The pwrite function, originally defined by POSIX (thus the "p"), is defined to ignore O_APPEND and write at the offset passed as its argument. However, historically Linux honored O_APPEND if set and ignored the offset. This cannot be changed due to stability policy, but is documented in the man page as a bug. Now that there's a pwritev2 syscall providing a superset of the pwrite functionality that has a flags argument, the conforming behavior can be offered to userspace via a new flag. Since pwritev2 checks flag validity (in kiocb_set_rw_flags) and reports unknown ones with EOPNOTSUPP, callers will not get wrong behavior on old kernels that don't support the new flag; the error is reported and the caller can decide how to handle it. Signed-off-by: Rich Felker <dalias@libc.org> --- include/linux/fs.h | 4 ++++ include/uapi/linux/fs.h | 5 ++++- 2 files changed, 8 insertions(+), 1 deletion(-) diff --git a/include/linux/fs.h b/include/linux/fs.h index e0d909d35763..3a769a972f79 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3397,6 +3397,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) { if (unlikely(flags & ~RWF_SUPPORTED)) return -EOPNOTSUPP; + if (unlikely((flags & RWF_APPEND) && (flags & RWF_NOAPPEND))) + return -EINVAL; if (flags & RWF_NOWAIT) { if (!(ki->ki_filp->f_mode & FMODE_NOWAIT)) @@ -3411,6 +3413,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); if (flags & RWF_APPEND) ki->ki_flags |= IOCB_APPEND; + if (flags & RWF_NOAPPEND) + ki->ki_flags &= ~IOCB_APPEND; return 0; } diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h index 379a612f8f1d..591357d9b3c9 100644 --- a/include/uapi/linux/fs.h +++ b/include/uapi/linux/fs.h @@ -299,8 +299,11 @@ typedef int __bitwise __kernel_rwf_t; /* per-IO O_APPEND */ #define RWF_APPEND ((__force __kernel_rwf_t)0x00000010) +/* per-IO negation of O_APPEND */ +#define RWF_NOAPPEND ((__force __kernel_rwf_t)0x00000020) + /* mask of flags supported by the kernel */ #define RWF_SUPPORTED (RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\ - RWF_APPEND) + RWF_APPEND | RWF_NOAPPEND) #endif /* _UAPI_LINUX_FS_H */ -- 2.21.0 ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-29 2:00 [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker @ 2020-08-30 15:05 ` Jann Horn 2020-08-30 16:36 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread From: Jann Horn @ 2020-08-30 15:05 UTC (permalink / raw) To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote: > The pwrite function, originally defined by POSIX (thus the "p"), is > defined to ignore O_APPEND and write at the offset passed as its > argument. However, historically Linux honored O_APPEND if set and > ignored the offset. This cannot be changed due to stability policy, > but is documented in the man page as a bug. > > Now that there's a pwritev2 syscall providing a superset of the pwrite > functionality that has a flags argument, the conforming behavior can > be offered to userspace via a new flag. [...] > diff --git a/include/linux/fs.h b/include/linux/fs.h [...] > @@ -3411,6 +3413,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) > ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); > if (flags & RWF_APPEND) > ki->ki_flags |= IOCB_APPEND; > + if (flags & RWF_NOAPPEND) > + ki->ki_flags &= ~IOCB_APPEND; > return 0; > } Linux enforces the S_APPEND flag (set by "chattr +a") only at open() time, not at write() time: # touch testfile # exec 100>testfile # echo foo > testfile # cat testfile foo # chattr +a testfile # echo bar > testfile bash: testfile: Operation not permitted # echo bar >&100 # cat testfile bar # At open() time, the kernel enforces that you can't use O_WRONLY/O_RDWR without also setting O_APPEND if the file is marked as append-only: static int may_open(const struct path *path, int acc_mode, int flag) { [...] /* * An append-only file must be opened in append mode for writing. */ if (IS_APPEND(inode)) { if ((flag & O_ACCMODE) != O_RDONLY && !(flag & O_APPEND)) return -EPERM; if (flag & O_TRUNC) return -EPERM; } [...] } It seems to me like your patch will permit bypassing S_APPEND by opening an append-only file with O_WRONLY|O_APPEND, then calling pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra check for IS_APPEND() somewhere. One could also argue that if an O_APPEND file descriptor is handed across privilege boundaries, a programmer might reasonably expect that the recipient will not be able to use the file descriptor for non-append writes; if that is not actually true, that should probably be noted in the open.2 manpage, at the end of the description of O_APPEND. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-30 15:05 ` Jann Horn @ 2020-08-30 16:36 ` Rich Felker 2020-08-30 18:31 ` Jann Horn 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2020-08-30 16:36 UTC (permalink / raw) To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Sun, Aug 30, 2020 at 05:05:45PM +0200, Jann Horn wrote: > On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote: > > The pwrite function, originally defined by POSIX (thus the "p"), is > > defined to ignore O_APPEND and write at the offset passed as its > > argument. However, historically Linux honored O_APPEND if set and > > ignored the offset. This cannot be changed due to stability policy, > > but is documented in the man page as a bug. > > > > Now that there's a pwritev2 syscall providing a superset of the pwrite > > functionality that has a flags argument, the conforming behavior can > > be offered to userspace via a new flag. > [...] > > diff --git a/include/linux/fs.h b/include/linux/fs.h > [...] > > @@ -3411,6 +3413,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) > > ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); > > if (flags & RWF_APPEND) > > ki->ki_flags |= IOCB_APPEND; > > + if (flags & RWF_NOAPPEND) > > + ki->ki_flags &= ~IOCB_APPEND; > > return 0; > > } > > Linux enforces the S_APPEND flag (set by "chattr +a") only at open() > time, not at write() time: > > # touch testfile > # exec 100>testfile > # echo foo > testfile > # cat testfile > foo > # chattr +a testfile > # echo bar > testfile > bash: testfile: Operation not permitted > # echo bar >&100 > # cat testfile > bar > # > > At open() time, the kernel enforces that you can't use O_WRONLY/O_RDWR > without also setting O_APPEND if the file is marked as append-only: > > static int may_open(const struct path *path, int acc_mode, int flag) > { > [...] > /* > * An append-only file must be opened in append mode for writing. > */ > if (IS_APPEND(inode)) { > if ((flag & O_ACCMODE) != O_RDONLY && !(flag & O_APPEND)) > return -EPERM; > if (flag & O_TRUNC) > return -EPERM; > } > [...] > } > > It seems to me like your patch will permit bypassing S_APPEND by > opening an append-only file with O_WRONLY|O_APPEND, then calling > pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra > check for IS_APPEND() somewhere. > > > One could also argue that if an O_APPEND file descriptor is handed > across privilege boundaries, a programmer might reasonably expect that > the recipient will not be able to use the file descriptor for > non-append writes; if that is not actually true, that should probably > be noted in the open.2 manpage, at the end of the description of > O_APPEND. fcntl F_SETFL can remove O_APPEND, so it is not a security boundary. I'm not sure how this interacts with S_APPEND; presumably fcntl rechecks it. So just checking IS_APPEND in the code paths used by pwritev2 (and erroring out rather than silently writing output at the wrong place) should suffice to preserve all existing security invariants. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-30 16:36 ` Rich Felker @ 2020-08-30 18:31 ` Jann Horn 2020-08-30 18:43 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread From: Jann Horn @ 2020-08-30 18:31 UTC (permalink / raw) To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > On Sun, Aug 30, 2020 at 05:05:45PM +0200, Jann Horn wrote: > > On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote: > > > The pwrite function, originally defined by POSIX (thus the "p"), is > > > defined to ignore O_APPEND and write at the offset passed as its > > > argument. However, historically Linux honored O_APPEND if set and > > > ignored the offset. This cannot be changed due to stability policy, > > > but is documented in the man page as a bug. > > > > > > Now that there's a pwritev2 syscall providing a superset of the pwrite > > > functionality that has a flags argument, the conforming behavior can > > > be offered to userspace via a new flag. [...] > > Linux enforces the S_APPEND flag (set by "chattr +a") only at open() > > time, not at write() time: [...] > > It seems to me like your patch will permit bypassing S_APPEND by > > opening an append-only file with O_WRONLY|O_APPEND, then calling > > pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra > > check for IS_APPEND() somewhere. > > > > > > One could also argue that if an O_APPEND file descriptor is handed > > across privilege boundaries, a programmer might reasonably expect that > > the recipient will not be able to use the file descriptor for > > non-append writes; if that is not actually true, that should probably > > be noted in the open.2 manpage, at the end of the description of > > O_APPEND. > > fcntl F_SETFL can remove O_APPEND, so it is not a security boundary. > I'm not sure how this interacts with S_APPEND; presumably fcntl > rechecks it. Ah, good point, you're right. In fs/fcntl.c: 35 static int setfl(int fd, struct file * filp, unsigned long arg) 36 { 37 struct inode * inode = file_inode(filp); 38 int error = 0; 39 40 /* 41 * O_APPEND cannot be cleared if the file is marked as append-only 42 * and the file is open for write. 43 */ 44 if (((arg ^ filp->f_flags) & O_APPEND) && IS_APPEND(inode)) 45 return -EPERM; > So just checking IS_APPEND in the code paths used by > pwritev2 (and erroring out rather than silently writing output at the > wrong place) should suffice to preserve all existing security > invariants. Makes sense. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-30 18:31 ` Jann Horn @ 2020-08-30 18:43 ` Rich Felker 2020-08-30 19:02 ` Jann Horn 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2020-08-30 18:43 UTC (permalink / raw) To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote: > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > > On Sun, Aug 30, 2020 at 05:05:45PM +0200, Jann Horn wrote: > > > On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote: > > > > The pwrite function, originally defined by POSIX (thus the "p"), is > > > > defined to ignore O_APPEND and write at the offset passed as its > > > > argument. However, historically Linux honored O_APPEND if set and > > > > ignored the offset. This cannot be changed due to stability policy, > > > > but is documented in the man page as a bug. > > > > > > > > Now that there's a pwritev2 syscall providing a superset of the pwrite > > > > functionality that has a flags argument, the conforming behavior can > > > > be offered to userspace via a new flag. > [...] > > > Linux enforces the S_APPEND flag (set by "chattr +a") only at open() > > > time, not at write() time: > [...] > > > It seems to me like your patch will permit bypassing S_APPEND by > > > opening an append-only file with O_WRONLY|O_APPEND, then calling > > > pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra > > > check for IS_APPEND() somewhere. > > > > > > > > > One could also argue that if an O_APPEND file descriptor is handed > > > across privilege boundaries, a programmer might reasonably expect that > > > the recipient will not be able to use the file descriptor for > > > non-append writes; if that is not actually true, that should probably > > > be noted in the open.2 manpage, at the end of the description of > > > O_APPEND. > > > > fcntl F_SETFL can remove O_APPEND, so it is not a security boundary. > > I'm not sure how this interacts with S_APPEND; presumably fcntl > > rechecks it. > > Ah, good point, you're right. In fs/fcntl.c: > > 35 static int setfl(int fd, struct file * filp, unsigned long arg) > 36 { > 37 struct inode * inode = file_inode(filp); > 38 int error = 0; > 39 > 40 /* > 41 * O_APPEND cannot be cleared if the file is marked as append-only > 42 * and the file is open for write. > 43 */ > 44 if (((arg ^ filp->f_flags) & O_APPEND) && IS_APPEND(inode)) > 45 return -EPERM; FWIW I think this check is mildly wrong; it seems to disallow *adding* O_APPEND if the file became chattr +a after it was opened. It should probably be changed to only disallow removal. > > So just checking IS_APPEND in the code paths used by > > pwritev2 (and erroring out rather than silently writing output at the > > wrong place) should suffice to preserve all existing security > > invariants. > > Makes sense. There are 3 places where kiocb_set_rw_flags is called with flags that seem to be controlled by userspace: aio.c, io_uring.c, and read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if the underlying inode is S_APPEND. To avoid repeating the same logic in an error-prone way, should kiocb_set_rw_flags's signature be updated to take the filp so that it can obtain the inode and check IS_APPEND before accepting RWF_NOAPPEND? It's inline so this should avoid actually loading anything except in the codepath where flags&RWF_NOAPPEND is nonzero. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-30 18:43 ` Rich Felker @ 2020-08-30 19:02 ` Jann Horn 2020-08-30 20:00 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread From: Jann Horn @ 2020-08-30 19:02 UTC (permalink / raw) To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote: > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote: > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > > > On Sun, Aug 30, 2020 at 05:05:45PM +0200, Jann Horn wrote: > > > > On Sat, Aug 29, 2020 at 4:00 AM Rich Felker <dalias@libc.org> wrote: > > > > > The pwrite function, originally defined by POSIX (thus the "p"), is > > > > > defined to ignore O_APPEND and write at the offset passed as its > > > > > argument. However, historically Linux honored O_APPEND if set and > > > > > ignored the offset. This cannot be changed due to stability policy, > > > > > but is documented in the man page as a bug. > > > > > > > > > > Now that there's a pwritev2 syscall providing a superset of the pwrite > > > > > functionality that has a flags argument, the conforming behavior can > > > > > be offered to userspace via a new flag. > > [...] > > > > Linux enforces the S_APPEND flag (set by "chattr +a") only at open() > > > > time, not at write() time: > > [...] > > > > It seems to me like your patch will permit bypassing S_APPEND by > > > > opening an append-only file with O_WRONLY|O_APPEND, then calling > > > > pwritev2() with RWF_NOAPPEND? I think you'll have to add an extra > > > > check for IS_APPEND() somewhere. > > > > > > > > > > > > One could also argue that if an O_APPEND file descriptor is handed > > > > across privilege boundaries, a programmer might reasonably expect that > > > > the recipient will not be able to use the file descriptor for > > > > non-append writes; if that is not actually true, that should probably > > > > be noted in the open.2 manpage, at the end of the description of > > > > O_APPEND. > > > > > > fcntl F_SETFL can remove O_APPEND, so it is not a security boundary. > > > I'm not sure how this interacts with S_APPEND; presumably fcntl > > > rechecks it. > > > > Ah, good point, you're right. In fs/fcntl.c: > > > > 35 static int setfl(int fd, struct file * filp, unsigned long arg) > > 36 { > > 37 struct inode * inode = file_inode(filp); > > 38 int error = 0; > > 39 > > 40 /* > > 41 * O_APPEND cannot be cleared if the file is marked as append-only > > 42 * and the file is open for write. > > 43 */ > > 44 if (((arg ^ filp->f_flags) & O_APPEND) && IS_APPEND(inode)) > > 45 return -EPERM; > > FWIW I think this check is mildly wrong; it seems to disallow *adding* > O_APPEND if the file became chattr +a after it was opened. It should > probably be changed to only disallow removal. Yeah... > > > So just checking IS_APPEND in the code paths used by > > > pwritev2 (and erroring out rather than silently writing output at the > > > wrong place) should suffice to preserve all existing security > > > invariants. > > > > Makes sense. > > There are 3 places where kiocb_set_rw_flags is called with flags that > seem to be controlled by userspace: aio.c, io_uring.c, and > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if > the underlying inode is S_APPEND. To avoid repeating the same logic in > an error-prone way, should kiocb_set_rw_flags's signature be updated > to take the filp so that it can obtain the inode and check IS_APPEND > before accepting RWF_NOAPPEND? It's inline so this should avoid > actually loading anything except in the codepath where > flags&RWF_NOAPPEND is nonzero. You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT branch of kiocb_set_rw_flags(). ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-30 19:02 ` Jann Horn @ 2020-08-30 20:00 ` Rich Felker 2020-08-31 1:15 ` Jann Horn 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2020-08-30 20:00 UTC (permalink / raw) To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote: > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote: > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote: > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > > > > So just checking IS_APPEND in the code paths used by > > > > pwritev2 (and erroring out rather than silently writing output at the > > > > wrong place) should suffice to preserve all existing security > > > > invariants. > > > > > > Makes sense. > > > > There are 3 places where kiocb_set_rw_flags is called with flags that > > seem to be controlled by userspace: aio.c, io_uring.c, and > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if > > the underlying inode is S_APPEND. To avoid repeating the same logic in > > an error-prone way, should kiocb_set_rw_flags's signature be updated > > to take the filp so that it can obtain the inode and check IS_APPEND > > before accepting RWF_NOAPPEND? It's inline so this should avoid > > actually loading anything except in the codepath where > > flags&RWF_NOAPPEND is nonzero. > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT > branch of kiocb_set_rw_flags(). Thanks. I should have looked for that. OK, so a fixup like this on top of the existing patch? diff --git a/include/linux/fs.h b/include/linux/fs.h index 473289bff4c6..674131e8d139 100644 --- a/include/linux/fs.h +++ b/include/linux/fs.h @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); if (flags & RWF_APPEND) ki->ki_flags |= IOCB_APPEND; - if (flags & RWF_NOAPPEND) + if (flags & RWF_NOAPPEND) { + if (IS_APPEND(file_inode(ki->ki_filp))) + return -EPERM; ki->ki_flags &= ~IOCB_APPEND; + } return 0; } If this is good I'll submit a v2 as the above squashed with the original patch. Rich ^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-30 20:00 ` Rich Felker @ 2020-08-31 1:15 ` Jann Horn 2020-08-31 1:46 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread From: Jann Horn @ 2020-08-31 1:15 UTC (permalink / raw) To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote: > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote: > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote: > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote: > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > > > > > So just checking IS_APPEND in the code paths used by > > > > > pwritev2 (and erroring out rather than silently writing output at the > > > > > wrong place) should suffice to preserve all existing security > > > > > invariants. > > > > > > > > Makes sense. > > > > > > There are 3 places where kiocb_set_rw_flags is called with flags that > > > seem to be controlled by userspace: aio.c, io_uring.c, and > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if > > > the underlying inode is S_APPEND. To avoid repeating the same logic in > > > an error-prone way, should kiocb_set_rw_flags's signature be updated > > > to take the filp so that it can obtain the inode and check IS_APPEND > > > before accepting RWF_NOAPPEND? It's inline so this should avoid > > > actually loading anything except in the codepath where > > > flags&RWF_NOAPPEND is nonzero. > > > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT > > branch of kiocb_set_rw_flags(). > > Thanks. I should have looked for that. OK, so a fixup like this on top > of the existing patch? > > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 473289bff4c6..674131e8d139 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) > ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); > if (flags & RWF_APPEND) > ki->ki_flags |= IOCB_APPEND; > - if (flags & RWF_NOAPPEND) > + if (flags & RWF_NOAPPEND) { > + if (IS_APPEND(file_inode(ki->ki_filp))) > + return -EPERM; > ki->ki_flags &= ~IOCB_APPEND; > + } > return 0; > } > > If this is good I'll submit a v2 as the above squashed with the > original patch. Looks good to me. ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-31 1:15 ` Jann Horn @ 2020-08-31 1:46 ` Rich Felker 2020-08-31 9:15 ` Jann Horn 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2020-08-31 1:46 UTC (permalink / raw) To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Mon, Aug 31, 2020 at 03:15:04AM +0200, Jann Horn wrote: > On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote: > > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote: > > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote: > > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote: > > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > > > > > > So just checking IS_APPEND in the code paths used by > > > > > > pwritev2 (and erroring out rather than silently writing output at the > > > > > > wrong place) should suffice to preserve all existing security > > > > > > invariants. > > > > > > > > > > Makes sense. > > > > > > > > There are 3 places where kiocb_set_rw_flags is called with flags that > > > > seem to be controlled by userspace: aio.c, io_uring.c, and > > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if > > > > the underlying inode is S_APPEND. To avoid repeating the same logic in > > > > an error-prone way, should kiocb_set_rw_flags's signature be updated > > > > to take the filp so that it can obtain the inode and check IS_APPEND > > > > before accepting RWF_NOAPPEND? It's inline so this should avoid > > > > actually loading anything except in the codepath where > > > > flags&RWF_NOAPPEND is nonzero. > > > > > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT > > > branch of kiocb_set_rw_flags(). > > > > Thanks. I should have looked for that. OK, so a fixup like this on top > > of the existing patch? > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > index 473289bff4c6..674131e8d139 100644 > > --- a/include/linux/fs.h > > +++ b/include/linux/fs.h > > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) > > ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); > > if (flags & RWF_APPEND) > > ki->ki_flags |= IOCB_APPEND; > > - if (flags & RWF_NOAPPEND) > > + if (flags & RWF_NOAPPEND) { > > + if (IS_APPEND(file_inode(ki->ki_filp))) > > + return -EPERM; > > ki->ki_flags &= ~IOCB_APPEND; > > + } > > return 0; > > } > > > > If this is good I'll submit a v2 as the above squashed with the > > original patch. > > Looks good to me. Actually it's not quite. I think it should be: if ((flags & RWF_NOAPPEND) & (ki->ki_flags & IOCB_APPEND)) { if (IS_APPEND(file_inode(ki->ki_filp))) return -EPERM; ki->ki_flags &= ~IOCB_APPEND; } i.e. don't refuse RWF_NOAPPEND on a file that was already successfully opened without O_APPEND that only subsequently got chattr +a. The permission check should only be done if it's overriding the default action for how the file is open. This is actually related to the fcntl corner case mentioned before. Are you ok with this change? If so I'll go ahead and prepare a v2. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-31 1:46 ` Rich Felker @ 2020-08-31 9:15 ` Jann Horn 2020-08-31 12:57 ` Rich Felker 0 siblings, 1 reply; 12+ messages in thread From: Jann Horn @ 2020-08-31 9:15 UTC (permalink / raw) To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Mon, Aug 31, 2020 at 3:46 AM Rich Felker <dalias@libc.org> wrote: > On Mon, Aug 31, 2020 at 03:15:04AM +0200, Jann Horn wrote: > > On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote: > > > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote: > > > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote: > > > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote: > > > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > > > > > > > So just checking IS_APPEND in the code paths used by > > > > > > > pwritev2 (and erroring out rather than silently writing output at the > > > > > > > wrong place) should suffice to preserve all existing security > > > > > > > invariants. > > > > > > > > > > > > Makes sense. > > > > > > > > > > There are 3 places where kiocb_set_rw_flags is called with flags that > > > > > seem to be controlled by userspace: aio.c, io_uring.c, and > > > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if > > > > > the underlying inode is S_APPEND. To avoid repeating the same logic in > > > > > an error-prone way, should kiocb_set_rw_flags's signature be updated > > > > > to take the filp so that it can obtain the inode and check IS_APPEND > > > > > before accepting RWF_NOAPPEND? It's inline so this should avoid > > > > > actually loading anything except in the codepath where > > > > > flags&RWF_NOAPPEND is nonzero. > > > > > > > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT > > > > branch of kiocb_set_rw_flags(). > > > > > > Thanks. I should have looked for that. OK, so a fixup like this on top > > > of the existing patch? > > > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > > index 473289bff4c6..674131e8d139 100644 > > > --- a/include/linux/fs.h > > > +++ b/include/linux/fs.h > > > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) > > > ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); > > > if (flags & RWF_APPEND) > > > ki->ki_flags |= IOCB_APPEND; > > > - if (flags & RWF_NOAPPEND) > > > + if (flags & RWF_NOAPPEND) { > > > + if (IS_APPEND(file_inode(ki->ki_filp))) > > > + return -EPERM; > > > ki->ki_flags &= ~IOCB_APPEND; > > > + } > > > return 0; > > > } > > > > > > If this is good I'll submit a v2 as the above squashed with the > > > original patch. > > > > Looks good to me. > > Actually it's not quite. I think it should be: > > if ((flags & RWF_NOAPPEND) & (ki->ki_flags & IOCB_APPEND)) { > if (IS_APPEND(file_inode(ki->ki_filp))) > return -EPERM; > ki->ki_flags &= ~IOCB_APPEND; > } > > i.e. don't refuse RWF_NOAPPEND on a file that was already successfully > opened without O_APPEND that only subsequently got chattr +a. The > permission check should only be done if it's overriding the default > action for how the file is open. > > This is actually related to the fcntl corner case mentioned before. > > Are you ok with this change? If so I'll go ahead and prepare a v2. Ah, yeah, I guess that makes sense to keep things more consistent. (You'll have to write that as "(flags & RWF_NOAPPEND) && (ki->ki_flags & IOCB_APPEND)" though (logical AND, not bitwise AND).) ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-31 9:15 ` Jann Horn @ 2020-08-31 12:57 ` Rich Felker 2020-08-31 13:12 ` Jann Horn 0 siblings, 1 reply; 12+ messages in thread From: Rich Felker @ 2020-08-31 12:57 UTC (permalink / raw) To: Jann Horn; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Mon, Aug 31, 2020 at 11:15:57AM +0200, Jann Horn wrote: > On Mon, Aug 31, 2020 at 3:46 AM Rich Felker <dalias@libc.org> wrote: > > On Mon, Aug 31, 2020 at 03:15:04AM +0200, Jann Horn wrote: > > > On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote: > > > > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote: > > > > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote: > > > > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote: > > > > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > > > > > > > > So just checking IS_APPEND in the code paths used by > > > > > > > > pwritev2 (and erroring out rather than silently writing output at the > > > > > > > > wrong place) should suffice to preserve all existing security > > > > > > > > invariants. > > > > > > > > > > > > > > Makes sense. > > > > > > > > > > > > There are 3 places where kiocb_set_rw_flags is called with flags that > > > > > > seem to be controlled by userspace: aio.c, io_uring.c, and > > > > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if > > > > > > the underlying inode is S_APPEND. To avoid repeating the same logic in > > > > > > an error-prone way, should kiocb_set_rw_flags's signature be updated > > > > > > to take the filp so that it can obtain the inode and check IS_APPEND > > > > > > before accepting RWF_NOAPPEND? It's inline so this should avoid > > > > > > actually loading anything except in the codepath where > > > > > > flags&RWF_NOAPPEND is nonzero. > > > > > > > > > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT > > > > > branch of kiocb_set_rw_flags(). > > > > > > > > Thanks. I should have looked for that. OK, so a fixup like this on top > > > > of the existing patch? > > > > > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > > > index 473289bff4c6..674131e8d139 100644 > > > > --- a/include/linux/fs.h > > > > +++ b/include/linux/fs.h > > > > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) > > > > ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); > > > > if (flags & RWF_APPEND) > > > > ki->ki_flags |= IOCB_APPEND; > > > > - if (flags & RWF_NOAPPEND) > > > > + if (flags & RWF_NOAPPEND) { > > > > + if (IS_APPEND(file_inode(ki->ki_filp))) > > > > + return -EPERM; > > > > ki->ki_flags &= ~IOCB_APPEND; > > > > + } > > > > return 0; > > > > } > > > > > > > > If this is good I'll submit a v2 as the above squashed with the > > > > original patch. > > > > > > Looks good to me. > > > > Actually it's not quite. I think it should be: > > > > if ((flags & RWF_NOAPPEND) & (ki->ki_flags & IOCB_APPEND)) { > > if (IS_APPEND(file_inode(ki->ki_filp))) > > return -EPERM; > > ki->ki_flags &= ~IOCB_APPEND; > > } > > > > i.e. don't refuse RWF_NOAPPEND on a file that was already successfully > > opened without O_APPEND that only subsequently got chattr +a. The > > permission check should only be done if it's overriding the default > > action for how the file is open. > > > > This is actually related to the fcntl corner case mentioned before. > > > > Are you ok with this change? If so I'll go ahead and prepare a v2. > > Ah, yeah, I guess that makes sense to keep things more consistent. > > (You'll have to write that as "(flags & RWF_NOAPPEND) && (ki->ki_flags > & IOCB_APPEND)" though (logical AND, not bitwise AND).) Thanks, no idea how I made that mistake -- probably typing code in email. While reparing this I rebased against Linus's tree, and found conflicts with 1752f0adea98ef85 which were easy to resolve. Unfortunately the same improvement made in that commit does not work for the new RWF_NOAPPEND, since it needs to inspect and mask bits off the original ki_flags, not the local set of added flags, but the penalty should be isolated to this branch. I'm not opposed to adding unlikely() around it if you think that would help codegen for the common cases. Alternatively, kiocb_flags could be initialized to ki->ki_flags, with assignment-back in place of |= at the end of the function. This might be more elegant but I'm not sure if the emitted code would improve. Rich ^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 2020-08-31 12:57 ` Rich Felker @ 2020-08-31 13:12 ` Jann Horn 0 siblings, 0 replies; 12+ messages in thread From: Jann Horn @ 2020-08-31 13:12 UTC (permalink / raw) To: Rich Felker; +Cc: linux-fsdevel, kernel list, Linux API, Alexander Viro On Mon, Aug 31, 2020 at 2:57 PM Rich Felker <dalias@libc.org> wrote: > On Mon, Aug 31, 2020 at 11:15:57AM +0200, Jann Horn wrote: > > On Mon, Aug 31, 2020 at 3:46 AM Rich Felker <dalias@libc.org> wrote: > > > On Mon, Aug 31, 2020 at 03:15:04AM +0200, Jann Horn wrote: > > > > On Sun, Aug 30, 2020 at 10:00 PM Rich Felker <dalias@libc.org> wrote: > > > > > On Sun, Aug 30, 2020 at 09:02:31PM +0200, Jann Horn wrote: > > > > > > On Sun, Aug 30, 2020 at 8:43 PM Rich Felker <dalias@libc.org> wrote: > > > > > > > On Sun, Aug 30, 2020 at 08:31:36PM +0200, Jann Horn wrote: > > > > > > > > On Sun, Aug 30, 2020 at 6:36 PM Rich Felker <dalias@libc.org> wrote: > > > > > > > > > So just checking IS_APPEND in the code paths used by > > > > > > > > > pwritev2 (and erroring out rather than silently writing output at the > > > > > > > > > wrong place) should suffice to preserve all existing security > > > > > > > > > invariants. > > > > > > > > > > > > > > > > Makes sense. > > > > > > > > > > > > > > There are 3 places where kiocb_set_rw_flags is called with flags that > > > > > > > seem to be controlled by userspace: aio.c, io_uring.c, and > > > > > > > read_write.c. Presumably each needs to EPERM out on RWF_NOAPPEND if > > > > > > > the underlying inode is S_APPEND. To avoid repeating the same logic in > > > > > > > an error-prone way, should kiocb_set_rw_flags's signature be updated > > > > > > > to take the filp so that it can obtain the inode and check IS_APPEND > > > > > > > before accepting RWF_NOAPPEND? It's inline so this should avoid > > > > > > > actually loading anything except in the codepath where > > > > > > > flags&RWF_NOAPPEND is nonzero. > > > > > > > > > > > > You can get the file pointer from ki->ki_filp. See the RWF_NOWAIT > > > > > > branch of kiocb_set_rw_flags(). > > > > > > > > > > Thanks. I should have looked for that. OK, so a fixup like this on top > > > > > of the existing patch? > > > > > > > > > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > > > > index 473289bff4c6..674131e8d139 100644 > > > > > --- a/include/linux/fs.h > > > > > +++ b/include/linux/fs.h > > > > > @@ -3457,8 +3457,11 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags) > > > > > ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC); > > > > > if (flags & RWF_APPEND) > > > > > ki->ki_flags |= IOCB_APPEND; > > > > > - if (flags & RWF_NOAPPEND) > > > > > + if (flags & RWF_NOAPPEND) { > > > > > + if (IS_APPEND(file_inode(ki->ki_filp))) > > > > > + return -EPERM; > > > > > ki->ki_flags &= ~IOCB_APPEND; > > > > > + } > > > > > return 0; > > > > > } > > > > > > > > > > If this is good I'll submit a v2 as the above squashed with the > > > > > original patch. > > > > > > > > Looks good to me. > > > > > > Actually it's not quite. I think it should be: > > > > > > if ((flags & RWF_NOAPPEND) & (ki->ki_flags & IOCB_APPEND)) { > > > if (IS_APPEND(file_inode(ki->ki_filp))) > > > return -EPERM; > > > ki->ki_flags &= ~IOCB_APPEND; > > > } > > > > > > i.e. don't refuse RWF_NOAPPEND on a file that was already successfully > > > opened without O_APPEND that only subsequently got chattr +a. The > > > permission check should only be done if it's overriding the default > > > action for how the file is open. > > > > > > This is actually related to the fcntl corner case mentioned before. [...] > While reparing this I rebased against Linus's tree, and found > conflicts with 1752f0adea98ef85 which were easy to resolve. > Unfortunately the same improvement made in that commit does not work > for the new RWF_NOAPPEND, since it needs to inspect and mask bits off > the original ki_flags, not the local set of added flags, but the > penalty should be isolated to this branch. I'm not opposed to adding > unlikely() around it if you think that would help codegen for the > common cases. > > Alternatively, kiocb_flags could be initialized to ki->ki_flags, with > assignment-back in place of |= at the end of the function. This might > be more elegant but I'm not sure if the emitted code would improve. Presumably RWF_NOAPPEND would be somewhat rare, and a simple comparison and not-taken branch should be really cheap? I'm not really concerned about it. I guess you can CC the author of that patch on your v2. ^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2020-08-31 13:13 UTC | newest] Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2020-08-29 2:00 [RESEND PATCH] vfs: add RWF_NOAPPEND flag for pwritev2 Rich Felker 2020-08-30 15:05 ` Jann Horn 2020-08-30 16:36 ` Rich Felker 2020-08-30 18:31 ` Jann Horn 2020-08-30 18:43 ` Rich Felker 2020-08-30 19:02 ` Jann Horn 2020-08-30 20:00 ` Rich Felker 2020-08-31 1:15 ` Jann Horn 2020-08-31 1:46 ` Rich Felker 2020-08-31 9:15 ` Jann Horn 2020-08-31 12:57 ` Rich Felker 2020-08-31 13:12 ` Jann Horn
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).