* [RFC] unify the file-closing stuff in fs/file.c
@ 2022-05-12 21:20 Al Viro
2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
` (2 more replies)
0 siblings, 3 replies; 10+ messages in thread
From: Al Viro @ 2022-05-12 21:20 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Christian Brauner, Jens Axboe, Todd Kjos, Giuseppe Scrivano
Right now we have two places that do such removals - pick_file()
and {__,}close_fd_get_file().
They are almost identical - the only difference is in calling
conventions (well, and the fact that __... is called with descriptor
table locked).
Calling conventions are... interesting.
1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL).
The latter is for "descriptor is greater than size of descriptor table".
One of the callers treats all ERR_PTR(...) as "return -EBADF"; another
uses ERR_PTR(-EINVAL) as "end the loop now" indicator.
2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL)
passed to caller by way of struct file ** argument. One of the callers
(binder) ignores the return value completely and checks if the file is NULL.
Another (io_uring) checks for return value being negative, then maps
-ENOENT to -EBADF, not that any other value would be possible.
ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible;
{__,}close_fd_get_file() conventions are insane. The older caller
(in binder) had never even looked at return value; the newer one
patches the bogus -ENOENT to what it wants to report, with strange
"defensive" BS logics just in case __close_fd_get_file() would somehow
find a different error to report.
At the very least, {__,}close_fd_get_file() callers would've been happier
if it just returned file or NULL. What's more, I'm seriously tempted
to make pick_file() do the same thing. close_fd() won't care (checking
for NULL is just as easy as for IS_ERR) and __range_close() could just
as well cap the max_fd argument with last_fd(files_fdtable(current->files)).
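
For illustration only, here is a userspace sketch of the "file or NULL" convention argued for above. All toy_* names are hypothetical stand-ins, not kernel code, and the toy has no locking or RCU. It only shows that a close_fd()-style caller needs nothing more than a NULL check once the out-of-range and not-open cases collapse into one result.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-ins for struct file and struct fdtable (hypothetical). */
struct toy_file {
	int id;
};

struct toy_fdtable {
	unsigned int max_fds;
	struct toy_file **fd;	/* a slot is NULL when the descriptor is not open */
};

/*
 * Detach and return the file in slot @fd, or NULL if @fd is past the end
 * of the table or the slot is empty. The kernel version expects the caller
 * to hold the descriptor table lock; this toy does no locking at all.
 */
static struct toy_file *toy_pick_file(struct toy_fdtable *fdt, unsigned int fd)
{
	struct toy_file *file;

	assert(fdt != NULL);
	if (fd >= fdt->max_fds)
		return NULL;

	file = fdt->fd[fd];
	if (file)
		fdt->fd[fd] = NULL;
	return file;
}

/* Models close_fd(): a single NULL check replaces the IS_ERR() test. */
static int toy_close_fd(struct toy_fdtable *fdt, unsigned int fd)
{
	struct toy_file *file = toy_pick_file(fdt, fd);

	if (!file)
		return -EBADF;
	return 0;	/* the kernel would return filp_close(file, files) here */
}
```

Closing an open slot succeeds once; a second attempt on the same slot, or any descriptor beyond max_fds, yields -EBADF through the same code path.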
Does anybody see problems with the following?
commit 8819510a641800a63ab10d6b5ab283cada1cbd50
Author: Al Viro <viro@zeniv.linux.org.uk>
Date: Thu May 12 17:08:03 2022 -0400
Unify the primitives for file descriptor closing
Currently we have 3 primitives for removing an opened file from descriptor
table - pick_file(), __close_fd_get_file() and close_fd_get_file(). Their
calling conventions are rather odd and there's a code duplication for no
good reason. They can be unified -
1) have __range_close() cap max_fd in the very beginning; that way
we don't need separate way for pick_file() to report being past the end
of descriptor table.
2) make {__,}close_fd_get_file() return file (or NULL) directly, rather
than returning it via struct file ** argument. Don't bother with
(bogus) return value - nobody wants that -ENOENT.
3) make pick_file() return NULL on unopened descriptor - the only caller
that used to care about the distinction between descriptor past the end
of descriptor table and finding NULL in descriptor table doesn't give
a damn after (1).
4) lift ->file_lock out of pick_file()
That actually simplifies the callers, as well as the primitives themselves.
Code duplication is also gone...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 8351c5638880..27c9b004823a 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -1855,7 +1855,7 @@ static void binder_deferred_fd_close(int fd)
if (!twcb)
return;
init_task_work(&twcb->twork, binder_do_fd_close);
- close_fd_get_file(fd, &twcb->file);
+ twcb->file = close_fd_get_file(fd);
if (twcb->file) {
filp_close(twcb->file, current->files);
task_work_add(current, &twcb->twork, TWA_RESUME);
diff --git a/fs/file.c b/fs/file.c
index ee9317346702..9780888fa2da 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -630,32 +630,21 @@ EXPORT_SYMBOL(fd_install);
* @files: file struct to retrieve file from
* @fd: file descriptor to retrieve file for
*
- * If this functions returns an EINVAL error pointer the fd was beyond the
- * current maximum number of file descriptors for that fdtable.
- *
- * Returns: The file associated with @fd, on error returns an error pointer.
+ * Returns: The file associated with @fd (NULL if @fd is not open)
*/
static struct file *pick_file(struct files_struct *files, unsigned fd)
{
+ struct fdtable *fdt = files_fdtable(files);
struct file *file;
- struct fdtable *fdt;
- spin_lock(&files->file_lock);
- fdt = files_fdtable(files);
- if (fd >= fdt->max_fds) {
- file = ERR_PTR(-EINVAL);
- goto out_unlock;
- }
+ if (fd >= fdt->max_fds)
+ return NULL;
+
file = fdt->fd[fd];
- if (!file) {
- file = ERR_PTR(-EBADF);
- goto out_unlock;
+ if (file) {
+ rcu_assign_pointer(fdt->fd[fd], NULL);
+ __put_unused_fd(files, fd);
}
- rcu_assign_pointer(fdt->fd[fd], NULL);
- __put_unused_fd(files, fd);
-
-out_unlock:
- spin_unlock(&files->file_lock);
return file;
}
@@ -664,8 +653,10 @@ int close_fd(unsigned fd)
struct files_struct *files = current->files;
struct file *file;
+ spin_lock(&files->file_lock);
file = pick_file(files, fd);
- if (IS_ERR(file))
+ spin_unlock(&files->file_lock);
+ if (!file)
return -EBADF;
return filp_close(file, files);
@@ -702,20 +693,25 @@ static inline void __range_cloexec(struct files_struct *cur_fds,
static inline void __range_close(struct files_struct *cur_fds, unsigned int fd,
unsigned int max_fd)
{
+ unsigned n;
+
+ rcu_read_lock();
+ n = last_fd(files_fdtable(cur_fds));
+ rcu_read_unlock();
+ max_fd = min(max_fd, n);
+
while (fd <= max_fd) {
struct file *file;
+ spin_lock(&cur_fds->file_lock);
file = pick_file(cur_fds, fd++);
- if (!IS_ERR(file)) {
+ spin_unlock(&cur_fds->file_lock);
+
+ if (file) {
/* found a valid file to close */
filp_close(file, cur_fds);
cond_resched();
- continue;
}
-
- /* beyond the last fd in that table */
- if (PTR_ERR(file) == -EINVAL)
- return;
}
}
@@ -795,26 +791,9 @@ int __close_range(unsigned fd, unsigned max_fd, unsigned int flags)
* See close_fd_get_file() below, this variant assumes current->files->file_lock
* is held.
*/
-int __close_fd_get_file(unsigned int fd, struct file **res)
+struct file *__close_fd_get_file(unsigned int fd)
{
- struct files_struct *files = current->files;
- struct file *file;
- struct fdtable *fdt;
-
- fdt = files_fdtable(files);
- if (fd >= fdt->max_fds)
- goto out_err;
- file = fdt->fd[fd];
- if (!file)
- goto out_err;
- rcu_assign_pointer(fdt->fd[fd], NULL);
- __put_unused_fd(files, fd);
- get_file(file);
- *res = file;
- return 0;
-out_err:
- *res = NULL;
- return -ENOENT;
+ return pick_file(current->files, fd);
}
/*
@@ -822,16 +801,16 @@ int __close_fd_get_file(unsigned int fd, struct file **res)
* The caller must ensure that filp_close() called on the file, and then
* an fput().
*/
-int close_fd_get_file(unsigned int fd, struct file **res)
+struct file *close_fd_get_file(unsigned int fd)
{
struct files_struct *files = current->files;
- int ret;
+ struct file *file;
spin_lock(&files->file_lock);
- ret = __close_fd_get_file(fd, res);
+ file = pick_file(files, fd);
spin_unlock(&files->file_lock);
- return ret;
+ return file;
}
void do_close_on_exec(struct files_struct *files)
diff --git a/fs/internal.h b/fs/internal.h
index 08503dc68d2b..4065e2679103 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -125,7 +125,7 @@ extern struct file *do_file_open_root(const struct path *,
const char *, const struct open_flags *);
extern struct open_how build_open_how(int flags, umode_t mode);
extern int build_open_flags(const struct open_how *how, struct open_flags *op);
-extern int __close_fd_get_file(unsigned int fd, struct file **res);
+extern struct file *__close_fd_get_file(unsigned int fd);
long do_sys_ftruncate(unsigned int fd, loff_t length, int small);
int chmod_common(const struct path *path, umode_t mode);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index dc580a30723d..7257b0870353 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5137,13 +5137,10 @@ static int io_close(struct io_kiocb *req, unsigned int issue_flags)
return -EAGAIN;
}
- ret = __close_fd_get_file(close->fd, &file);
+ file = __close_fd_get_file(close->fd);
spin_unlock(&files->file_lock);
- if (ret < 0) {
- if (ret == -ENOENT)
- ret = -EBADF;
+ if (!file)
goto err;
- }
/* No ->flush() or already async, safely close from here */
ret = filp_close(file, current->files);
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index d0e78174874a..e066816f3519 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -125,7 +125,7 @@ int iterate_fd(struct files_struct *, unsigned,
extern int close_fd(unsigned int fd);
extern int __close_range(unsigned int fd, unsigned int max_fd, unsigned int flags);
-extern int close_fd_get_file(unsigned int fd, struct file **res);
+extern struct file *close_fd_get_file(unsigned int fd);
extern int unshare_fd(unsigned long unshare_flags, unsigned int max_fds,
struct files_struct **new_fdp);
* [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff
2022-05-12 21:20 [RFC] unify the file-closing stuff in fs/file.c Al Viro
@ 2022-05-12 23:26 ` Al Viro
2022-05-12 23:48 ` Jens Axboe
2022-05-13 10:15 ` Christian Brauner
2022-05-12 23:48 ` [RFC] unify the file-closing stuff in fs/file.c Jens Axboe
2022-05-13 10:52 ` Christian Brauner
2 siblings, 2 replies; 10+ messages in thread
From: Al Viro @ 2022-05-12 23:26 UTC (permalink / raw)
To: linux-fsdevel; +Cc: Jens Axboe, Pavel Begunkov
Hadn't been used since 62906e89e63b "io_uring: remove file batch-get
optimisation", should've been killed back then...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
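
As a rough userspace model of what remains once the batched variants are gone, the sketch below uses C11 atomics and hypothetical toy_* names. It is not the kernel implementation. It only mirrors the two single-reference operations kept by the patch: take one reference unless the count is already zero, and drop one reference while reporting whether it was the last.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Toy stand-in for struct file; only the reference count is modelled. */
struct toy_file {
	atomic_long f_count;
};

/*
 * Models get_file_rcu(): add 1 to the count unless it is already 0.
 * Returns true if a reference was taken.
 */
static bool toy_get_file_rcu(struct toy_file *f)
{
	long old = atomic_load(&f->f_count);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Models the decrement-and-test step of fput(): drop one reference and
 * report whether it was the last one. The real fput() then schedules
 * the final cleanup; the toy just returns the flag.
 */
static bool toy_fput(struct toy_file *f)
{
	long prev = atomic_fetch_sub(&f->f_count, 1);

	assert(prev > 0);
	return prev == 1;
}
```

Once the count has reached zero, toy_get_file_rcu() refuses to revive the object, which is the property the retry loop in __fget_files_rcu() depends on.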
diff --git a/fs/file.c b/fs/file.c
index 9780888fa2da..9fbc0c653930 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -850,7 +850,7 @@ void do_close_on_exec(struct files_struct *files)
}
static inline struct file *__fget_files_rcu(struct files_struct *files,
- unsigned int fd, fmode_t mask, unsigned int refs)
+ unsigned int fd, fmode_t mask)
{
for (;;) {
struct file *file;
@@ -876,10 +876,10 @@ static inline struct file *__fget_files_rcu(struct files_struct *files,
* Such a race can take two forms:
*
* (a) the file ref already went down to zero,
- * and get_file_rcu_many() fails. Just try
+ * and get_file_rcu() fails. Just try
* again:
*/
- if (unlikely(!get_file_rcu_many(file, refs)))
+ if (unlikely(!get_file_rcu(file)))
continue;
/*
@@ -888,11 +888,11 @@ static inline struct file *__fget_files_rcu(struct files_struct *files,
* pointer having changed, because it always goes
* hand-in-hand with 'fdt'.
*
- * If so, we need to put our refs and try again.
+ * If so, we need to put our ref and try again.
*/
if (unlikely(rcu_dereference_raw(files->fdt) != fdt) ||
unlikely(rcu_dereference_raw(*fdentry) != file)) {
- fput_many(file, refs);
+ fput(file);
continue;
}
@@ -905,37 +905,31 @@ static inline struct file *__fget_files_rcu(struct files_struct *files,
}
static struct file *__fget_files(struct files_struct *files, unsigned int fd,
- fmode_t mask, unsigned int refs)
+ fmode_t mask)
{
struct file *file;
rcu_read_lock();
- file = __fget_files_rcu(files, fd, mask, refs);
+ file = __fget_files_rcu(files, fd, mask);
rcu_read_unlock();
return file;
}
-static inline struct file *__fget(unsigned int fd, fmode_t mask,
- unsigned int refs)
-{
- return __fget_files(current->files, fd, mask, refs);
-}
-
-struct file *fget_many(unsigned int fd, unsigned int refs)
+static inline struct file *__fget(unsigned int fd, fmode_t mask)
{
- return __fget(fd, FMODE_PATH, refs);
+ return __fget_files(current->files, fd, mask);
}
struct file *fget(unsigned int fd)
{
- return __fget(fd, FMODE_PATH, 1);
+ return __fget(fd, FMODE_PATH);
}
EXPORT_SYMBOL(fget);
struct file *fget_raw(unsigned int fd)
{
- return __fget(fd, 0, 1);
+ return __fget(fd, 0);
}
EXPORT_SYMBOL(fget_raw);
@@ -945,7 +939,7 @@ struct file *fget_task(struct task_struct *task, unsigned int fd)
task_lock(task);
if (task->files)
- file = __fget_files(task->files, fd, 0, 1);
+ file = __fget_files(task->files, fd, 0);
task_unlock(task);
return file;
@@ -1014,7 +1008,7 @@ static unsigned long __fget_light(unsigned int fd, fmode_t mask)
return 0;
return (unsigned long)file;
} else {
- file = __fget(fd, mask, 1);
+ file = __fget(fd, mask);
if (!file)
return 0;
return FDPUT_FPUT | (unsigned long)file;
diff --git a/fs/file_table.c b/fs/file_table.c
index 7d2e692b66a9..1ffd74bbbed6 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -368,9 +368,9 @@ EXPORT_SYMBOL_GPL(flush_delayed_fput);
static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
-void fput_many(struct file *file, unsigned int refs)
+void fput(struct file *file)
{
- if (atomic_long_sub_and_test(refs, &file->f_count)) {
+ if (atomic_long_dec_and_test(&file->f_count)) {
struct task_struct *task = current;
if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
@@ -389,11 +389,6 @@ void fput_many(struct file *file, unsigned int refs)
}
}
-void fput(struct file *file)
-{
- fput_many(file, 1);
-}
-
/*
* synchronous analog of fput(); for kernel threads that might be needed
* in some umount() (and thus can't use flush_delayed_fput() without
diff --git a/include/linux/file.h b/include/linux/file.h
index 51e830b4fe3a..39704eae83e2 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -14,7 +14,6 @@
struct file;
extern void fput(struct file *);
-extern void fput_many(struct file *, unsigned int);
struct file_operations;
struct task_struct;
@@ -47,7 +46,6 @@ static inline void fdput(struct fd fd)
}
extern struct file *fget(unsigned int fd);
-extern struct file *fget_many(unsigned int fd, unsigned int refs);
extern struct file *fget_raw(unsigned int fd);
extern struct file *fget_task(struct task_struct *task, unsigned int fd);
extern unsigned long __fdget(unsigned int fd);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index bbde95387a23..0521f0b1356b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -981,9 +981,8 @@ static inline struct file *get_file(struct file *f)
atomic_long_inc(&f->f_count);
return f;
}
-#define get_file_rcu_many(x, cnt) \
- atomic_long_add_unless(&(x)->f_count, (cnt), 0)
-#define get_file_rcu(x) get_file_rcu_many((x), 1)
+#define get_file_rcu(x) \
+ atomic_long_add_unless(&(x)->f_count, 1, 0)
#define file_count(x) atomic_long_read(&(x)->f_count)
#define MAX_NON_LFS ((1UL<<31) - 1)
* Re: [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff
2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
@ 2022-05-12 23:48 ` Jens Axboe
2022-05-13 0:48 ` Al Viro
2022-05-13 10:15 ` Christian Brauner
1 sibling, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2022-05-12 23:48 UTC (permalink / raw)
To: Al Viro, linux-fsdevel; +Cc: Pavel Begunkov
On 5/12/22 5:26 PM, Al Viro wrote:
> Hadn't been used since 62906e89e63b "io_uring: remove file batch-get
> optimisation", should've been killed back then...
I'm pretty sure this has been sent out before, forget from whom. So it's
not like it hasn't been suggested or posted... In case it matters:
Acked-by: Jens Axboe <axboe@kernel.dk>
--
Jens Axboe
* Re: [RFC] unify the file-closing stuff in fs/file.c
2022-05-12 21:20 [RFC] unify the file-closing stuff in fs/file.c Al Viro
2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
@ 2022-05-12 23:48 ` Jens Axboe
2022-05-13 0:46 ` Todd Kjos
2022-05-13 10:52 ` Christian Brauner
2 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2022-05-12 23:48 UTC (permalink / raw)
To: Al Viro, linux-fsdevel; +Cc: Christian Brauner, Todd Kjos, Giuseppe Scrivano
On 5/12/22 3:20 PM, Al Viro wrote:
> Right now we have two places that do such removals - pick_file()
> and {__,}close_fd_get_file().
>
> They are almost identical - the only difference is in calling
> conventions (well, and the fact that __... is called with descriptor
> table locked).
>
> Calling conventions are... interesting.
>
> 1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL).
> The latter is for "descriptor is greater than size of descriptor table".
> One of the callers treats all ERR_PTR(...) as "return -EBADF"; another
> uses ERR_PTR(-EINVAL) as "end the loop now" indicator.
>
> 2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL)
> passed to caller by way of struct file ** argument. One of the callers
> (binder) ignores the return value completely and checks if the file is NULL.
> Another (io_uring) checks for return value being negative, then maps
> -ENOENT to -EBADF, not that any other value would be possible.
>
> ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible;
> {__,}close_fd_get_file() conventions are insane. The older caller
> (in binder) had never even looked at return value; the newer one
> patches the bogus -ENOENT to what it wants to report, with strange
> "defensive" BS logics just in case __close_fd_get_file() would somehow
> find a different error to report.
>
> At the very least, {__,}close_fd_get_file() callers would've been happier
> if it just returned file or NULL. What's more, I'm seriously tempted
> to make pick_file() do the same thing. close_fd() won't care (checking
> for NULL is just as easy as for IS_ERR) and __range_close() could just
> as well cap the max_fd argument with last_fd(files_fdtable(current->files)).
>
> Does anybody see problems with the following?
Looks good to me, and much better than passing in the pointer to the
file pointer imho.
--
Jens Axboe
* Re: [RFC] unify the file-closing stuff in fs/file.c
2022-05-12 23:48 ` [RFC] unify the file-closing stuff in fs/file.c Jens Axboe
@ 2022-05-13 0:46 ` Todd Kjos
0 siblings, 0 replies; 10+ messages in thread
From: Todd Kjos @ 2022-05-13 0:46 UTC (permalink / raw)
To: Jens Axboe; +Cc: Al Viro, linux-fsdevel, Christian Brauner, Giuseppe Scrivano
On Thu, May 12, 2022 at 4:48 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 5/12/22 3:20 PM, Al Viro wrote:
> > Right now we have two places that do such removals - pick_file()
> > and {__,}close_fd_get_file().
> >
> > They are almost identical - the only difference is in calling
> > conventions (well, and the fact that __... is called with descriptor
> > table locked).
> >
> > Calling conventions are... interesting.
> >
> > 1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL).
> > The latter is for "descriptor is greater than size of descriptor table".
> > One of the callers treats all ERR_PTR(...) as "return -EBADF"; another
> > uses ERR_PTR(-EINVAL) as "end the loop now" indicator.
> >
> > 2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL)
> > passed to caller by way of struct file ** argument. One of the callers
> > (binder) ignores the return value completely and checks if the file is NULL.
> > Another (io_uring) checks for return value being negative, then maps
> > -ENOENT to -EBADF, not that any other value would be possible.
> >
> > ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible;
> > {__,}close_fd_get_file() conventions are insane. The older caller
> > (in binder) had never even looked at return value; the newer one
> > patches the bogus -ENOENT to what it wants to report, with strange
> > "defensive" BS logics just in case __close_fd_get_file() would somehow
> > find a different error to report.
> >
> > At the very least, {__,}close_fd_get_file() callers would've been happier
> > if it just returned file or NULL. What's more, I'm seriously tempted
> > to make pick_file() do the same thing. close_fd() won't care (checking
> > for NULL is just as easy as for IS_ERR) and __range_close() could just
> > as well cap the max_fd argument with last_fd(files_fdtable(current->files)).
> >
> > Does anybody see problems with the following?
>
> Looks good to me, and much better than passing in the pointer to the
> file pointer imho.
I agree. This looks good.
>
> --
> Jens Axboe
>
* Re: [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff
2022-05-12 23:48 ` Jens Axboe
@ 2022-05-13 0:48 ` Al Viro
0 siblings, 0 replies; 10+ messages in thread
From: Al Viro @ 2022-05-13 0:48 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-fsdevel, Pavel Begunkov
On Thu, May 12, 2022 at 05:48:08PM -0600, Jens Axboe wrote:
> On 5/12/22 5:26 PM, Al Viro wrote:
> > Hadn't been used since 62906e89e63b "io_uring: remove file batch-get
> > optimisation", should've been killed back then...
>
> I'm pretty sure this has been sent out before, forget from whom.
Right you are... From Gou Hao, had fallen through the cracks back in
November ;-/ Rebased and replaced my variant with it now...
* Re: [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff
2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
2022-05-12 23:48 ` Jens Axboe
@ 2022-05-13 10:15 ` Christian Brauner
1 sibling, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2022-05-13 10:15 UTC (permalink / raw)
To: Al Viro; +Cc: linux-fsdevel, Jens Axboe, Pavel Begunkov
On Thu, May 12, 2022 at 11:26:39PM +0000, Al Viro wrote:
> Hadn't been used since 62906e89e63b "io_uring: remove file batch-get
> optimisation", should've been killed back then...
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---
looks good,
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* Re: [RFC] unify the file-closing stuff in fs/file.c
2022-05-12 21:20 [RFC] unify the file-closing stuff in fs/file.c Al Viro
2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
2022-05-12 23:48 ` [RFC] unify the file-closing stuff in fs/file.c Jens Axboe
@ 2022-05-13 10:52 ` Christian Brauner
2022-05-14 23:33 ` Al Viro
2 siblings, 1 reply; 10+ messages in thread
From: Christian Brauner @ 2022-05-13 10:52 UTC (permalink / raw)
To: Al Viro
Cc: linux-fsdevel, Christian Brauner, Jens Axboe, Todd Kjos,
Giuseppe Scrivano
On Thu, May 12, 2022 at 09:20:51PM +0000, Al Viro wrote:
> Right now we have two places that do such removals - pick_file()
> and {__,}close_fd_get_file().
>
> They are almost identical - the only difference is in calling
> conventions (well, and the fact that __... is called with descriptor
> table locked).
>
> Calling conventions are... interesting.
>
> 1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL).
> The latter is for "descriptor is greater than size of descriptor table".
> One of the callers treats all ERR_PTR(...) as "return -EBADF"; another
> uses ERR_PTR(-EINVAL) as "end the loop now" indicator.
>
> 2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL)
> passed to caller by way of struct file ** argument. One of the callers
> (binder) ignores the return value completely and checks if the file is NULL.
> Another (io_uring) checks for return value being negative, then maps
> -ENOENT to -EBADF, not that any other value would be possible.
>
> ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible;
> {__,}close_fd_get_file() conventions are insane. The older caller
> (in binder) had never even looked at return value; the newer one
> patches the bogus -ENOENT to what it wants to report, with strange
> "defensive" BS logics just in case __close_fd_get_file() would somehow
> find a different error to report.
>
> At the very least, {__,}close_fd_get_file() callers would've been happier
> if it just returned file or NULL. What's more, I'm seriously tempted
> to make pick_file() do the same thing. close_fd() won't care (checking
> for NULL is just as easy as for IS_ERR) and __range_close() could just
> as well cap the max_fd argument with last_fd(files_fdtable(current->files)).
Originally, __close_range() did that last_fd() thing for both the
cloexec and the non-cloexec case. But that proved buggy for the cloexec
part (dumb oversight). So the cloexec part retrieves the last_fd() and
marks cloexec under the spinlock to make sure it's stable.
We could've done the same that this patch does now and retrieved
last_fd() in __range_close() too. But without having looked at
{__,}close_fd_get_file() it was more obvious imho to let pick_file()
tell the caller when to terminate the loop as it avoids fiddling
last_fd() out from under the rcu lock given that below we had to take
the spinlock anyway.
With the motivation of reducing __close_fd_get_file() to pick_file() the
other way is more sensible.
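
To make the two approaches concrete, here is a small userspace sketch of the "cap max_fd up front" variant. The toy_* names are hypothetical and there is no locking or RCU. toy_last_fd() assumes the highest valid descriptor is max_fds - 1. With the cap in place the loop bound alone ends the walk, so the per-slot helper no longer has to signal "past the end of the table".

```c
#include <assert.h>
#include <stddef.h>

/* Toy descriptor table: open[i] != 0 means descriptor i is open. */
struct toy_fdtable {
	unsigned int max_fds;
	int *open;
};

/* Highest descriptor the table can hold (assumed to be max_fds - 1). */
static unsigned int toy_last_fd(const struct toy_fdtable *fdt)
{
	return fdt->max_fds - 1;
}

/*
 * Close every open descriptor in [fd, max_fd] and return how many were
 * closed. max_fd is capped once, before the loop, so a request such as
 * "everything up to ~0U" terminates on the loop condition.
 */
static unsigned int toy_range_close(struct toy_fdtable *fdt,
				    unsigned int fd, unsigned int max_fd)
{
	unsigned int closed = 0;
	unsigned int n;

	assert(fdt != NULL && fdt->max_fds > 0);
	n = toy_last_fd(fdt);
	if (max_fd > n)
		max_fd = n;

	while (fd <= max_fd) {
		if (fdt->open[fd]) {
			fdt->open[fd] = 0;	/* stands in for pick_file() + filp_close() */
			closed++;
		}
		fd++;
	}
	return closed;
}
```

Because the capped max_fd is always below UINT_MAX, the fd++ at the end of the loop cannot wrap around.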
>
> Does anybody see problems with the following?
>
> commit 8819510a641800a63ab10d6b5ab283cada1cbd50
> Author: Al Viro <viro@zeniv.linux.org.uk>
> Date: Thu May 12 17:08:03 2022 -0400
>
> Unify the primitives for file descriptor closing
>
> Currently we have 3 primitives for removing an opened file from descriptor
> table - pick_file(), __close_fd_get_file() and close_fd_get_file(). Their
> calling conventions are rather odd and there's a code duplication for no
> good reason. They can be unified -
>
> 1) have __range_close() cap max_fd in the very beginning; that way
> we don't need separate way for pick_file() to report being past the end
> of descriptor table.
>
> 2) make {__,}close_fd_get_file() return file (or NULL) directly, rather
> than returning it via struct file ** argument. Don't bother with
> (bogus) return value - nobody wants that -ENOENT.
>
> 3) make pick_file() return NULL on unopened descriptor - the only caller
> that used to care about the distinction between descriptor past the end
> of descriptor table and finding NULL in descriptor table doesn't give
> a damn after (1).
>
> 4) lift ->file_lock out of pick_file()
>
> That actually simplifies the callers, as well as the primitives themselves.
> Code duplication is also gone...
>
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
The change in io_close() looked a bit subtle because ret was overridden
in there by __close_fd_get_file() prior to changing it to return struct
file but ret is set to -EBADF at the top of io_close() so it looks fine.
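
The pattern described here can be sketched in userspace as follows. The toy_* names are hypothetical and this is not the io_uring code. The error value is preset once, so the NULL check can jump straight to the error label without assigning ret again.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for struct file. */
struct toy_file {
	int flushed;
};

/* Stand-in for filp_close(); always succeeds in this toy. */
static int toy_filp_close(struct toy_file *file)
{
	assert(file != NULL);
	file->flushed = 1;
	return 0;
}

/* @file models the result of __close_fd_get_file(): a file or NULL. */
static int toy_io_close(struct toy_file *file)
{
	int ret = -EBADF;	/* preset once, as at the top of io_close() */

	if (!file)
		goto err;	/* ret is still -EBADF here */

	ret = toy_filp_close(file);
err:
	return ret;
}
```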
Since you change pick_file() to require the caller to hold the lock it'd
be good to add a:
Context: Caller must hold files_lock.
to the kernel-doc I added; similar to what I did for last_fd().
Also, there's a bunch of regression tests I added in:
tools/testing/selftests/core/close_range_test.c
including various tests for issues reported by syzbot. Might be worth
running to verify we didn't regress anything.
Thanks!
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>
* Re: [RFC] unify the file-closing stuff in fs/file.c
2022-05-13 10:52 ` Christian Brauner
@ 2022-05-14 23:33 ` Al Viro
2022-05-16 8:04 ` Christian Brauner
0 siblings, 1 reply; 10+ messages in thread
From: Al Viro @ 2022-05-14 23:33 UTC (permalink / raw)
To: Christian Brauner
Cc: linux-fsdevel, Christian Brauner, Jens Axboe, Todd Kjos,
Giuseppe Scrivano
On Fri, May 13, 2022 at 12:52:18PM +0200, Christian Brauner wrote:
> Context: Caller must hold files_lock.
Done and force-pushed to #work.fd
> Also, there's a bunch of regression tests I added in:
>
> tools/testing/selftests/core/close_range_test.c
>
> including various tests for issues reported by syzbot. Might be worth
> running to verify we didn't regress anything.
# PASSED: 7 / 7 tests passed.
# Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0
* Re: [RFC] unify the file-closing stuff in fs/file.c
2022-05-14 23:33 ` Al Viro
@ 2022-05-16 8:04 ` Christian Brauner
0 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2022-05-16 8:04 UTC (permalink / raw)
To: Al Viro
Cc: linux-fsdevel, Christian Brauner, Jens Axboe, Todd Kjos,
Giuseppe Scrivano
On Sat, May 14, 2022 at 11:33:01PM +0000, Al Viro wrote:
> On Fri, May 13, 2022 at 12:52:18PM +0200, Christian Brauner wrote:
>
> > Context: Caller must hold files_lock.
>
> Done and force-pushed to #work.fd
>
> > Also, there's a bunch of regression tests I added in:
> >
> > tools/testing/selftests/core/close_range_test.c
> >
> > including various tests for issues reported by syzbot. Might be worth
> > running to verify we didn't regress anything.
>
> # PASSED: 7 / 7 tests passed.
> # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0
Thank you, appreciate it!
Christian