* [RFC] unify the file-closing stuff in fs/file.c
@ 2022-05-12 21:20 Al Viro
  2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Al Viro @ 2022-05-12 21:20 UTC
  To: linux-fsdevel; +Cc: Christian Brauner, Jens Axboe, Todd Kjos, Giuseppe Scrivano

	Right now we have two places that do such removals - pick_file()
and {__,}close_fd_get_file().

	They are almost identical - the only difference is in calling
conventions (well, and the fact that __... is called with descriptor
table locked).

	Calling conventions are... interesting.

1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL).
The latter is for "descriptor is greater than size of descriptor table".
One of the callers treats all ERR_PTR(...) as "return -EBADF"; another
uses ERR_PTR(-EINVAL) as "end the loop now" indicator.

2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL)
passed to caller by way of struct file ** argument.  One of the callers
(binder) ignores the return value completely and checks if the file is NULL.
Another (io_uring) checks for return value being negative, then maps
-ENOENT to -EBADF, not that any other value would be possible.
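
For illustration only, the two old conventions can be modelled in userspace
C roughly as below.  This is a sketch, not the kernel code: toy_table,
TOY_MAX_FDS, old_pick() and old_get() are made-up names, the ERR_PTR helpers
are simplified stand-ins for the kernel ones, and locking and refcounting
are left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)err; }
static inline long PTR_ERR(const void *p) { return (long)p; }
static inline int IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

struct file { int id; };		/* toy placeholder, not the kernel struct */

#define TOY_MAX_FDS 4			/* hypothetical fixed table size */
struct toy_table { struct file *fd[TOY_MAX_FDS]; };

/* Old pick_file() style: one pointer carries file, -EBADF or -EINVAL. */
static struct file *old_pick(struct toy_table *t, unsigned fd)
{
	struct file *file;

	if (fd >= TOY_MAX_FDS)
		return ERR_PTR(-EINVAL);	/* past the end of the table */
	file = t->fd[fd];
	if (!file)
		return ERR_PTR(-EBADF);		/* slot is empty */
	t->fd[fd] = NULL;
	return file;
}

/* Old close_fd_get_file() style: status and out-parameter say the same thing twice. */
static int old_get(struct toy_table *t, unsigned fd, struct file **res)
{
	if (fd >= TOY_MAX_FDS || !t->fd[fd]) {
		*res = NULL;
		return -ENOENT;
	}
	*res = t->fd[fd];
	t->fd[fd] = NULL;
	return 0;
}
```

In this model a caller of old_get() learns nothing from the return value
that *res does not already tell it, which is the redundancy complained
about above.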

ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible;
{__,}close_fd_get_file() conventions are insane.  The older caller
(in binder) had never even looked at return value; the newer one
patches the bogus -ENOENT to what it wants to report, with strange
"defensive" BS logics just in case __close_fd_get_file() would somehow
find a different error to report.

At the very least, {__,}close_fd_get_file() callers would've been happier
if it just returned file or NULL.  What's more, I'm seriously tempted
to make pick_file() do the same thing.  close_fd() won't care (checking
for NULL is just as easy as for IS_ERR) and __range_close() could just
as well cap the max_fd argument with last_fd(files_fdtable(current->files)).
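
A rough userspace sketch of that idea, under the assumption of a fixed-size
toy table (toy_table, TOY_MAX_FDS, toy_pick() and toy_range_close() are all
hypothetical names; no locking, no RCU, no filp_close()): the picker returns
the file or NULL, and the range loop clamps its upper bound once instead of
relying on a sentinel error to stop.

```c
#include <assert.h>
#include <stddef.h>

struct file { int id; };		/* toy placeholder, not the kernel struct */

#define TOY_MAX_FDS 8			/* hypothetical fixed table size */
struct toy_table { struct file *fd[TOY_MAX_FDS]; };

/* Returns the file that was in the slot, or NULL if out of range or empty. */
static struct file *toy_pick(struct toy_table *t, unsigned fd)
{
	struct file *file;

	if (fd >= TOY_MAX_FDS)
		return NULL;
	file = t->fd[fd];
	if (file)
		t->fd[fd] = NULL;
	return file;
}

/* Empties slots fd..max_fd inclusive; returns how many were occupied. */
static unsigned toy_range_close(struct toy_table *t, unsigned fd,
				unsigned max_fd)
{
	unsigned closed = 0;

	/* Cap once up front, so the loop needs no "past the end" signal. */
	if (max_fd > TOY_MAX_FDS - 1)
		max_fd = TOY_MAX_FDS - 1;

	while (fd <= max_fd) {
		if (toy_pick(t, fd++))
			closed++;
	}
	return closed;
}
```

With the cap in place, a call such as toy_range_close(&t, 2, ~0U) walks only
the slots that exist in the toy table.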

Does anybody see problems with the following?

commit 8819510a641800a63ab10d6b5ab283cada1cbd50
Author: Al Viro <viro@zeniv.linux.org.uk>
Date:   Thu May 12 17:08:03 2022 -0400

    Unify the primitives for file descriptor closing
    
    Currently we have 3 primitives for removing an opened file from descriptor
    table - pick_file(), __close_fd_get_file() and close_fd_get_file().  Their
    calling conventions are rather odd and there is code duplication for no
    good reason.  They can be unified -
    
    1) have __range_close() cap max_fd in the very beginning; that way
    we don't need a separate way for pick_file() to report being past the end
    of descriptor table.
    
    2) make {__,}close_fd_get_file() return file (or NULL) directly, rather
    than returning it via struct file ** argument.  Don't bother with
    (bogus) return value - nobody wants that -ENOENT.
    
    3) make pick_file() return NULL on unopened descriptor - the only caller
    that used to care about the distinction between descriptor past the end
    of descriptor table and finding NULL in descriptor table doesn't give
    a damn after (1).
    
    4) lift ->file_lock out of pick_file()
    
    That actually simplifies the callers, as well as the primitives themselves.
    Code duplication is also gone...
    
    Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

diff --git a/drivers/android/binder.c b/drivers/android/binder.c
index 8351c5638880..27c9b004823a 100644
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -1855,7 +1855,7 @@ static void binder_deferred_fd_close(int fd)
 	if (!twcb)
 		return;
 	init_task_work(&twcb->twork, binder_do_fd_close);
-	close_fd_get_file(fd, &twcb->file);
+	twcb->file = close_fd_get_file(fd);
 	if (twcb->file) {
 		filp_close(twcb->file, current->files);
 		task_work_add(current, &twcb->twork, TWA_RESUME);
diff --git a/fs/file.c b/fs/file.c
index ee9317346702..9780888fa2da 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -630,32 +630,21 @@ EXPORT_SYMBOL(fd_install);
  * @files: file struct to retrieve file from
  * @fd: file descriptor to retrieve file for
  *
- * If this functions returns an EINVAL error pointer the fd was beyond the
- * current maximum number of file descriptors for that fdtable.
- *
- * Returns: The file associated with @fd, on error returns an error pointer.
+ * Returns: The file associated with @fd (NULL if @fd is not open)
  */
 static struct file *pick_file(struct files_struct *files, unsigned fd)
 {
+	struct fdtable *fdt = files_fdtable(files);
 	struct file *file;
-	struct fdtable *fdt;
 
-	spin_lock(&files->file_lock);
-	fdt = files_fdtable(files);
-	if (fd >= fdt->max_fds) {
-		file = ERR_PTR(-EINVAL);
-		goto out_unlock;
-	}
+	if (fd >= fdt->max_fds)
+		return NULL;
+
 	file = fdt->fd[fd];
-	if (!file) {
-		file = ERR_PTR(-EBADF);
-		goto out_unlock;
+	if (file) {
+		rcu_assign_pointer(fdt->fd[fd], NULL);
+		__put_unused_fd(files, fd);
 	}
-	rcu_assign_pointer(fdt->fd[fd], NULL);
-	__put_unused_fd(files, fd);
-
-out_unlock:
-	spin_unlock(&files->file_lock);
 	return file;
 }
 
@@ -664,8 +653,10 @@ int close_fd(unsigned fd)
 	struct files_struct *files = current->files;
 	struct file *file;
 
+	spin_lock(&files->file_lock);
 	file = pick_file(files, fd);
-	if (IS_ERR(file))
+	spin_unlock(&files->file_lock);
+	if (!file)
 		return -EBADF;
 
 	return filp_close(file, files);
@@ -702,20 +693,25 @@ static inline void __range_cloexec(struct files_struct *cur_fds,
 static inline void __range_close(struct files_struct *cur_fds, unsigned int fd,
 				 unsigned int max_fd)
 {
+	unsigned n;
+
+	rcu_read_lock();
+	n = last_fd(files_fdtable(cur_fds));
+	rcu_read_unlock();
+	max_fd = min(max_fd, n);
+
 	while (fd <= max_fd) {
 		struct file *file;
 
+		spin_lock(&cur_fds->file_lock);
 		file = pick_file(cur_fds, fd++);
-		if (!IS_ERR(file)) {
+		spin_unlock(&cur_fds->file_lock);
+
+		if (file) {
 			/* found a valid file to close */
 			filp_close(file, cur_fds);
 			cond_resched();
-			continue;
 		}
-
-		/* beyond the last fd in that table */
-		if (PTR_ERR(file) == -EINVAL)
-			return;
 	}
 }
 
@@ -795,26 +791,9 @@ int __close_range(unsigned fd, unsigned max_fd, unsigned int flags)
  * See close_fd_get_file() below, this variant assumes current->files->file_lock
  * is held.
  */
-int __close_fd_get_file(unsigned int fd, struct file **res)
+struct file *__close_fd_get_file(unsigned int fd)
 {
-	struct files_struct *files = current->files;
-	struct file *file;
-	struct fdtable *fdt;
-
-	fdt = files_fdtable(files);
-	if (fd >= fdt->max_fds)
-		goto out_err;
-	file = fdt->fd[fd];
-	if (!file)
-		goto out_err;
-	rcu_assign_pointer(fdt->fd[fd], NULL);
-	__put_unused_fd(files, fd);
-	get_file(file);
-	*res = file;
-	return 0;
-out_err:
-	*res = NULL;
-	return -ENOENT;
+	return pick_file(current->files, fd);
 }
 
 /*
@@ -822,16 +801,16 @@ int __close_fd_get_file(unsigned int fd, struct file **res)
  * The caller must ensure that filp_close() called on the file, and then
  * an fput().
  */
-int close_fd_get_file(unsigned int fd, struct file **res)
+struct file *close_fd_get_file(unsigned int fd)
 {
 	struct files_struct *files = current->files;
-	int ret;
+	struct file *file;
 
 	spin_lock(&files->file_lock);
-	ret = __close_fd_get_file(fd, res);
+	file = pick_file(files, fd);
 	spin_unlock(&files->file_lock);
 
-	return ret;
+	return file;
 }
 
 void do_close_on_exec(struct files_struct *files)
diff --git a/fs/internal.h b/fs/internal.h
index 08503dc68d2b..4065e2679103 100644
--- a/fs/internal.h
+++ b/fs/internal.h
@@ -125,7 +125,7 @@ extern struct file *do_file_open_root(const struct path *,
 		const char *, const struct open_flags *);
 extern struct open_how build_open_how(int flags, umode_t mode);
 extern int build_open_flags(const struct open_how *how, struct open_flags *op);
-extern int __close_fd_get_file(unsigned int fd, struct file **res);
+extern struct file *__close_fd_get_file(unsigned int fd);
 
 long do_sys_ftruncate(unsigned int fd, loff_t length, int small);
 int chmod_common(const struct path *path, umode_t mode);
diff --git a/fs/io_uring.c b/fs/io_uring.c
index dc580a30723d..7257b0870353 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5137,13 +5137,10 @@ static int io_close(struct io_kiocb *req, unsigned int issue_flags)
 		return -EAGAIN;
 	}
 
-	ret = __close_fd_get_file(close->fd, &file);
+	file = __close_fd_get_file(close->fd);
 	spin_unlock(&files->file_lock);
-	if (ret < 0) {
-		if (ret == -ENOENT)
-			ret = -EBADF;
+	if (!file)
 		goto err;
-	}
 
 	/* No ->flush() or already async, safely close from here */
 	ret = filp_close(file, current->files);
diff --git a/include/linux/fdtable.h b/include/linux/fdtable.h
index d0e78174874a..e066816f3519 100644
--- a/include/linux/fdtable.h
+++ b/include/linux/fdtable.h
@@ -125,7 +125,7 @@ int iterate_fd(struct files_struct *, unsigned,
 
 extern int close_fd(unsigned int fd);
 extern int __close_range(unsigned int fd, unsigned int max_fd, unsigned int flags);
-extern int close_fd_get_file(unsigned int fd, struct file **res);
+extern struct file *close_fd_get_file(unsigned int fd);
 extern int unshare_fd(unsigned long unshare_flags, unsigned int max_fds,
 		      struct files_struct **new_fdp);
 


* [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff
  2022-05-12 21:20 [RFC] unify the file-closing stuff in fs/file.c Al Viro
@ 2022-05-12 23:26 ` Al Viro
  2022-05-12 23:48   ` Jens Axboe
  2022-05-13 10:15   ` Christian Brauner
  2022-05-12 23:48 ` [RFC] unify the file-closing stuff in fs/file.c Jens Axboe
  2022-05-13 10:52 ` Christian Brauner
  2 siblings, 2 replies; 10+ messages in thread
From: Al Viro @ 2022-05-12 23:26 UTC
  To: linux-fsdevel; +Cc: Jens Axboe, Pavel Begunkov

Hadn't been used since 62906e89e63b "io_uring: remove file batch-get
optimisation", should've been killed back then...

Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
---
diff --git a/fs/file.c b/fs/file.c
index 9780888fa2da..9fbc0c653930 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -850,7 +850,7 @@ void do_close_on_exec(struct files_struct *files)
 }
 
 static inline struct file *__fget_files_rcu(struct files_struct *files,
-	unsigned int fd, fmode_t mask, unsigned int refs)
+	unsigned int fd, fmode_t mask)
 {
 	for (;;) {
 		struct file *file;
@@ -876,10 +876,10 @@ static inline struct file *__fget_files_rcu(struct files_struct *files,
 		 * Such a race can take two forms:
 		 *
 		 *  (a) the file ref already went down to zero,
-		 *      and get_file_rcu_many() fails. Just try
+		 *      and get_file_rcu() fails. Just try
 		 *      again:
 		 */
-		if (unlikely(!get_file_rcu_many(file, refs)))
+		if (unlikely(!get_file_rcu(file)))
 			continue;
 
 		/*
@@ -888,11 +888,11 @@ static inline struct file *__fget_files_rcu(struct files_struct *files,
 		 *       pointer having changed, because it always goes
 		 *       hand-in-hand with 'fdt'.
 		 *
-		 * If so, we need to put our refs and try again.
+		 * If so, we need to put our ref and try again.
 		 */
 		if (unlikely(rcu_dereference_raw(files->fdt) != fdt) ||
 		    unlikely(rcu_dereference_raw(*fdentry) != file)) {
-			fput_many(file, refs);
+			fput(file);
 			continue;
 		}
 
@@ -905,37 +905,31 @@ static inline struct file *__fget_files_rcu(struct files_struct *files,
 }
 
 static struct file *__fget_files(struct files_struct *files, unsigned int fd,
-				 fmode_t mask, unsigned int refs)
+				 fmode_t mask)
 {
 	struct file *file;
 
 	rcu_read_lock();
-	file = __fget_files_rcu(files, fd, mask, refs);
+	file = __fget_files_rcu(files, fd, mask);
 	rcu_read_unlock();
 
 	return file;
 }
 
-static inline struct file *__fget(unsigned int fd, fmode_t mask,
-				  unsigned int refs)
-{
-	return __fget_files(current->files, fd, mask, refs);
-}
-
-struct file *fget_many(unsigned int fd, unsigned int refs)
+static inline struct file *__fget(unsigned int fd, fmode_t mask)
 {
-	return __fget(fd, FMODE_PATH, refs);
+	return __fget_files(current->files, fd, mask);
 }
 
 struct file *fget(unsigned int fd)
 {
-	return __fget(fd, FMODE_PATH, 1);
+	return __fget(fd, FMODE_PATH);
 }
 EXPORT_SYMBOL(fget);
 
 struct file *fget_raw(unsigned int fd)
 {
-	return __fget(fd, 0, 1);
+	return __fget(fd, 0);
 }
 EXPORT_SYMBOL(fget_raw);
 
@@ -945,7 +939,7 @@ struct file *fget_task(struct task_struct *task, unsigned int fd)
 
 	task_lock(task);
 	if (task->files)
-		file = __fget_files(task->files, fd, 0, 1);
+		file = __fget_files(task->files, fd, 0);
 	task_unlock(task);
 
 	return file;
@@ -1014,7 +1008,7 @@ static unsigned long __fget_light(unsigned int fd, fmode_t mask)
 			return 0;
 		return (unsigned long)file;
 	} else {
-		file = __fget(fd, mask, 1);
+		file = __fget(fd, mask);
 		if (!file)
 			return 0;
 		return FDPUT_FPUT | (unsigned long)file;
diff --git a/fs/file_table.c b/fs/file_table.c
index 7d2e692b66a9..1ffd74bbbed6 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -368,9 +368,9 @@ EXPORT_SYMBOL_GPL(flush_delayed_fput);
 
 static DECLARE_DELAYED_WORK(delayed_fput_work, delayed_fput);
 
-void fput_many(struct file *file, unsigned int refs)
+void fput(struct file *file)
 {
-	if (atomic_long_sub_and_test(refs, &file->f_count)) {
+	if (atomic_long_dec_and_test(&file->f_count)) {
 		struct task_struct *task = current;
 
 		if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) {
@@ -389,11 +389,6 @@ void fput_many(struct file *file, unsigned int refs)
 	}
 }
 
-void fput(struct file *file)
-{
-	fput_many(file, 1);
-}
-
 /*
  * synchronous analog of fput(); for kernel threads that might be needed
  * in some umount() (and thus can't use flush_delayed_fput() without
diff --git a/include/linux/file.h b/include/linux/file.h
index 51e830b4fe3a..39704eae83e2 100644
--- a/include/linux/file.h
+++ b/include/linux/file.h
@@ -14,7 +14,6 @@
 struct file;
 
 extern void fput(struct file *);
-extern void fput_many(struct file *, unsigned int);
 
 struct file_operations;
 struct task_struct;
@@ -47,7 +46,6 @@ static inline void fdput(struct fd fd)
 }
 
 extern struct file *fget(unsigned int fd);
-extern struct file *fget_many(unsigned int fd, unsigned int refs);
 extern struct file *fget_raw(unsigned int fd);
 extern struct file *fget_task(struct task_struct *task, unsigned int fd);
 extern unsigned long __fdget(unsigned int fd);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index bbde95387a23..0521f0b1356b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -981,9 +981,8 @@ static inline struct file *get_file(struct file *f)
 	atomic_long_inc(&f->f_count);
 	return f;
 }
-#define get_file_rcu_many(x, cnt)	\
-	atomic_long_add_unless(&(x)->f_count, (cnt), 0)
-#define get_file_rcu(x) get_file_rcu_many((x), 1)
+#define get_file_rcu(x)	\
+	atomic_long_add_unless(&(x)->f_count, 1, 0)
 #define file_count(x)	atomic_long_read(&(x)->f_count)
 
 #define	MAX_NON_LFS	((1UL<<31) - 1)
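
As an aside, the single-reference semantics that remain after this patch can
be approximated in userspace with C11 atomics.  This is only a sketch under
assumptions: toy_file, toy_get_file_rcu() and toy_fput() are made-up names,
the compare-and-swap loop merely mimics what "add 1 unless the count is 0"
means, and none of the kernel's deferred-release machinery is modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct toy_file { atomic_long f_count; };	/* hypothetical stand-in */

/* Take one reference unless the count has already dropped to zero. */
static bool toy_get_file_rcu(struct toy_file *f)
{
	long old = atomic_load(&f->f_count);

	while (old != 0) {
		/* On failure 'old' is reloaded with the current value. */
		if (atomic_compare_exchange_weak(&f->f_count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; returns true if this was the last one. */
static bool toy_fput(struct toy_file *f)
{
	return atomic_fetch_sub(&f->f_count, 1) == 1;
}
```

Once the count has reached zero, toy_get_file_rcu() refuses to resurrect the
object, which is the case the retry loop in __fget_files_rcu() has to handle.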


* Re: [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff
  2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
@ 2022-05-12 23:48   ` Jens Axboe
  2022-05-13  0:48     ` Al Viro
  2022-05-13 10:15   ` Christian Brauner
  1 sibling, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2022-05-12 23:48 UTC
  To: Al Viro, linux-fsdevel; +Cc: Pavel Begunkov

On 5/12/22 5:26 PM, Al Viro wrote:
> Hadn't been used since 62906e89e63b "io_uring: remove file batch-get
> optimisation", should've been killed back then...

I'm pretty sure this has been sent out before, forget from whom. So it's
not like it hasn't been suggested or posted... In case it matters:

Acked-by: Jens Axboe <axboe@kernel.dk>

-- 
Jens Axboe



* Re: [RFC] unify the file-closing stuff in fs/file.c
  2022-05-12 21:20 [RFC] unify the file-closing stuff in fs/file.c Al Viro
  2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
@ 2022-05-12 23:48 ` Jens Axboe
  2022-05-13  0:46   ` Todd Kjos
  2022-05-13 10:52 ` Christian Brauner
  2 siblings, 1 reply; 10+ messages in thread
From: Jens Axboe @ 2022-05-12 23:48 UTC
  To: Al Viro, linux-fsdevel; +Cc: Christian Brauner, Todd Kjos, Giuseppe Scrivano

On 5/12/22 3:20 PM, Al Viro wrote:
> 	Right now we have two places that do such removals - pick_file()
> and {__,}close_fd_get_file().
> 
> 	They are almost identical - the only difference is in calling
> conventions (well, and the fact that __... is called with descriptor
> table locked).
> 
> 	Calling conventions are... interesting.
> 
> 1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL).
> The latter is for "descriptor is greater than size of descriptor table".
> One of the callers treats all ERR_PTR(...) as "return -EBADF"; another
> uses ERR_PTR(-EINVAL) as "end the loop now" indicator.
> 
> 2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL)
> passed to caller by way of struct file ** argument.  One of the callers
> (binder) ignores the return value completely and checks if the file is NULL.
> Another (io_uring) checks for return value being negative, then maps
> -ENOENT to -EBADF, not that any other value would be possible.
> 
> ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible;
> {__,}close_fd_get_file() conventions are insane.  The older caller
> (in binder) had never even looked at return value; the newer one
> patches the bogus -ENOENT to what it wants to report, with strange
> "defensive" BS logics just in case __close_fd_get_file() would somehow
> find a different error to report.
> 
> At the very least, {__,}close_fd_get_file() callers would've been happier
> if it just returned file or NULL.  What's more, I'm seriously tempted
> to make pick_file() do the same thing.  close_fd() won't care (checking
> for NULL is just as easy as for IS_ERR) and __range_close() could just
> as well cap the max_fd argument with last_fd(files_fdtable(current->files)).
> 
> Does anybody see problems with the following?

Looks good to me, and much better than passing in the pointer to the
file pointer imho.

-- 
Jens Axboe



* Re: [RFC] unify the file-closing stuff in fs/file.c
  2022-05-12 23:48 ` [RFC] unify the file-closing stuff in fs/file.c Jens Axboe
@ 2022-05-13  0:46   ` Todd Kjos
  0 siblings, 0 replies; 10+ messages in thread
From: Todd Kjos @ 2022-05-13  0:46 UTC
  To: Jens Axboe; +Cc: Al Viro, linux-fsdevel, Christian Brauner, Giuseppe Scrivano

On Thu, May 12, 2022 at 4:48 PM Jens Axboe <axboe@kernel.dk> wrote:
>
> On 5/12/22 3:20 PM, Al Viro wrote:
> >       Right now we have two places that do such removals - pick_file()
> > and {__,}close_fd_get_file().
> >
> >       They are almost identical - the only difference is in calling
> > conventions (well, and the fact that __... is called with descriptor
> > table locked).
> >
> >       Calling conventions are... interesting.
> >
> > 1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL).
> > The latter is for "descriptor is greater than size of descriptor table".
> > One of the callers treats all ERR_PTR(...) as "return -EBADF"; another
> > uses ERR_PTR(-EINVAL) as "end the loop now" indicator.
> >
> > 2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL)
> > passed to caller by way of struct file ** argument.  One of the callers
> > (binder) ignores the return value completely and checks if the file is NULL.
> > Another (io_uring) checks for return value being negative, then maps
> > -ENOENT to -EBADF, not that any other value would be possible.
> >
> > ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible;
> > {__,}close_fd_get_file() conventions are insane.  The older caller
> > (in binder) had never even looked at return value; the newer one
> > patches the bogus -ENOENT to what it wants to report, with strange
> > "defensive" BS logics just in case __close_fd_get_file() would somehow
> > find a different error to report.
> >
> > At the very least, {__,}close_fd_get_file() callers would've been happier
> > if it just returned file or NULL.  What's more, I'm seriously tempted
> > to make pick_file() do the same thing.  close_fd() won't care (checking
> > for NULL is just as easy as for IS_ERR) and __range_close() could just
> > as well cap the max_fd argument with last_fd(files_fdtable(current->files)).
> >
> > Does anybody see problems with the following?
>
> Looks good to me, and much better than passing in the pointer to the
> file pointer imho.

I agree. This looks good.

>
> --
> Jens Axboe
>


* Re: [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff
  2022-05-12 23:48   ` Jens Axboe
@ 2022-05-13  0:48     ` Al Viro
  0 siblings, 0 replies; 10+ messages in thread
From: Al Viro @ 2022-05-13  0:48 UTC
  To: Jens Axboe; +Cc: linux-fsdevel, Pavel Begunkov

On Thu, May 12, 2022 at 05:48:08PM -0600, Jens Axboe wrote:
> On 5/12/22 5:26 PM, Al Viro wrote:
> > Hadn't been used since 62906e89e63b "io_uring: remove file batch-get
> > optimisation", should've been killed back then...
> 
> I'm pretty sure this has been sent out before, forget from whom.

Right you are...  From Gou Hao, had fallen through the cracks back in
November ;-/  Rebased and replaced my variant with it now...


* Re: [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff
  2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
  2022-05-12 23:48   ` Jens Axboe
@ 2022-05-13 10:15   ` Christian Brauner
  1 sibling, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2022-05-13 10:15 UTC
  To: Al Viro; +Cc: linux-fsdevel, Jens Axboe, Pavel Begunkov

On Thu, May 12, 2022 at 11:26:39PM +0000, Al Viro wrote:
> Hadn't been used since 62906e89e63b "io_uring: remove file batch-get
> optimisation", should've been killed back then...
> 
> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
> ---

looks good,
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>


* Re: [RFC] unify the file-closing stuff in fs/file.c
  2022-05-12 21:20 [RFC] unify the file-closing stuff in fs/file.c Al Viro
  2022-05-12 23:26 ` [RFC][PATCH] get rid of the remnants of 'batched' fget/fput stuff Al Viro
  2022-05-12 23:48 ` [RFC] unify the file-closing stuff in fs/file.c Jens Axboe
@ 2022-05-13 10:52 ` Christian Brauner
  2022-05-14 23:33   ` Al Viro
  2 siblings, 1 reply; 10+ messages in thread
From: Christian Brauner @ 2022-05-13 10:52 UTC
  To: Al Viro
  Cc: linux-fsdevel, Christian Brauner, Jens Axboe, Todd Kjos,
	Giuseppe Scrivano

On Thu, May 12, 2022 at 09:20:51PM +0000, Al Viro wrote:
> 	Right now we have two places that do such removals - pick_file()
> and {__,}close_fd_get_file().
> 
> 	They are almost identical - the only difference is in calling
> conventions (well, and the fact that __... is called with descriptor
> table locked).
> 
> 	Calling conventions are... interesting.
> 
> 1) pick_file() - returns file or ERR_PTR(-EBADF) or ERR_PTR(-EINVAL).
> The latter is for "descriptor is greater than size of descriptor table".
> One of the callers treats all ERR_PTR(...) as "return -EBADF"; another
> uses ERR_PTR(-EINVAL) as "end the loop now" indicator.
> 
> 2) {__,}close_fd_get_file() returns 0 or -ENOENT (huh?), with file (or NULL)
> passed to caller by way of struct file ** argument.  One of the callers
> (binder) ignores the return value completely and checks if the file is NULL.
> Another (io_uring) checks for return value being negative, then maps
> -ENOENT to -EBADF, not that any other value would be possible.
> 
> ERR_PTR(-EINVAL) magic in case of pick_file() is borderline defensible;
> {__,}close_fd_get_file() conventions are insane.  The older caller
> (in binder) had never even looked at return value; the newer one
> patches the bogus -ENOENT to what it wants to report, with strange
> "defensive" BS logics just in case __close_fd_get_file() would somehow
> find a different error to report.
> 
> At the very least, {__,}close_fd_get_file() callers would've been happier
> if it just returned file or NULL.  What's more, I'm seriously tempted
> to make pick_file() do the same thing.  close_fd() won't care (checking
> for NULL is just as easy as for IS_ERR) and __range_close() could just
> as well cap the max_fd argument with last_fd(files_fdtable(current->files)).

Originally, __close_range() did that last_fd() thing for both the
cloexec and the non-cloexec case. But that proved buggy for the cloexec
part (dumb oversight). So the cloexec part retrieves the last_fd() and
marks cloexec under the spinlock to make sure it's stable.

We could've done the same thing this patch does now and retrieved
last_fd() in __range_close() too. But without having looked at
{__,}close_fd_get_file() it was more obvious imho to let pick_file()
tell the caller when to terminate the loop, as it avoids fiddling
last_fd() out from under the rcu lock given that below we had to take
the spinlock anyway.

With the motivation of reducing __close_fd_get_file() to pick_file() the
other way is more sensible.

> 
> Does anybody see problems with the following?
> 
> commit 8819510a641800a63ab10d6b5ab283cada1cbd50
> Author: Al Viro <viro@zeniv.linux.org.uk>
> Date:   Thu May 12 17:08:03 2022 -0400
> 
>     Unify the primitives for file descriptor closing
>     
>     Currently we have 3 primitives for removing an opened file from descriptor
>     table - pick_file(), __close_fd_get_file() and close_fd_get_file().  Their
>     calling conventions are rather odd and there is code duplication for no
>     good reason.  They can be unified -
>     
>     1) have __range_close() cap max_fd in the very beginning; that way
>     we don't need a separate way for pick_file() to report being past the end
>     of descriptor table.
>     
>     2) make {__,}close_fd_get_file() return file (or NULL) directly, rather
>     than returning it via struct file ** argument.  Don't bother with
>     (bogus) return value - nobody wants that -ENOENT.
>     
>     3) make pick_file() return NULL on unopened descriptor - the only caller
>     that used to care about the distinction between descriptor past the end
>     of descriptor table and finding NULL in descriptor table doesn't give
>     a damn after (1).
>     
>     4) lift ->file_lock out of pick_file()
>     
>     That actually simplifies the callers, as well as the primitives themselves.
>     Code duplication is also gone...
>     
>     Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>

The change in io_close() looked a bit subtle because ret was overridden
in there by __close_fd_get_file() prior to changing it to return struct
file, but ret is set to -EBADF at the top of io_close() so it looks fine.
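
Reduced to a userspace toy, the pattern relied on here looks roughly like
the sketch below.  toy_lookup() and toy_close() are hypothetical names; the
only thing taken from the patch is the shape "preset ret, bail out on NULL",
and the 0 assigned on success merely stands in for whatever filp_close()
would return.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct file { int id; };		/* toy placeholder, not the kernel struct */

/* Hypothetical stand-in for a lookup that yields a file or NULL. */
static struct file *toy_lookup(struct file *slot)
{
	return slot;
}

static int toy_close(struct file *slot)
{
	int ret = -EBADF;		/* preset once at the top */
	struct file *file;

	file = toy_lookup(slot);
	if (!file)
		goto err;		/* ret is still -EBADF here */

	ret = 0;			/* stands in for the close result */
err:
	return ret;
}
```

Because the lookup no longer writes to ret, the preset value survives to the
error path without any remapping.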

Since you change pick_file() to require the caller to hold the lock it'd
be good to add a:

	Context: Caller must hold files_lock.

to the kernel-doc I added; similar to what I did for last_fd().

Also, there's a bunch of regression tests I added in:

tools/testing/selftests/core/close_range_test.c

including various tests for issues reported by syzbot. Might be worth
running to verify we didn't regress anything.

Thanks!
Reviewed-by: Christian Brauner (Microsoft) <brauner@kernel.org>


* Re: [RFC] unify the file-closing stuff in fs/file.c
  2022-05-13 10:52 ` Christian Brauner
@ 2022-05-14 23:33   ` Al Viro
  2022-05-16  8:04     ` Christian Brauner
  0 siblings, 1 reply; 10+ messages in thread
From: Al Viro @ 2022-05-14 23:33 UTC
  To: Christian Brauner
  Cc: linux-fsdevel, Christian Brauner, Jens Axboe, Todd Kjos,
	Giuseppe Scrivano

On Fri, May 13, 2022 at 12:52:18PM +0200, Christian Brauner wrote:

> 	Context: Caller must hold files_lock.

Done and force-pushed to #work.fd
 
> Also, there's a bunch of regression tests I added in:
> 
> tools/testing/selftests/core/close_range_test.c
> 
> including various tests for issues reported by syzbot. Might be worth
> running to verify we didn't regress anything.

# PASSED: 7 / 7 tests passed.
# Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0


* Re: [RFC] unify the file-closing stuff in fs/file.c
  2022-05-14 23:33   ` Al Viro
@ 2022-05-16  8:04     ` Christian Brauner
  0 siblings, 0 replies; 10+ messages in thread
From: Christian Brauner @ 2022-05-16  8:04 UTC
  To: Al Viro
  Cc: linux-fsdevel, Christian Brauner, Jens Axboe, Todd Kjos,
	Giuseppe Scrivano

On Sat, May 14, 2022 at 11:33:01PM +0000, Al Viro wrote:
> On Fri, May 13, 2022 at 12:52:18PM +0200, Christian Brauner wrote:
> 
> > 	Context: Caller must hold files_lock.
> 
> Done and force-pushed to #work.fd
>  
> > Also, there's a bunch of regression tests I added in:
> > 
> > tools/testing/selftests/core/close_range_test.c
> > 
> > including various tests for issues reported by syzbot. Might be worth
> > running to verify we didn't regress anything.
> 
> # PASSED: 7 / 7 tests passed.
> # Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0

Thank you, appreciate it!

Christian

