BPF Archive on lore.kernel.org
* [PATCH] files: Use rcu lock to get the file structures for better performance
@ 2020-05-21 12:38 Muchun Song
  2020-05-21 15:21 ` Matthew Wilcox
  2020-05-21 16:47 ` Matthew Wilcox
  0 siblings, 2 replies; 7+ messages in thread
From: Muchun Song @ 2020-05-21 12:38 UTC
  To: adobriyan, ast, daniel, kafai, songliubraving, yhs, andriin,
	john.fastabend, kpsingh
  Cc: ebiederm, bernd.edlinger, linux-kernel, linux-fsdevel, netdev,
	bpf, Muchun Song

There is another safe way to get the file structure without
holding the files->file_lock. That is rcu lock, and this way
has better performance. So use the rcu lock instead of the
files->file_lock.

Signed-off-by: Muchun Song <songmuchun@bytedance.com>
---
 fs/proc/fd.c         | 31 ++++++++++++++++++++++++-------
 kernel/bpf/syscall.c | 17 +++++++++++------
 kernel/kcmp.c        | 15 ++++++++++-----
 3 files changed, 45 insertions(+), 18 deletions(-)

diff --git a/fs/proc/fd.c b/fs/proc/fd.c
index 81882a13212d3..5d5b0f091d32a 100644
--- a/fs/proc/fd.c
+++ b/fs/proc/fd.c
@@ -34,19 +34,27 @@ static int seq_show(struct seq_file *m, void *v)
 	if (files) {
 		unsigned int fd = proc_fd(m->private);
 
-		spin_lock(&files->file_lock);
+		rcu_read_lock();
+again:
 		file = fcheck_files(files, fd);
 		if (file) {
-			struct fdtable *fdt = files_fdtable(files);
+			struct fdtable *fdt;
+
+			if (!get_file_rcu(file)) {
+				/*
+				 * we loop to catch the new file (or NULL
+				 * pointer).
+				 */
+				goto again;
+			}
 
+			fdt = files_fdtable(files);
 			f_flags = file->f_flags;
 			if (close_on_exec(fd, fdt))
 				f_flags |= O_CLOEXEC;
-
-			get_file(file);
 			ret = 0;
 		}
-		spin_unlock(&files->file_lock);
+		rcu_read_unlock();
 		put_files_struct(files);
 	}
 
@@ -160,14 +168,23 @@ static int proc_fd_link(struct dentry *dentry, struct path *path)
 		unsigned int fd = proc_fd(d_inode(dentry));
 		struct file *fd_file;
 
-		spin_lock(&files->file_lock);
+		rcu_read_lock();
+again:
 		fd_file = fcheck_files(files, fd);
 		if (fd_file) {
+			if (!get_file_rcu(fd_file)) {
+				/*
+				 * we loop to catch the new file
+				 * (or NULL pointer).
+				 */
+				goto again;
+			}
 			*path = fd_file->f_path;
 			path_get(&fd_file->f_path);
+			fput(fd_file);
 			ret = 0;
 		}
-		spin_unlock(&files->file_lock);
+		rcu_read_unlock();
 		put_files_struct(files);
 	}
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 8608d6e1b0e0e..441c91378a1fc 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3451,14 +3451,19 @@ static int bpf_task_fd_query(const union bpf_attr *attr,
 	if (!files)
 		return -ENOENT;
 
-	err = 0;
-	spin_lock(&files->file_lock);
+	rcu_read_lock();
+again:
 	file = fcheck_files(files, fd);
-	if (!file)
+	if (file) {
+		if (!get_file_rcu(file)) {
+			/* we loop to catch the new file (or NULL pointer) */
+			goto again;
+		}
+		err = 0;
+	} else {
 		err = -EBADF;
-	else
-		get_file(file);
-	spin_unlock(&files->file_lock);
+	}
+	rcu_read_unlock();
 	put_files_struct(files);
 
 	if (err)
diff --git a/kernel/kcmp.c b/kernel/kcmp.c
index b3ff9288c6cc9..3b4f2a54186f2 100644
--- a/kernel/kcmp.c
+++ b/kernel/kcmp.c
@@ -120,13 +120,18 @@ static int kcmp_epoll_target(struct task_struct *task1,
 	if (!files)
 		return -EBADF;
 
-	spin_lock(&files->file_lock);
+	rcu_read_lock();
+again:
 	filp_epoll = fcheck_files(files, slot.efd);
-	if (filp_epoll)
-		get_file(filp_epoll);
-	else
+	if (filp_epoll) {
+		if (!get_file_rcu(filp_epoll)) {
+			/* we loop to catch the new file (or NULL pointer) */
+			goto again;
+		}
+	} else {
 		filp_tgt = ERR_PTR(-EBADF);
-	spin_unlock(&files->file_lock);
+	}
+	rcu_read_unlock();
 	put_files_struct(files);
 
 	if (filp_epoll) {
-- 
2.11.0
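
For readers unfamiliar with the pattern the patch relies on, here is a minimal userspace sketch in C11 atomics. It is not kernel code. In the kernel, get_file_rcu() takes a reference only if f_count is still non-zero, and the patch retries the lookup when that fails. The names obj, obj_get_unless_zero and lookup_and_get are hypothetical, and the sketch leaves out RCU itself, so it does not model how the kernel keeps the memory valid while a reader inspects it.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct file: only the reference count matters. */
struct obj {
	atomic_long refcount;	/* 0 means the object is being torn down */
};

/* Take a reference only if the count has not already dropped to zero. */
static bool obj_get_unless_zero(struct obj *o)
{
	long old = atomic_load_explicit(&o->refcount, memory_order_relaxed);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcount, &old, old + 1))
			return true;
	}
	return false;
}

/*
 * Hypothetical one-slot "fd table". The slot may be replaced or cleared
 * by another thread, so a failed get re-reads the slot, mirroring the
 * "goto again" in the patch.
 */
static struct obj *lookup_and_get(struct obj *_Atomic *slot)
{
	struct obj *o;

	while ((o = atomic_load(slot)) != NULL) {
		if (obj_get_unless_zero(o))
			return o;
	}
	return NULL;
}
```

As in the patch, the retry loop only terminates because a concurrent closer eventually replaces or clears the slot. An object whose count is zero but which stays published in the slot would make this sketch spin.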



* Re: [PATCH] files: Use rcu lock to get the file structures for better performance
  2020-05-21 12:38 [PATCH] files: Use rcu lock to get the file structures for better performance Muchun Song
@ 2020-05-21 15:21 ` Matthew Wilcox
  2020-05-21 16:06   ` [External] " Muchun Song
  2020-05-21 16:47 ` Matthew Wilcox
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2020-05-21 15:21 UTC
  To: Muchun Song
  Cc: adobriyan, ast, daniel, kafai, songliubraving, yhs, andriin,
	john.fastabend, kpsingh, ebiederm, bernd.edlinger, linux-kernel,
	linux-fsdevel, netdev, bpf

On Thu, May 21, 2020 at 08:38:35PM +0800, Muchun Song wrote:
> There is another safe way to get the file structure without
> holding the files->file_lock. That is rcu lock, and this way
> has better performance. So use the rcu lock instead of the
> files->file_lock.

What makes you think this is safe?  Are you actually seeing contention
on this spinlock?



* Re: [External] Re: [PATCH] files: Use rcu lock to get the file structures for better performance
  2020-05-21 15:21 ` Matthew Wilcox
@ 2020-05-21 16:06   ` Muchun Song
  2020-05-21 16:16     ` Greg KH
  0 siblings, 1 reply; 7+ messages in thread
From: Muchun Song @ 2020-05-21 16:06 UTC
  To: Matthew Wilcox
  Cc: adobriyan, ast, daniel, kafai, songliubraving, yhs, andriin,
	john.fastabend, kpsingh, ebiederm, bernd.edlinger, linux-kernel,
	linux-fsdevel, netdev, bpf

On Thu, May 21, 2020 at 11:21 PM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, May 21, 2020 at 08:38:35PM +0800, Muchun Song wrote:
> > There is another safe way to get the file structure without
> > holding the files->file_lock. That is rcu lock, and this way
> > has better performance. So use the rcu lock instead of the
> > files->file_lock.
>
> What makes you think this is safe?  Are you actually seeing contention
> on this spinlock?
>

I have read the doc which is in the Documentation/filesystems/files.txt.
If my understanding is correct, I think it is safe to use rcu lock.

Thanks.

-- 
Yours,
Muchun


* Re: [External] Re: [PATCH] files: Use rcu lock to get the file structures for better performance
  2020-05-21 16:06   ` [External] " Muchun Song
@ 2020-05-21 16:16     ` Greg KH
  0 siblings, 0 replies; 7+ messages in thread
From: Greg KH @ 2020-05-21 16:16 UTC
  To: Muchun Song
  Cc: Matthew Wilcox, adobriyan, ast, daniel, kafai, songliubraving,
	yhs, andriin, john.fastabend, kpsingh, ebiederm, bernd.edlinger,
	linux-kernel, linux-fsdevel, netdev, bpf

On Fri, May 22, 2020 at 12:06:46AM +0800, Muchun Song wrote:
> On Thu, May 21, 2020 at 11:21 PM Matthew Wilcox <willy@infradead.org> wrote:
> >
> > On Thu, May 21, 2020 at 08:38:35PM +0800, Muchun Song wrote:
> > > There is another safe way to get the file structure without
> > > holding the files->file_lock. That is rcu lock, and this way
> > > has better performance. So use the rcu lock instead of the
> > > files->file_lock.
> >
> > What makes you think this is safe?  Are you actually seeing contention
> > on this spinlock?
> >
> 
> I have read the doc which is in the Documentation/filesystems/files.txt.
> If my understanding is correct, I think it is safe to use rcu lock.

Did you test this and prove that it is safe and "faster"?  If so, you
always have to show that in your changelog.  Please fix it up and
resend.

thanks,

greg k-h


* Re: [PATCH] files: Use rcu lock to get the file structures for better performance
  2020-05-21 12:38 [PATCH] files: Use rcu lock to get the file structures for better performance Muchun Song
  2020-05-21 15:21 ` Matthew Wilcox
@ 2020-05-21 16:47 ` Matthew Wilcox
  2020-05-22  7:52   ` [External] " Muchun Song
  1 sibling, 1 reply; 7+ messages in thread
From: Matthew Wilcox @ 2020-05-21 16:47 UTC
  To: Muchun Song
  Cc: adobriyan, ast, daniel, kafai, songliubraving, yhs, andriin,
	john.fastabend, kpsingh, ebiederm, bernd.edlinger, linux-kernel,
	linux-fsdevel, netdev, bpf

On Thu, May 21, 2020 at 08:38:35PM +0800, Muchun Song wrote:
> +++ b/fs/proc/fd.c
> @@ -34,19 +34,27 @@ static int seq_show(struct seq_file *m, void *v)
>  	if (files) {
>  		unsigned int fd = proc_fd(m->private);
>  
> -		spin_lock(&files->file_lock);
> +		rcu_read_lock();
> +again:
>  		file = fcheck_files(files, fd);
>  		if (file) {
> -			struct fdtable *fdt = files_fdtable(files);
> +			struct fdtable *fdt;
> +
> +			if (!get_file_rcu(file)) {
> +				/*
> +				 * we loop to catch the new file (or NULL
> +				 * pointer).
> +				 */
> +				goto again;
> +			}
>  
> +			fdt = files_fdtable(files);

This is unusual, and may not be safe.

fcheck_files() loads files->fdt.  Then it loads file from fdt->fd[].
Now you're loading files->fdt again here, and it could have been changed
by another thread expanding the fd table.

You have to write a changelog which convinces me you've thought about
this race and that it's safe.  Because I don't think you even realise
it's a possibility at this point.
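
The race described above can be illustrated with a small userspace model in C11 atomics. The model_* names and the 64-entry table are assumptions made for illustration, not kernel definitions. The idea is to load the table pointer once and read both the file slot and the close-on-exec bit from that same snapshot, rather than loading files->fdt a second time after the lookup. Whether a single snapshot is sufficient for seq_show() is exactly what the changelog would have to argue. This sketch only shows the shape of the read side.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_FDS 64

/* Hypothetical model of struct fdtable: slots plus a close-on-exec bitmap. */
struct model_fdtable {
	unsigned int max_fds;			/* must be <= MODEL_MAX_FDS */
	void *fd[MODEL_MAX_FDS];
	unsigned long long close_on_exec;	/* bit n set: fd n is O_CLOEXEC */
};

/* Hypothetical model of struct files_struct: the pointer is swapped on growth. */
struct model_files {
	struct model_fdtable *_Atomic fdt;
};

/*
 * Look up 'fd' and its close-on-exec flag from ONE snapshot of the table.
 * Loading files->fdt a second time could observe a different table if
 * another thread expanded it in between.
 */
static bool model_lookup(struct model_files *files, unsigned int fd,
			 void **filep, bool *cloexec)
{
	struct model_fdtable *fdt =
		atomic_load_explicit(&files->fdt, memory_order_acquire);

	if (fd >= fdt->max_fds || fdt->fd[fd] == NULL)
		return false;

	*filep = fdt->fd[fd];
	*cloexec = (fdt->close_on_exec >> fd) & 1ULL;
	return true;
}
```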

> @@ -160,14 +168,23 @@ static int proc_fd_link(struct dentry *dentry, struct path *path)
>  		unsigned int fd = proc_fd(d_inode(dentry));
>  		struct file *fd_file;
>  
> -		spin_lock(&files->file_lock);
> +		rcu_read_lock();
> +again:
>  		fd_file = fcheck_files(files, fd);
>  		if (fd_file) {
> +			if (!get_file_rcu(fd_file)) {
> +				/*
> +				 * we loop to catch the new file
> +				 * (or NULL pointer).
> +				 */
> +				goto again;
> +			}
>  			*path = fd_file->f_path;
>  			path_get(&fd_file->f_path);
> +			fput(fd_file);
>  			ret = 0;
>  		}
> -		spin_unlock(&files->file_lock);
> +		rcu_read_unlock();

Why is it an improvement to increment/decrement the refcount on the
struct file here, rather than take/release the spinlock?
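
One way to frame the question above: both approaches modify shared memory with atomic operations. A spinlock acquire is an atomic read-modify-write on the lock word, while get_file_rcu() followed by fput() is an atomic increment-unless-zero and an atomic decrement on the file's f_count. A rough userspace sketch in C11 follows. The model_* names are hypothetical, and the kernel's real spinlock and fput() implementations are considerably more involved.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical test-and-set spinlock; not the kernel's queued spinlock. */
struct model_spinlock {
	atomic_flag locked;
};

static void model_spin_lock(struct model_spinlock *l)
{
	/* One atomic read-modify-write on the lock word per attempt. */
	while (atomic_flag_test_and_set_explicit(&l->locked,
						 memory_order_acquire))
		;
}

static void model_spin_unlock(struct model_spinlock *l)
{
	atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

/* Increment unless zero: one compare-and-swap per attempt on the counter. */
static bool model_ref_get(atomic_long *count)
{
	long old = atomic_load_explicit(count, memory_order_relaxed);

	while (old != 0) {
		if (atomic_compare_exchange_weak(count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool model_ref_put(atomic_long *count)
{
	return atomic_fetch_sub_explicit(count, 1, memory_order_acq_rel) == 1;
}
```

In the uncontended case of this model, each path performs two writes to a shared cache line per lookup: lock plus unlock, or get plus put. So "lock-free" does not by itself imply cheaper, and which cache line is more contended in practice would have to be measured.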



* Re: [External] Re: [PATCH] files: Use rcu lock to get the file structures for better performance
  2020-05-21 16:47 ` Matthew Wilcox
@ 2020-05-22  7:52   ` Muchun Song
  2020-05-22 11:43     ` Matthew Wilcox
  0 siblings, 1 reply; 7+ messages in thread
From: Muchun Song @ 2020-05-22  7:52 UTC
  To: Matthew Wilcox
  Cc: adobriyan, ast, daniel, kafai, songliubraving, yhs, andriin,
	john.fastabend, kpsingh, ebiederm, bernd.edlinger, linux-kernel,
	linux-fsdevel, netdev, bpf

On Fri, May 22, 2020 at 12:47 AM Matthew Wilcox <willy@infradead.org> wrote:
>
> On Thu, May 21, 2020 at 08:38:35PM +0800, Muchun Song wrote:
> > +++ b/fs/proc/fd.c
> > @@ -34,19 +34,27 @@ static int seq_show(struct seq_file *m, void *v)
> >       if (files) {
> >               unsigned int fd = proc_fd(m->private);
> >
> > -             spin_lock(&files->file_lock);
> > +             rcu_read_lock();
> > +again:
> >               file = fcheck_files(files, fd);
> >               if (file) {
> > -                     struct fdtable *fdt = files_fdtable(files);
> > +                     struct fdtable *fdt;
> > +
> > +                     if (!get_file_rcu(file)) {
> > +                             /*
> > +                              * we loop to catch the new file (or NULL
> > +                              * pointer).
> > +                              */
> > +                             goto again;
> > +                     }
> >
> > +                     fdt = files_fdtable(files);
>
> This is unusual, and may not be safe.
>
> fcheck_files() loads files->fdt.  Then it loads file from fdt->fd[].
> Now you're loading files->fdt again here, and it could have been changed
> by another thread expanding the fd table.
>
> You have to write a changelog which convinces me you've thought about
> this race and that it's safe.  Because I don't think you even realise
> it's a possibility at this point.

Thanks for your review, it is a problem. I can fix it.

>
> > @@ -160,14 +168,23 @@ static int proc_fd_link(struct dentry *dentry, struct path *path)
> >               unsigned int fd = proc_fd(d_inode(dentry));
> >               struct file *fd_file;
> >
> > -             spin_lock(&files->file_lock);
> > +             rcu_read_lock();
> > +again:
> >               fd_file = fcheck_files(files, fd);
> >               if (fd_file) {
> > +                     if (!get_file_rcu(fd_file)) {
> > +                             /*
> > +                              * we loop to catch the new file
> > +                              * (or NULL pointer).
> > +                              */
> > +                             goto again;
> > +                     }
> >                       *path = fd_file->f_path;
> >                       path_get(&fd_file->f_path);
> > +                     fput(fd_file);
> >                       ret = 0;
> >               }
> > -             spin_unlock(&files->file_lock);
> > +             rcu_read_unlock();
>
> Why is it an improvement to increment/decrement the refcount on the
> struct file here, rather than take/release the spinlock?
>

lock-free vs spinlock.

Do you think spinlock would be better than the lock-free method?
Actually I prefer the rcu lock.

-- 
Yours,
Muchun


* Re: [External] Re: [PATCH] files: Use rcu lock to get the file structures for better performance
  2020-05-22  7:52   ` [External] " Muchun Song
@ 2020-05-22 11:43     ` Matthew Wilcox
  0 siblings, 0 replies; 7+ messages in thread
From: Matthew Wilcox @ 2020-05-22 11:43 UTC
  To: Muchun Song
  Cc: adobriyan, ast, daniel, kafai, songliubraving, yhs, andriin,
	john.fastabend, kpsingh, ebiederm, bernd.edlinger, linux-kernel,
	linux-fsdevel, netdev, bpf

On Fri, May 22, 2020 at 03:52:39PM +0800, Muchun Song wrote:
> On Fri, May 22, 2020 at 12:47 AM Matthew Wilcox <willy@infradead.org> wrote:
> > > @@ -160,14 +168,23 @@ static int proc_fd_link(struct dentry *dentry, struct path *path)
> > >               unsigned int fd = proc_fd(d_inode(dentry));
> > >               struct file *fd_file;
> > >
> > > -             spin_lock(&files->file_lock);
> > > +             rcu_read_lock();
> > > +again:
> > >               fd_file = fcheck_files(files, fd);
> > >               if (fd_file) {
> > > +                     if (!get_file_rcu(fd_file)) {
> > > +                             /*
> > > +                              * we loop to catch the new file
> > > +                              * (or NULL pointer).
> > > +                              */
> > > +                             goto again;
> > > +                     }
> > >                       *path = fd_file->f_path;
> > >                       path_get(&fd_file->f_path);
> > > +                     fput(fd_file);
> > >                       ret = 0;
> > >               }
> > > -             spin_unlock(&files->file_lock);
> > > +             rcu_read_unlock();
> >
> > Why is it an improvement to increment/decrement the refcount on the
> > struct file here, rather than take/release the spinlock?
> >
> 
> lock-free vs spinlock.

bananas vs oranges.

How do you think refcounts work?  How do you think spinlocks work?

> Do you think spinlock would be better than the lock-free method?
> Actually I prefer the rcu lock.

Why?  You don't seem to understand the tradeoffs.


