From: Matthew Wilcox <willy@infradead.org>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Josef Bacik <josef@toxicpanda.com>,
	David Sterba <dsterba@suse.com>,
	Andreas Gruenbacher <agruenba@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>,
	Andrew Morton <akpm@linux-foundation.org>,
	Will Deacon <will@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-arm-kernel@lists.infradead.org,
	linux-btrfs@vger.kernel.org
Subject: Re: [PATCH 3/3] btrfs: Avoid live-lock in search_ioctl() on hardware with sub-page faults
Date: Wed, 24 Nov 2021 20:03:58 +0000
Message-ID: <YZ6arlsi2L3LVbFO@casper.infradead.org>
In-Reply-To: <20211124192024.2408218-4-catalin.marinas@arm.com>

On Wed, Nov 24, 2021 at 07:20:24PM +0000, Catalin Marinas wrote:
> +++ b/fs/btrfs/ioctl.c
> @@ -2223,7 +2223,8 @@ static noinline int search_ioctl(struct inode *inode,
>  
>  	while (1) {
>  		ret = -EFAULT;
> -		if (fault_in_writeable(ubuf + sk_offset, *buf_size - sk_offset))
> +		if (fault_in_exact_writeable(ubuf + sk_offset,
> +					     *buf_size - sk_offset))
>  			break;
>  
>  		ret = btrfs_search_forward(root, &key, path, sk->min_transid);

Couldn't we avoid all of this nastiness by doing ...

@@ -2121,10 +2121,9 @@ static noinline int copy_to_sk(struct btrfs_path *path,
                 * problem. Otherwise we'll fault and then copy the buffer in
                 * properly this next time through
                 */
-               if (copy_to_user_nofault(ubuf + *sk_offset, &sh, sizeof(sh))) {
-                       ret = 0;
+               ret = __copy_to_user_nofault(ubuf + *sk_offset, &sh, sizeof(sh));
+               if (ret)
                        goto out;
-               }
 
                *sk_offset += sizeof(sh);
@@ -2196,6 +2195,7 @@ static noinline int search_ioctl(struct inode *inode,
        int ret;
        int num_found = 0;
        unsigned long sk_offset = 0;
+       unsigned long next_offset = 0;
 
        if (*buf_size < sizeof(struct btrfs_ioctl_search_header)) {
                *buf_size = sizeof(struct btrfs_ioctl_search_header);
@@ -2223,7 +2223,8 @@ static noinline int search_ioctl(struct inode *inode,
 
        while (1) {
                ret = -EFAULT;
-               if (fault_in_writeable(ubuf + sk_offset, *buf_size - sk_offset))
+               if (fault_in_writeable(ubuf + sk_offset + next_offset,
+                                       *buf_size - sk_offset - next_offset))
                        break;
 
                ret = btrfs_search_forward(root, &key, path, sk->min_transid);
@@ -2235,11 +2236,12 @@ static noinline int search_ioctl(struct inode *inode,
                ret = copy_to_sk(path, &key, sk, buf_size, ubuf,
                                 &sk_offset, &num_found);
                btrfs_release_path(path);
-               if (ret)
+               if (ret > 0)
+                       next_offset = ret;
+               else if (ret < 0)
                        break;
-
        }
-       if (ret > 0)
+       if (ret == -ENOSPC || ret > 0)
                ret = 0;
 err:
        sk->nr_items = num_found;

(not shown: the tedious bits where the existing 'ret = 1' assignments
in copy_to_sk() are converted to 'ret = -ENOSPC')
 
(where __copy_to_user_nofault() is a new function that does exactly what
copy_to_user_nofault() does, but returns the number of bytes copied)

That way, the existing fault_in_writeable() will get the fault, and we
don't need to probe every 16 bytes.
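
To make the idea concrete, here is a small userspace model of the loop.
It is only a sketch, not kernel code: every sim_* name, the 4096-byte
page size, the 32-byte item size and the iteration cap are assumptions
made up for the illustration. The probe touches just the first byte of
the range and of each following page, and the copy reports how many
bytes it managed before hitting a byte that can never be written (the
sub-page fault).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SIM_PAGE     4096u   /* assumed page size for the model */
#define SIM_ITEM     32u     /* assumed size of one copied record */
#define SIM_EFAULT   (-14)
#define SIM_LIVELOCK (-1000) /* hypothetical: iteration cap was hit */

/* Hypothetical stand-in for a user buffer with one unwritable byte. */
struct sim_ubuf {
	unsigned char *mem;
	size_t size;
	size_t bad;	/* offset of the unwritable byte; >= size means none */
};

/* Page-granular probe: checks the first byte of the range and the
 * first byte of every following page. Returns 0 if all are writable. */
static int sim_fault_in_writeable(const struct sim_ubuf *u,
				  size_t off, size_t len)
{
	size_t end = off + len;
	size_t p = off;

	while (p < end) {
		if (p == u->bad)
			return SIM_EFAULT;
		p = (p / SIM_PAGE + 1) * SIM_PAGE;
	}
	return 0;
}

/* Models a copy that stops at the bad byte and returns bytes copied. */
static size_t sim_copy_nofault(struct sim_ubuf *u, size_t off,
			       const void *src, size_t len)
{
	size_t n = len;

	if (u->bad >= off && u->bad < off + len)
		n = u->bad - off;
	memcpy(u->mem + off, src, n);
	return n;
}

/* Copies SIM_ITEM-sized records until the buffer is full. When
 * use_next_offset is set, a short copy moves the next probe to the
 * byte that failed, so the probe itself reports the fault. */
static int sim_search(struct sim_ubuf *u, int use_next_offset,
		      size_t *done)
{
	unsigned char rec[SIM_ITEM];
	size_t sk_offset = 0, next_offset = 0, n;
	int iters;

	assert(u && done);
	memset(rec, 0xAB, sizeof(rec));

	for (iters = 0; iters < 1000; iters++) {
		*done = sk_offset;
		if (sk_offset + SIM_ITEM > u->size)
			return 0;	/* buffer full */
		if (sim_fault_in_writeable(u, sk_offset + next_offset,
					   u->size - sk_offset - next_offset))
			return SIM_EFAULT;
		n = sim_copy_nofault(u, sk_offset, rec, SIM_ITEM);
		if (n == SIM_ITEM) {
			sk_offset += SIM_ITEM;
			next_offset = 0;
		} else if (use_next_offset) {
			next_offset = n;
		}
	}
	*done = sk_offset;
	return SIM_LIVELOCK;	/* probe kept passing, copy kept failing */
}
```

With the bad byte in the middle of a record, the model without
next_offset keeps passing the probe and failing the copy until the cap
is hit, while the model with next_offset gets SIM_EFAULT from the probe
on the very next pass.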
